Soma-notes - User contributions [en]

COMP 3000 Essay 2 2010 Question 11

2010-12-02T19:04:30Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [9]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [10]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [11]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping: either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate to the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, otherwise known as drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when they are resumed from a suspended state, or restored from a snapshot. It is easy to reason that, because the guest's clock starts off correctly, it will always be correct from then on. Unfortunately, this is not the case. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time, the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that it is treated like a process by the host. In and of itself this does not seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it is easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and generally does this with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [6]
For demanding applications, this is neither robust nor reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduce the offset error. Unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. Thus, it would not work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[7] Such clocks are unsuitable for applications that deal with network management and measurements. The reason for this is that NTP focuses on the offset and not on hardware clock's oscillator rate. For example, when calculating delay variations, the offset error does not change anything in the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved but when it is enabled again on the new system problems occur: because no two hardware clocks drift the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This consequences could prove disastrous to the system. These could range anywhere from slowly recoverable errors to ones where the ntpd might never recover and where the virtualized OS could become unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked environments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. Broomhead, Cremean, Ridoux, Veitch. "Virtualize Everything But Time" . 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. Web. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf Accessed: Nov. 2010.

3. "Bran's Kernel Development Tutorial" from ''Bona Fide OS Developer'' . Web. http://www.osdever.net/bkerndev/Docs/pit.htm . Accessed: Nov. 2010.

4. "What is a CMOS battery, and why does my computer need one?" ''Indiana University's Knowledge Base'', 2010. Web. http://kb.iu.edu/data/adoy.html . Accessed: Nov. 2010.

5. "Multiprocessor Specification version 1.4". ''Intel''. May 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf .

6. Pasztor and Veitch. "PC Based Precision Timing Without GPS". 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf .

7. Veitch, Ridoux, Korada. "Robust Synchronization of Absolute and Difference Clocks over Networks". 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

9. "Advanced Configuration and Power Interface Specification". ''Intel Corporation''. April 2010. http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf.

10. "Game timing and Multicore Processors". ''msdn'' . Dec 2005. Web. http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx . Accessed Nov. 2010.

11. "IA-PC HPET (High Precision Event Timers) Specification. ''Intel Corporation''. Oct. 2004. http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf .

COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:53:46Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [9]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [10]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [11]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping: either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate to the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, otherwise known as drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when they are resumed from a suspended state, or restored from a snapshot. It is easy to reason that, because the guest's clock starts off correctly, it will always be correct from then on. Unfortunately, this is not the case. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time, the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that it is treated like a process by the host. In and of itself this does not seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it is easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and generally does this with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [6]
For demanding applications, this is neither robust nor reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduce the offset error. Unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. Thus, it would not work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[7] Such clocks are unsuitable for applications that deal with network management and measurements. The reason for this is that NTP focuses on the offset and not on hardware clock's oscillator rate. For example, when calculating delay variations, the offset error does not change anything in the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved but when it is enabled again on the new system problems occur: because no two hardware clocks drift the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This consequences could prove disastrous to the system. These could range anywhere from slowly recoverable errors to ones where the ntpd might never recover and where the virtualized OS might become unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked enviroments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4. "What is a CMOS battery, and why does my computer need one?" from the Indiana University's Knowledge Base, 2010. http://kb.iu.edu/data/adoy.html

5. "Multiprocessor Specification version 1.4" from Intel, 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf

6. Pasztor and Veitch. "PC Based Precision Timing Without GPS". 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

7. Veitch, Ridoux, Korada. "Robust Synchronization of Absolute and Difference Clocks over Networks". 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

9. "Advanced Configuration and Power Interface Specification". ''Intel Corporation''. April 2010. http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf. Accessed Nov. 2010.

10. "Game timing and Multiecore Processors". ''msdn'' . Dec 2005. http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx . Accessed Nov. 2010.

11. "IA-PC HPET (High Precision Event Timers) Specification. ''Intel Corporation''. Dec. 2004. http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf . Accessed Nov. 2010.

COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:46:37Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [9]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [10]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [11]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping: either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate to the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, otherwise known as drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when they are resumed from a suspended state, or restored from a snapshot. It is easy to reason that, because the guest's clock starts off correctly, it will always be correct from then on. Unfortunately, this is not the case. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time, the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that it is treated like a process by the host. In and of itself this does not seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it is easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and generally does this with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [6]
For demanding applications, this is neither robust nor reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduce the offset error. Unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. Thus, it would not work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[7]
So for applications that deals with network managements and measurements, this is unsuitable. Why? Because NTP focus on offset and not on hardware clock oscillator rate. For example, when calculating delay variations, the offset error doesn’t change anything to the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved and when it is enabled again on the new system, thats where the problems starts. Because no two hardware clocks drifts the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This could prove disastrous to the system. This could go from a slowly recoverable error to one where ntpd might never recover, making the virtualized OS unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked enviroments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4. "What is a CMOS battery, and why does my computer need one?" from the Indiana University's Knowledge Base, 2010. http://kb.iu.edu/data/adoy.html

5. "Multiprocessor Specification version 1.4" from Intel, 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf

6. Pasztor and Veitch. "PC Based Precision Timing Without GPS". 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

7. Veitch, Ridoux, Korada. "Robust Synchronization of Absolute and Difference Clocks over Networks". 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

9. "Advanced Configuration and Power Interface Specification". ''Intel Corporation''. April 2010. http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf. Accessed Nov. 2010.

10. "Game timing and Multiecore Processors". ''msdn'' . Dec 2005. http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx . Accessed Nov. 2010.

11. "IA-PC HPET (High Precision Event Timers) Specification. ''Intel Corporation''. Dec. 2004. http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf . Accessed Nov. 2010.

Talk:COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:34:12Z

Filitche:

The guest time-keeping section is really good but requires citations. Does someone know where exactly the info came from? - Fedor

Hi, I'm making some cosmetic changes to style, grammar&citation-format. - Fedor

--[[User:Spanke|Spanke]] 00:19, 2 December 2010 (UTC) Finished Timers, I hate 3004...

--[[User:Jjpwilso|Jjpwilso]] 00:03, 2 December 2010 (UTC) I'll check for some more references on the ACM and IEEE databases. In the meantime I thought I'd mention what Anil said regarding critique. He suggested we should consider other approaches to the same solution, such as modifying NTP with a different heuristic. I'll see what I can dig up in other papers on NTP.

--[[User:ScottG|ScottG]] 22:06, 1 December 2010 (UTC) I'm assuming you meant for me to add my references, yes? I really only used the article, and 'Timekeeping in Virtual Machines' which I went to add, but is already on there. I've looked for other articles to try to get how others have looked at it that aren't VMware, but there really isn't a huge amount out there dealing ''specifically'' with guest timekeeping (unless I've gone Google-blind, which has admittedly happened before). Mostly I ran into links pointing to that specific article.

--[[User:Sblais2|Sblais2]] 17:50, 1 December 2010 (UTC) I added stuf into the Research problems. I think I summarized most of them. If I forgot any, please add them in. I also added the missing references in the reference section. For Fedor, we seem to miss some content in 2 sections. Also, you could read through the other section and add/change some pertinent information that might've been missed that would make this essay even better.

--[[User:Sblais2|Sblais2]] 15:11, 1 December 2010 (UTC) Would it be possible to add your references at the bottom please? Even if it is a link. I have added the article link at the top of the essay.

Hey guys, sorry its taken me a while to post here. If there is a particular topic that needs researching, I could spend some hours doing that tomorrow - suggestions? Also, I intend to fix up the style&structure after everything is done as I am quite good with that.

Fedor

--[[User:ScottG|ScottG]] 21:36, 26 November 2010 (UTC) So I was a little (more than a little) behind on my initially estimated time for getting stuff up on Guest Timekeeping, but that's the gist of it there now. I'm going to try to buff it up a bit before it's due, since what I put in is a bit rougher than I'd like. If I seem to be missing something that should be pretty obvious, let me know and I'll work it in.

--[[User:Jjpwilso|Jjpwilso]] 15:49, 23 November 2010 (UTC) I've been completely swamped with COMP3004 stuff (among other things) and feeling guilty as hell about this essay. The good news, for those who might have missed today's lecture, is we have an extension of one week. Phew!!

--[[User:Sblais2|Sblais2]] 21:29, 22 November 2010 (UTC) I have added a small part to the background section. I have created by hand a diagram explaining how it works. I tried to find an original way of doing it but it is the same diagram everywhere. Please feel free to comment here or by sending me an email.

--[[User:AbsMechanik|AbsMechanik]] 19:46, 22 November 2010 (UTC) Here's what my research has led me to so far. I'm trying to come up with good points for the research problem, contribution and critique part of this essay. Here's a bunch of links, I've come across. I think there will be a few more tonight. Feel free to read through 'em:
 http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
 http://www.xen.org/files/xen_interface.pdf
 http://www.microsoft.com/whdc/system/sysinternals/mm-timer.mspx
 http://www.intel.com/hardwaredesign/hpetspec_1.pdf
 http://www.cubinlab.ee.unimelb.edu.au/radclock/

--[[User:ScottG|ScottG]] 18:55, 22 November 2010 (UTC) I'm good taking the Guest Timekeeping section. Hopefully I'll have some stuff up tonight or early tomorrow for it.

--[[User:Sblais2|Sblais2]] 17:14, 22 November 2010 (UTC) I will be working on the Background section. I will dedicate it to explain some of the key concepts that are used in the research paper that will allow the readers to have a better understanding on the rest of our essay. The structure you've put in place looks good but it might get modified, depending on the text will flow. The diagram is a good idea. I will drawn a simple one and add it in. Feel free again to critique.

--[[User:Jjpwilso|Jjpwilso]] 15:12, 16 November 2010 (UTC) I wanted to get a structure started, so I have stubbed out the first section. Note: some of the sub-sections might belong in the Research Problem section but we can easily move them if they fit there. Let's use this area to plan who is doing what. Feel free to critique any of my submissions. When you comment here, please put your comments at the very top so we can easily see recent posts.

=Participants=
(X) Blais Sylvain sblais2 - Email: syl20blais@gmail.com 
(X) Graham Scott sgraham6 
(X) Ilitchev Fedor filitche fedor dot ilitchev at gmail dot com 
(X) Panke Shane spanke 
(X) Shukla Abhinav ashukla2 
(X) Wilson Robert jjpwilso

COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:33:02Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [9]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [10]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [11]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping: either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate to the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, otherwise known as drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when they are resumed from a suspended state, or restored from a snapshot. It is easy to reason that, because the guest's clock starts off correctly, it will always be correct from then on. Unfortunately, this is not the case. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time, the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that it is treated like a process by the host. In and of itself this does not seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it is easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and it generally does so with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf]
For demanding applications, this is neither robust or reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduced the offset error but unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. So this won’t work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf]
So for applications that deals with network managements and measurements, this is unsuitable. Why? Because NTP focus on offset and not on hardware clock oscillator rate. For example, when calculating delay variations, the offset error doesn’t change anything to the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved and when it is enabled again on the new system, thats where the problems starts. Because no two hardware clocks drifts the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This could prove disastrous to the system. This could go from a slowly recoverable error to one where ntpd might never recover, making the virtualized OS unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked enviroments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4. "What is a CMOS battery, and why does my computer need one?" from the Indiana University's Knowledge Base, 2010. http://kb.iu.edu/data/adoy.html

5. "Multiprocessor Specification version 1.4" from Intel, 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf

6. "PC Based Precision Timing Without GPS" by Attila Pa ́sztor and Darryl Veitch, 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

7. "Robust Synchronization of Absolute and Difference Clocks over Networks" by Darryl Veitch, Julien Ridoux and Satish Babu Korada, 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

9. "Advanced Configuration and Power Interface Specification". ''Intel Corporation''. April 2010. http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf. Accessed Nov. 2010.

10. "Game timing and Multiecore Processors". ''msdn'' . Dec 2005. http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx . Accessed Nov. 2010.

11. "IA-PC HPET (High Precision Event Timers) Specification. ''Intel Corporation''. Dec. 2004. http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf . Accessed Nov. 2010.

Talk:COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:24:33Z

Filitche:

Hi, I'm making some cosmetic changes to style, grammar&citation-format. - Fedor

--[[User:Spanke|Spanke]] 00:19, 2 December 2010 (UTC) Finished Timers, I hate 3004...

--[[User:Jjpwilso|Jjpwilso]] 00:03, 2 December 2010 (UTC) I'll check for some more references on the ACM and IEEE databases. In the meantime I thought I'd mention what Anil said regarding critique. He suggested we should consider other approaches to the same solution, such as modifying NTP with a different heuristic. I'll see what I can dig up in other papers on NTP.

--[[User:ScottG|ScottG]] 22:06, 1 December 2010 (UTC) I'm assuming you meant for me to add my references, yes? I really only used the article, and 'Timekeeping in Virtual Machines' which I went to add, but is already on there. I've looked for other articles to try to get how others have looked at it that aren't VMware, but there really isn't a huge amount out there dealing ''specifically'' with guest timekeeping (unless I've gone Google-blind, which has admittedly happened before). Mostly I ran into links pointing to that specific article.

--[[User:Sblais2|Sblais2]] 17:50, 1 December 2010 (UTC) I added stuf into the Research problems. I think I summarized most of them. If I forgot any, please add them in. I also added the missing references in the reference section. For Fedor, we seem to miss some content in 2 sections. Also, you could read through the other section and add/change some pertinent information that might've been missed that would make this essay even better.

--[[User:Sblais2|Sblais2]] 15:11, 1 December 2010 (UTC) Would it be possible to add your references at the bottom please? Even if it is a link. I have added the article link at the top of the essay.

Hey guys, sorry its taken me a while to post here. If there is a particular topic that needs researching, I could spend some hours doing that tomorrow - suggestions? Also, I intend to fix up the style&structure after everything is done as I am quite good with that.

Fedor

--[[User:ScottG|ScottG]] 21:36, 26 November 2010 (UTC) So I was a little (more than a little) behind on my initially estimated time for getting stuff up on Guest Timekeeping, but that's the gist of it there now. I'm going to try to buff it up a bit before it's due, since what I put in is a bit rougher than I'd like. If I seem to be missing something that should be pretty obvious, let me know and I'll work it in.

--[[User:Jjpwilso|Jjpwilso]] 15:49, 23 November 2010 (UTC) I've been completely swamped with COMP3004 stuff (among other things) and feeling guilty as hell about this essay. The good news, for those who might have missed today's lecture, is we have an extension of one week. Phew!!

--[[User:Sblais2|Sblais2]] 21:29, 22 November 2010 (UTC) I have added a small part to the background section. I have created by hand a diagram explaining how it works. I tried to find an original way of doing it but it is the same diagram everywhere. Please feel free to comment here or by sending me an email.

--[[User:AbsMechanik|AbsMechanik]] 19:46, 22 November 2010 (UTC) Here's what my research has led me to so far. I'm trying to come up with good points for the research problem, contribution and critique part of this essay. Here's a bunch of links, I've come across. I think there will be a few more tonight. Feel free to read through 'em:
 http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
 http://www.xen.org/files/xen_interface.pdf
 http://www.microsoft.com/whdc/system/sysinternals/mm-timer.mspx
 http://www.intel.com/hardwaredesign/hpetspec_1.pdf
 http://www.cubinlab.ee.unimelb.edu.au/radclock/

--[[User:ScottG|ScottG]] 18:55, 22 November 2010 (UTC) I'm good taking the Guest Timekeeping section. Hopefully I'll have some stuff up tonight or early tomorrow for it.

--[[User:Sblais2|Sblais2]] 17:14, 22 November 2010 (UTC) I will be working on the Background section. I will dedicate it to explain some of the key concepts that are used in the research paper that will allow the readers to have a better understanding on the rest of our essay. The structure you've put in place looks good but it might get modified, depending on the text will flow. The diagram is a good idea. I will drawn a simple one and add it in. Feel free again to critique.

--[[User:Jjpwilso|Jjpwilso]] 15:12, 16 November 2010 (UTC) I wanted to get a structure started, so I have stubbed out the first section. Note: some of the sub-sections might belong in the Research Problem section but we can easily move them if they fit there. Let's use this area to plan who is doing what. Feel free to critique any of my submissions. When you comment here, please put your comments at the very top so we can easily see recent posts.

=Participants=
(X) Blais Sylvain sblais2 - Email: syl20blais@gmail.com 
(X) Graham Scott sgraham6 
(X) Ilitchev Fedor filitche fedor dot ilitchev at gmail dot com 
(X) Panke Shane spanke 
(X) Shukla Abhinav ashukla2 
(X) Wilson Robert jjpwilso

COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:23:37Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [9]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping, using either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate with the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, or more simply called drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when it is resumed from a suspended state, or restored from a snapshot – so it is easy to think that, since it starts off correctly the guest's clock will continue to be correct. That is, of course, incorrect. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that the it is treated like a process by the host. In and of itself this doesn't seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it's easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and it generally does so with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf]
For demanding applications, this is neither robust or reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduced the offset error but unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. So this won’t work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf]
So for applications that deals with network managements and measurements, this is unsuitable. Why? Because NTP focus on offset and not on hardware clock oscillator rate. For example, when calculating delay variations, the offset error doesn’t change anything to the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved and when it is enabled again on the new system, thats where the problems starts. Because no two hardware clocks drifts the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This could prove disastrous to the system. This could go from a slowly recoverable error to one where ntpd might never recover, making the virtualized OS unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked enviroments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4. "What is a CMOS battery, and why does my computer need one?" from the Indiana University's Knowledge Base, 2010. http://kb.iu.edu/data/adoy.html

5. "Multiprocessor Specification version 1.4" from Intel, 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf

6. "PC Based Precision Timing Without GPS" by Attila Pa ́sztor and Darryl Veitch, 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

7. "Robust Synchronization of Absolute and Difference Clocks over Networks" by Darryl Veitch, Julien Ridoux and Satish Babu Korada, 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

9. "Advanced Configuration and Power Interface Specification". ''Intel Corporation''. April 2010. http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf. Accessed Nov. 2010.

10. "Game timing and Multiecore Processors". ''msdn'' . Dec 2005. http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx . Accessed Nov. 2010.

COMP 3000 Essay 2 2010 Question 11

2010-12-02T18:08:14Z

Filitche:

= Virtualize Everything But Time =
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf]

=Background Concepts=

The next time you notice one stranger ask another for the time and you see them check their watch, try this experiment: immediately ask too. Chances are the person will check their watch again. Why? Human internal clocks are notoriously unreliable. Our sense of time contracts and expands all day long. We seem to believe that a definitive report of time can only come from some mechanical or electronic source. So social norms require that the watch owner provides you with two things: 1) the time, and 2) a gesture of external authority, i.e. a glance at their watch.

The story of time inside a virtual machine is almost as unreliable as our own internal clocks. How much time has elapsed since a VM client got the CPU's attention? At the best of times there's no way for it to guess because it wasn't actually running the whole time. If the VM was suspended and migrated from one physical host to another its concept of time is even worse. This paper is about how a computer glances at its metaphorical watch, and what kinds of timepieces it has at hand.

To better understand this paper, it is very important to have a good understanding of the general concepts behind it. For example, we all know what clocks are in our day-to-day lives but how are they different in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, explore the advantages and disadvantages of the different available counters and synchronization algorithms, and explain what a para-virtualized system is about.

===Timekeeping===

Computers typically measure time in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up a hardware device, generally a CPU, to interrupt at a certain rate. A counter is updated each time one of these interrupts occurs. It is this counter that allows the system to keep track of the passage of time.

In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead. This device has its own counter which starts when the system is booted. The OS simply reads the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts, however its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift (see below). But neither of these methods knows the actual time, they only know how long it has been since they last checked an authoritative source. Personal computers typically get their time from a battery-backed real-time clock (i.e. a CMOS clock). Networked machines often need a more precise time, with a resolution in the millisecond range or below. In these cases a machine can query another source such as one based on Network Time Protocol (NTP).

===Clocks===

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

[[File:Timerabstract.jpg]]

This diagram nicely represents how tick counting works. The oscillator runs at a predetermined frequency. The operating system might have to measure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. Not all hardware timers work exactly like that. For instance, some actually count up, others don't use interrupts, and yet others don't keep an initial counter. The general principle of hardware counters is the however the same. There is some kind of fixed interval at the end of which the current time is updated by an appropriate number of units (i.e. nanoseconds).

===Timers===
# PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [http://www.osdever.net/bkerndev/Docs/pit.htm]
# CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [http://kb.iu.edu/data/adoy.html]
# Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [http://developer.intel.com/design/pentium/datashts/24201606.pdf]
# ACPI establishes industry-standard interfaces configuration guided by the OS and power management. It is industry-standard through its creators, Intel, Microsoft, Phoenix, Hewlett Packard and Toshiba. Its power management includes all forms: notebooks, desktops, and servers. ACPI's goal is to improve current power and configuration standards for hardware devices by transitioning to ACPI-compliant hardware. This allows the OS as well as the VM to have control over power management. [http://www.intel.com/technology/iapc/acpi/][http://www.acpi.info/][http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf]
# RDTSC is based on the x86 P5 instruction set and perform high-resolution timing, however, it suffers from several flaws. Discontinuous values from the processor are caused as a result of not using the same thread to the processor each time, which can also be caused by having a multicore processor. This is made worse by ACPI which will eventually lead to the cores being completely out of sync. Availability of dedicated hardware: "RDTSC locks the timing information that the application requests to the processor's cycle counter." With dedicated timing devices included on modern motherboards this method of locking the timing information will become obsolete. Lastly, the variability of the CPU's frequency needs to be taken into account. With modern day laptops, most CPU frequencies are adjusted on the fly to respond to the users demand when needed and to lower themselves when idle, this results in longer battery life and less heat generated by the laptop but regretfully affects RDTSC making it unreliable. [http://msdn.microsoft.com/en-us/library/ee417693%28VS.85%29.aspx]
# HPET defines a set of timers that the OS has access to and can assign to applications. Each timer can generate an interrupt when the least significant bits are equal to the equivalent bits of the 64-bit counter value. However, a race case can occur in which the target time has already passed. This causes more interrupts and more work even if the task is a simple one. It does produce less interrupts than its predecessors PIT and CMOS RTC giving it an edge. Despite its race condition, this modern timer is improvement upon old practices. [http://hackipedia.org/Hardware/HPET,%20High%20Performance%20Event%20Timer/IA-PC%20HPET%20%28High%20Precision%20Event%20Timers%29%20Specification.pdf]

==Guest Timekeeping==

Guest timekeeping is done using the same general methods as any computer timekeeping, using either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate with the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, or more simply called drifting.

===Sources of Drift===

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when it is resumed from a suspended state, or restored from a snapshot – so it is easy to think that, since it starts off correctly the guest's clock will continue to be correct. That is, of course, incorrect. The first source of drift is simply due to electronics. A clock is almost never entirely accurate, having a slight error due to the effects of ambient temperature on oscillator frequency, even on the host system. Since the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that the it is treated like a process by the host. In and of itself this doesn't seem like a problem, but the guest system can be suspended just before it tries to update its perception of time. With restricted CPU time, it's easy for the lost ticks to pile up and create a backlog. Even if the guest is checking over the network with a time server, its network conversation can be suspended before the answer comes back. When the process resumes the guest has a wildly inaccurate perception of the elapsed time (known as Round-Trip Time or RTT) and its incorrect adjustments for the network delay will throw off the clock. Problems can also come from memory swaps performed by the host. If the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Clearly the errors in virtual machine timekeeping come from algorithms that were simply not designed for virtual environments. They assume a more stable physical world than they have.

===Impact of Drift===

The sources of drift essentially boil down to round-off errors and lost ticks. But the practical impact of drift is quite apparent in any automated system. For a real-world analogy consider a factory's assembly line, where the machinery is finely tuned to do its own specific part at certain intervals, and it generally does so with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt.

In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

The impacts of drift are even more apparent when virtual machines must communicate with one another. Consider the case of a farm of distributed gaming servers, where differences in timing can mean virtual characters suddenly warp at super-human speed across the landscape. Or for a more serious example consider investment trading, where missing the moment to bid can mean a difference of millions of dollars.

===Compensation Strategies===

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

=Research problem=

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf]
For demanding applications, this is neither robust or reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduced the offset error but unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. So this won’t work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf]
So for applications that deals with network managements and measurements, this is unsuitable. Why? Because NTP focus on offset and not on hardware clock oscillator rate. For example, when calculating delay variations, the offset error doesn’t change anything to the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved and when it is enabled again on the new system, thats where the problems starts. Because no two hardware clocks drifts the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This could prove disastrous to the system. This could go from a slowly recoverable error to one where ntpd might never recover, making the virtualized OS unstable.

=Contribution=

The authors of this timekeeping paper have done previous work exploring the feed-forward and feedback mechanisms for clock algorithm adjustment in non-virtual systems [7][8]. The RADclock algorithm (Robust Absolute Difference) was originally explored to address the drift resulting from NTP's feedback algorithm, and how non-ideal network conditions (a circumstance that is quite common) can have serious effects. In their original paper they improved network synchronization using the TimeStamp Counter (TSC) register a system call introduced in Pentium class machines as a source for a CPU cycle count. The use of a more reliable timestamp and counter provided "GPS-like" reliability in networked enviroments.

This new paper seeks to take a similar approach in a virtual machine setting where VM migration can cause much more severe disruption than simply lost UDP packets. Rather than use TSC calls (''rdtsc()'' in [8]) they tried several clock sources, seeking to eliminate variability from power management and CPU load when setting ''raw'' timestamps for use in guest machines.

The paper makes several references to ''feed-forward'' and ''feedback'' mechanisms, and so a quick discussion of these control theories is in order. In feed-forward mechanisms, inputs to a process or calculation may be modified in advance, but the resulting output plays no part in subsequent calculations. In feedback mechanisms (such as NTPd) inputs to a calculation can be modified by outputs from previous ones. As a result, feedback mechanisms carry state from one step to the next. Since the state of a virtual environment may be rendered inaccurate by so many sources a feedback mechanism as a bad idea. The statelessness of feed-forward mechanisms offers VM migrations a particular advantage since after migration a guest OS can simply discover the actual facts of timestamps rather than try to estimate them from their own invalid state information.

The mechanism the authors used in the Xen environment was the XenStore, a filesystem structure like ''sysfs'' or ''procfs'' that permits communications between virtual domains using shared memory. Dom0 (Xen terminology for the hypervisor) takes sychronization from its own time server and saves the calculated clock parameters to the shared XenStore. On DomU (Xen for any guest OS) an application would simply use the shared information as a raw timestamp or a time difference. Their extensive testing showed a clear -- if not surprising -- advantage over NTPd.

=Critique=

It is difficult to critique the authors' work since they did a great job of finding a meaningful set of timestamps and counters and clearly demonstrated an advantage in their field of study. They also compared a number of time sources to ensure that their selection was meaningful and stable. And it's hard to argue with success. Their results approximate the variation that comes from CPU temperature. That's quite impressive.

As a student, one criticism might be that they found quite an obscure way of describing what was at heart a very simple problem. Of course, to be fair to these academics, their paper wasn't written for students. But the problem can be summed up succinctly: If you can't trust your own perception of time, ask the closest agent you ''can'' trust -- and make sure they've check their watch. There is no closer source to a VM than its host, so find the fastest way there.

=References=

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4. "What is a CMOS battery, and why does my computer need one?" from the Indiana University's Knowledge Base, 2010. http://kb.iu.edu/data/adoy.html

5. "Multiprocessor Specification version 1.4" from Intel, 1997. http://developer.intel.com/design/pentium/datashts/24201606.pdf

6. "PC Based Precision Timing Without GPS" by Attila Pa ́sztor and Darryl Veitch, 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

7. "Robust Synchronization of Absolute and Difference Clocks over Networks" by Darryl Veitch, Julien Ridoux and Satish Babu Korada, 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf

8. Broomhead, T.; Ridoux, J.; Veitch, D.; , "Counter availability and characteristics for feed-forward based synchronization," Precision Clock Synchronization for Measurement, Control and Communication, 2009. ISPCS 2009. International Symposium on , vol., no., pp.1-6, 12-16 Oct. 2009

Talk:COMP 3000 Essay 2 2010 Question 11

2010-12-01T04:05:37Z

Filitche:

Hey guys, sorry its taken me a while to post here. If there is a particular topic that needs researching, I could spend some hours doing that tomorrow - suggestions? Also, I intend to fix up the style&structure after everything is done as I am quite good with that.

Fedor

--[[User:ScottG|ScottG]] 21:36, 26 November 2010 (UTC) So I was a little (more than a little) behind on my initially estimated time for getting stuff up on Guest Timekeeping, but that's the gist of it there now. I'm going to try to buff it up a bit before it's due, since what I put in is a bit rougher than I'd like. If I seem to be missing something that should be pretty obvious, let me know and I'll work it in.

--[[User:Jjpwilso|Jjpwilso]] 15:49, 23 November 2010 (UTC) I've been completely swamped with COMP3004 stuff (among other things) and feeling guilty as hell about this essay. The good news, for those who might have missed today's lecture, is we have an extension of one week. Phew!!

--[[User:Sblais2|Sblais2]] 21:29, 22 November 2010 (UTC) I have added a small part to the background section. I have created by hand a diagram explaining how it works. I tried to find an original way of doing it but it is the same diagram everywhere. Please feel free to comment here or by sending me an email.

--[[User:AbsMechanik|AbsMechanik]] 19:46, 22 November 2010 (UTC) Here's what my research has led me to so far. I'm trying to come up with good points for the research problem, contribution and critique part of this essay. Here's a bunch of links, I've come across. I think there will be a few more tonight. Feel free to read through 'em:
 http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
 http://www.xen.org/files/xen_interface.pdf
 http://www.microsoft.com/whdc/system/sysinternals/mm-timer.mspx
 http://www.intel.com/hardwaredesign/hpetspec_1.pdf
 http://www.cubinlab.ee.unimelb.edu.au/radclock/

--[[User:ScottG|ScottG]] 18:55, 22 November 2010 (UTC) I'm good taking the Guest Timekeeping section. Hopefully I'll have some stuff up tonight or early tomorrow for it.

--[[User:Sblais2|Sblais2]] 17:14, 22 November 2010 (UTC) I will be working on the Background section. I will dedicate it to explain some of the key concepts that are used in the research paper that will allow the readers to have a better understanding on the rest of our essay. The structure you've put in place looks good but it might get modified, depending on the text will flow. The diagram is a good idea. I will drawn a simple one and add it in. Feel free again to critique.

--[[User:Jjpwilso|Jjpwilso]] 15:12, 16 November 2010 (UTC) I wanted to get a structure started, so I have stubbed out the first section. Note: some of the sub-sections might belong in the Research Problem section but we can easily move them if they fit there. Let's use this area to plan who is doing what. Feel free to critique any of my submissions. When you comment here, please put your comments at the very top so we can easily see recent posts.

=Participants=
(X) Blais Sylvain sblais2 - Email: syl20blais@gmail.com 
(X) Graham Scott sgraham6 
(X) Ilitchev Fedor filitche fedor dot ilitchev at gmail dot com 
(X) Panke Shane spanke 
(X) Shukla Abhinav ashukla2 
(X) Wilson Robert jjpwilso

Talk:COMP 3000 Essay 2 2010 Question 11

2010-11-15T18:45:17Z

Filitche:

Please mark an X if you are able to participate.

(X) Blais Sylvain sblais2 
(X) Graham Scott sgraham6 
(X) Ilitchev Fedor filitche fedor dot ilitchev at gmail dot com 
(X) Panke Shane spanke 
(X) Shukla Abhinav ashukla2 
(X) Wilson Robert jjpwilso

Talk:COMP 3000 Essay 2 2010 Question 11

2010-11-15T18:44:45Z

Filitche:

Please mark an X if you are able to participate.

(X) Blais Sylvain sblais2 
(X) Graham Scott sgraham6 
(X) Ilitchev Fedor filitche fedor dot ilitchev at gmail dot com
(X) Panke Shane spanke 
(X) Shukla Abhinav ashukla2 
(X) Wilson Robert jjpwilso

COMP 3000 Essay 1 2010 Question 10

2010-10-15T16:49:43Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

Though flash memory on the whole has strengths and weaknesses, it is also divided into two sub-types and each of these also has its own advantages and disadvantages that shall be briefly examined here. The two types of flash memory are NOR and NAND: NOR has the faster read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and can last about ten times longer. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2] Although these two types of flash are different, they are not sufficiently so as to merit an independence file system approach for each one. Thus, we shall not differentiate between them for the rest of this article.

The use of flash has grown dramatically in recent years, primarily for the purpose of portable devices and portable data storage. While this phenomenon has been partially due to the falling price of the product, it has been more of a result of flash's strengths - these make it a near optimal solution for the many portable devices which are currently in vogue. These strengths, besides compactness, resistance to shock and independence from power-supplies also include extremely fast read times. Indeed, the read times of flash devices can be as much as fourteen times those of hard disk drives. [17]

Despite these advantages, however, are not the preferred storage method for home PCs. The reason for this is not just that other kinds of memory are still significantly cheaper, but also that flash memory has certain limitations built into it. The most critical of these is that each block of flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write or erase a file on flash than to a hard disk drive (HDD).[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

An HDD file-system, however, is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit. [SOURCE?]

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. [3] This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources. It is also useful because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also has advantages when it comes to wear leveling. These advantages mostly come from the use of banks and the garbage collector. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used (ie. which one has the fewest timestamps). Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. These advantages are made actual through the log itself, the use of banks and through the garbage collector.

==Conclusion==

In this way, thanks to its special wear-leveling features, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T16:38:29Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

Though flash memory on the whole has strengths and weaknesses, it is also divided into two sub-types and each of these also has its own advantages and disadvantages that shall be briefly examined here. The two types of flash memory are NOR and NAND: NOR has the faster read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and can last about ten times longer. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2] Although these two types of flash are different, they are not sufficiently so as to merit an independence file system approach for each one. Thus, we shall not differentiate between them for the rest of this article.

The use of flash has grown dramatically in recent years, primarily for the purpose of portable devices and portable data storage. While this phenomenon has been partially due to the falling price of the product, it has been more of a result of flash's strengths - these make it a near optimal solution for the many portable devices which are currently in vogue. These strengths, besides compactness, resistance to shock and independence from power-supplies also include extremely fast read times. Indeed, the read times of flash devices can be as much as fourteen times those of hard disk drives. [17]

Despite these advantages, however, are not the preferred storage method for home PCs. The reason for this is not just that other kinds of memory are still significantly cheaper, but also that flash memory has certain limitations built into it. The most critical of these is that each block of flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write or erase a file on flash than to a hard disk drive (HDD).[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

An HDD file-system, however, is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit. [SOURCE?]

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. [3] This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources. It is also useful because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also has advantages when it comes to wear leveling. These are made actual through various technological innovations which include a translation table and banks. The translation table is a structure which maps each block of memory to a flag which keeps track of its state. When a block is being written to, the FTL marks its state as "allocated" so as to prevent other write-processes from targeting the same block. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becomes "pre-valid" If writing to this block involved the erasure from another block, the now old block is at marked as invalid (ready for erasure) and the new block is finally marked as valid after that.[8]

On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T16:20:04Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

Though flash memory on the whole has strengths and weaknesses, it is also divided into two sub-types and each of these also has its own advantages and disadvantages that shall be briefly examined here. The two types of flash memory are NOR and NAND: NOR has the faster read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and can last about ten times longer. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2] Although these two types of flash are different, they are not sufficiently so as to merit an independence file system approach for each one. Thus, we shall not differentiate between them for the rest of this article.

The use of flash has grown dramatically in recent years, primarily for the purpose of portable devices and portable data storage. While this phenomenon has been partially due to the falling price of the product, it has been more of a result of flash's strengths - these make it a near optimal solution for the many portable devices which are currently in vogue. These strengths, besides compactness, resistance to shock and independence from power-supplies also include extremely fast read times. Indeed, the read times of flash devices can be as much as fourteen times those of hard disk drives. [17]

Despite these advantages, however, are not the preferred storage method for home PCs. The reason for this is not just that other kinds of memory are still significantly cheaper, but also that flash memory has certain limitations built into it. The most critical of these is that each block of flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write or erase a file on flash than to a hard disk drive (HDD).[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

An HDD file-system, however, is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit. [SOURCE?]

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. [3] This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources. It is also useful because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also has advantages when it comes to wear leveling. These are made actual through various technological innovations which include a translation table and banks. The translation table is a structure which maps each block of memory to a flag which keeps track of its state. These states are "allocated",

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T16:15:22Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

Though flash memory on the whole has strengths and weaknesses, it is also divided into two sub-types and each of these also has its own advantages and disadvantages that shall be briefly examined here. The two types of flash memory are NOR and NAND: NOR has the faster read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and can last about ten times longer. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2] Although these two types of flash are different, they are not sufficiently so as to merit an independence file system approach for each one. Thus, we shall not differentiate between them for the rest of this article.

The use of flash has grown dramatically in recent years, primarily for the purpose of portable devices and portable data storage. While this phenomenon has been partially due to the falling price of the product, it has been more of a result of flash's strengths - these make it a near optimal solution for the many portable devices which are currently in vogue. These strengths, besides compactness, resistance to shock and independence from power-supplies also include extremely fast read times. Indeed, the read times of flash devices can be as much as fourteen times those of hard disk drives. [17]

Despite these advantages, however, are not the preferred storage method for home PCs. The reason for this is not just that other kinds of memory are still significantly cheaper, but also that flash memory has certain limitations built into it. The most critical of these is that each block of flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write or erase a file on flash than to a hard disk drive (HDD).[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

An HDD file-system, however, is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit. [SOURCE?]

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. [3] This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources. It is also useful because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also has advantages when it comes to wear leveling. These are made actual through var

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T16:05:38Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

Though flash memory on the whole has strengths and weaknesses, it is also divided into two sub-types and each of these also has its own advantages and disadvantages that shall be briefly examined here. The two types of flash memory are NOR and NAND: NOR has the faster read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and can last about ten times longer. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2] Although these two types of flash are different, they are not sufficiently so as to merit an independence file system approach for each one. Thus, we shall not differentiate between them for the rest of this article.

The use of flash has grown dramatically in recent years, primarily for the purpose of portable devices and portable data storage. While this phenomenon has been partially due to the falling price of the product, it has been more of a result of flash's strengths - these make it a near optimal solution for the many portable devices which are currently in vogue. These strengths, besides compactness, resistance to shock and independence from power-supplies also include extremely fast read times. Indeed, the read times of flash devices can be as much as fourteen times those of hard disk drives. [17]

Despite these advantages, however, are not the preferred storage method for home PCs. The reason for this is not just that other kinds of memory are still significantly cheaper, but also that flash memory has certain limitations built into it. The most critical of these is that each block of flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write or erase a file on flash than to a hard disk drive (HDD).[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:21:11Z

Filitche:

Hey all,

I think we should write down our emails here so we can further discuss stuff without having to login here.
('''***Note that discussions over email can't be counted towards your participation grade!***'''--[[User:Soma|Anil]])

Geoff Smith (gsmith0413@gmail.com) - gsmith6

Andrew Bujáki (abujaki [at] Connect or Live.ca)
***I'm usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you're adding me. :)

I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#
Which briefly touches on the issues of Flash memory. Specifically, inability to update in place, and limited write/erase cycles.

Inability to update in place could refer to the way the flash disk is programmed, instead of bit-by-bit, it is programmed block-by-block. A block would have to be erased and completely reprogrammed in order to flip one bit after it's been set.
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure

Limited write/erase: Flash memory typically has a short lifespan if it's being used a lot. Writing and erasing the memory (Changing, updating, etc) Will wear it out. Flash memory has a finite amount of writes, (varying on manufacturer, models, etc), and once they've been used up, you'll get bad sectors, corrupt data, and generally be SOL.
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear

Filesystems would have to be changed to play nicely with these constraints, where it must use blocks efficiently and nicely, and minimize writing/erasing as much as possible.

I found a paper that talks about the performance, capabilities and limitations of NAND flash storage.

Abstract: "This presentation provides an in-depth examination of the
fundamental theoretical performance, capabilities, and
limitations of NAND Flash-based Solid State Storage (SSS). The
tutorial will explore the raw performance capabilities of NAND
Flash, and limitations to performance imposed by mitigation of
reliability issues, interfaces, protocols, and technology types.
Best practices for system integration of SSS will be discussed.
Performance achievements will be reviewed for various
products and applications. "

Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf

There's no Starting place like Wikipedia, even if you shouldn't source it.

http://en.wikipedia.org/wiki/Flash_Memory

http://en.wikipedia.org/wiki/LogFS

http://en.wikipedia.org/wiki/Hard_disk

http://en.wikipedia.org/wiki/Wear_leveling

http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29

http://en.wikipedia.org/wiki/Solid-state_drive

Hey Guys,

We really don't have much time to get this done. Lets meet tomorrow after class and get our bearings to do this properly.

Fedor

A few of us have Networking immediately after class. I know personally I won't be able to make anything set on Tuesday.
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others.
As well, for bibliographical citing, http://bibme.org is a wonderful resource for the popular formats (I.e. MLA). If it should come down to that.
~Andrew

===links===

Start Posting some stuff to source from:

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1
--"Introduction to flash memory"

http://portal.acm.org/citation.cfm?id=1244248
--"Wear Leveling" (it's about a proposed way of doing it, but explains a whole bunch of other things to do that)

http://portal.acm.org/citation.cfm?id=1731355
--"Online maintenance of very large random samples on flash storage" (ie dealing with the constraints of Flash Storage in a system that might actually be written to 100000 times)

http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf
--"An Efficient NAND Flash File System for Flash Memory Storage" discuses shortcomings of using hard disk based file systems and current flash based file systems

http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf
--"NAND vs NOR Flash Memory" (note: i didn't get this off of Google scholar but it seems to be written by someone from Toshiba. is that ok?)

Hi everybody,

So here are the latest news. Geoff, Andrew and myself had a meeting after class today and came up with a plan for writing this thing.

We decided to have 3 parts:

1. What flash storage is, why its good but also why it must have the problems that it does (the assumption is that it must have them, why would it otherwise?)
[don't know much about this just now... basics include that there is NOR (reads slightly faster)and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash with NAND being especially popular for storage (what's NOR good for?). Here, we'd ideally want to talk about why flash was invented (supposed as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we'd want to mention why this technical functionality was innovative and useful but also why it came with two serious set-backs: having a limited-number of re-write cycles and needing to erase a block at a time.]

Either way, Flash storage affords far faster fetch times than the traditional platter-based HDD, and stability of information in a sense. Where the data is not actually stored, but reprogrammed, in a sense, the data is more secure and is less likely to be erased easily. On that note, in order to flip a single bit, that entire block will need to be erased, then reprogrammed. In an 'old' HDD, let's say, When the HDD fails at the end of its life cycle, your data is gone. (unless you're willing to shell out $200/hr to have it recovered, yes I've seen companies in Ottawa that do this.) In a flash HDD, when it reaches the end of its life, it merely becomes read-only. Bugger for Databases, but useful for technical notes and archives, let's say.
With today's modern gaming computers, Flash memory can be good on quick load times, however with limited read-writes, it could afford better use to things that are not updated as frequently. I.e... Well I don't have a better example than a webserver hosting a company's CSS and scripts.
~Source: Years in the 'biz

Flash memory started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically. The first flash memory product came out in 1988 but it did not take off until the late 1990’s because it could not be reliable produced. NOR and NAND memory is named after the arrangement of the cells in the memory array. NOR based flash memory benefits from having very fast burst read times but slower write times. Due to the structure of NOR memory programs stored in NOR based memory can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively fast. NAND is more suited for storage while NOR memory is better suited for direct program execution such as in CMOS chips.
source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1 , http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf

2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match
[the obvious answer seems to be that traditional file-systems could just write to whatever memory was available but if they did this with a flash file-systems, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seeking times which means that they want to organize their data in such a way as to reduce those (by putting related things together?) - with Flash, this isn't really a problem and thus one constraint the less to be concerned with.]

3. How a log based file-system works and why this method of operation is so well suited to working with flash memory especially in light of the latter's inherent limitations
[...]

At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here.

Then, we will meet again on Thursday after class to decide how to actually write the essay.

PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research would on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing.

Fedor

-- I'm not that great (but absolutely horrid) at essays and I'm alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday so I can probably spend a bunch of time working on it just before it's due. -- ''Nick L''

-- Hay sorry I was unable to attend the meeting after class today. I am not too good at writing essays as well but I am pretty good at summarizing and researching. I am not too sure at what you would like me to do. Right now I'll assume you need me to research/summarizing articles for the 3 topics above. If you need me to do anything else post it here. I'll be checking the discussion regularly until this due. once again sorry for missing the meeting-- Paul Cox.

-- Hey i'm also supposed to be in on this. Sorry i couldn't contribute sooner because i was playing catchup in my other classes. Let me know what i can do and i'll be on it asap. - kirill (k.kashigin@gmail.com)
update: i'm gonna be helping Fedor with #2

PS, this article http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.

---same article as above but shorter link: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142

PPS, and this article looks really great for understanding how log based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&key2=3656986821&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973

Hey Luc (TA) here, Anandtech ran a series of articles on solid state drives that you guys might find useful. It mostly looked at hardware aspects but it gives some interesting insights on how to modify file systems to better support flash memory.

http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403

http://www.anandtech.com/storage/showdoc.aspx?i=3531&p=1

http://anandtech.com/storage/showdoc.aspx?i=3631

--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)

Hey Paul&Kirill,

If one of you guys could help me out with #2, that would be really great. I was going to work on that tomorrow, but I also have another large assignment to deal with and not having to do this research would greatly ease my life. Moreover, I do intend to work on writing&polishing the essay on Thursday as I have a lot of experience with that and it far more than research. Let me know if either one of you can help me with this.

The other person could probably read over what Luc posted for us and see if it fits into our framework. Just be sure to state who is going to do what.

Nick,

Honestly, we really hope to have the research done by Thursday. If that is the only day that you are free and you're not a writer, I'm honestly not sure what you could do. Perhaps someone else can think of something.

- Fedor

I'm gonna have something for #2 up tonight. -kirill

So I found this article on Reddit, posted from Linux Weekly News on pretty much exactly what we are looking at. It's entitled "Solid-state storage devices and the block layer"

http://lwn.net/SubscriberLink/408428/68fa8465da45967a/ --[[User:Gsmith6|Gsmith6]] 20:36, 13 October 2010 (UTC)

I wasn't exactly sure how much information i was supposed to present but here's what i got for #2:

Most conventional file systems are designed to me implemented on hard disk drives. This fact does not mean they cannot be implemented on a solid state drive (file storage that uses flash memory instead of magnetic discs). It would however, in many ways, defeat the purpose of using flash memory. The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically so there is no need to waste resources laying out the data in close proximity.
A traditional HDD file system will also attempt to defragment itself, moving blocks of data around for closer proximity on the magnetic disk. This process, although beneficial for HDD's, is harmful and inefficient for flash based storage. A flash optimal file system needs to reduce the amount of erase operations, since flash memory only has a limited amount of erase cycles as well as having very slow erase speeds.
When an HDD rewrites data to a physical location there is no need for it to erase the previously occupying data first, so a traditional disk based file system doesn't worry about erasing data from unused memory blocks. In contrast flash memory needs to first erase the data block before it can modify any of it contents. Since the erase procedure is extremely slow, its not practical to overwrite old data every time. It is also decremental to the life span of flash memory.
To maximize the potential of flash based memory the file system would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to erase unused blocks when the system is idle, which does not get implemented in conventional file systems since it is not needed.

--kirill

So Fedor and I were talking in the labs, and we came to the conclusion that we have been focusing on just the translation from a regular file system to a flash drive. We were under the impression that this was in fact the "Flash Optimized System", but pulling up some more articles, I'm finding that this is not necessarily the case.

This paper here http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf.
shows an example from Axis Communications where they developed a file system specifically designed to be used on flash drives.

Now I haven't completely read it, so it might just be an optimized translational system, but its at least a start.

At our meeting today, we've decided that it would be best if people could post a rough summary of their notes in the appropriate sections, and I will rewrite them into an essay, which Fedor will go through later tonight to edit and add some more information.

Paul: I missed your comment that you weren't that great at writing, and good at research. If you want some articles behind the pay-walls, I've saved a bunch of them and emailed them to myself. Just email me (at the address at the top of the page) and I'll be more than happy to send some your way.

PS. some more references

Design tradeoffs for SSD performance http://portal.acm.org/citation.cfm?id=1404014.1404019
A log buffer-based flash translation layer using fully-associative sector translation http://delivery.acm.org/10.1145/1280000/1275990/a18-lee.pdf?key1=1275990&key2=0709607821&coll=GUIDE&dl=GUIDE&CFID=105787273&CFTOKEN=74601780

--[[User:Gsmith6|Gsmith6]] 15:03, 14 October 2010 (UTC)

I don't have any notes on this computer. >: I will be adding more to my section later on tonight. Sorry. ~Andrew

Hello dudes,

Just a quick note, try to include citations in your paragraphs - each time that you make a claim which came from evidence, put a little number [X, pp. page-number (if applicable)] into your text. Then, put the same [X] at the bottom of the page with the bibliographical information about the source. The prof hasn't yet gotten back to me about his preferred citation format, so just stick with this one for now:

Authors. ''Title''. Web-page. Date of article. Web (the word). Date you accessed it.

Here's an example:

[1] Kawaguchi, Nishioka, Tamoda. ''A Flash Memory Based File System''.
http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw . 1995. Web. Oct. 14, 2010.

Fedor

PS, its a good idea to check this fairly frequently between now and tomorrow morning - you never know when something will come up.

Phew... for a while there I was starting to think that I had nothing about the actual "Log-Based System", but it turns out that the "Transitional Layer" is the same thing. It looks like some articles are calling it the Log system, while others are calling it the transitional layer. Pretty sure I'm going to have an experts knowledge about flash drives after reading all these articles :P
--[[User:Gsmith6|Gsmith6]] 18:13, 14 October 2010 (UTC)

Hay Geoff this is what i got so far after reading a couple of the pdfs. the double tabed points are just my annotation on how they relate to the question.

Ware leveling: p1126-chang.pdf

* Uneven wearing of flash memory due to storing data close together
* Garbage collection prefers that no blocks have pages that have data that is constantly becoming invalid
* data that remains the same for longs periods of time should be moved from block that have not be written to much and moved to blocks that haven been erased frequently.

Log file structure: 926-rosenblum.pdf

* LFS based on assumption that frequently read files will be stored in cash and that the hard disk traffic will be dominated by writes
* Writes all new info to disk in a sequential structure called a log
* Data is stored permanently in these logs no other data is stored on the hard drive
* Converts many small random synchronous writes to a large asynchronous sequential write
** Good for flash because it cuts down on writing (prolongs drive life)
** It also good because it writes to a bigger section then a page. This means it can fill a block at a time so it doesn’t fill up other blocks with random writes that would later need to be cleaned. Cuts down on cleaning.
* Inode is stored in the log on the disk while an inode map is maintained in memory which points to the inode in the hard disk. as
** This is good for flash drives because reading does not hurt the drives life and it is fast.
** This means the map will not have to be updated on the disk as frequently cutting down on the writes.
* Log systems weakness is that it is susceptible to becoming fragmented due to the larger writes.
** Since flash drives do not require fragmentation this is fine. Also since flash drives have very fast random access the system does not become bogged down when the logs are fragmented
* Log system implements a cleaning system that scans a segment in and sees if there is live data in it. If there is a certain percentage of invalid data which goes according to the cleaning policy it will be cleaned. All the live data will be copied out and the segment will be erased.
** This is a garbage collector but its built into the file system.
* Segments contain a number of blocks. Segments can contain logs or parts of logs.

I'm still sifting though the other pdfs you sent me. Would you like me to post more info or should i format this into a paragraph or 2?

--Paul

I'm sorting through my notes and putting them into a digital format. Difficult because It's hard to remember what I copied directly out of the notes for my own reference, so I'm trying to de-plagarize. I'll upload what I have to the section by 20:30, but expect more updates. ~Andrew

Hey Paul,

Thanks for helping out with the research
I'm thinking that we probably have enough information (at least for the log-based system), so maybe if you could "try" and putting them into paragraphs. I'm kind of in the middle of typing up the essay, and I'm uploading new information every 10-20 minutes or so. Even if you could get something into paragraph form, I can go through and edit, rearrange, and add stuff as I get there. --[[User:Gsmith6|Gsmith6]] 23:51, 14 October 2010 (UTC)

I'm just going to dump my entire wiki so far into my profile page, because everything's being updated so quickly.
[[User:Abujaki]] ~Andrew

I'm sitting at my computer working on much less immediately important homework if someone wants me to do something. I'm not a great writer but if there's something that needs doing I can do it. -- Nick [[User:Nlessard|Nlessard]] 00:55, 15 October 2010 (UTC)

Hey guys,

Keep up the good work!

Geoff, it would be good if you uploaded the stuff you were typing as you got each section finished. I rewrote part 2 and the intro some hours ago and I guess I could go out and write a conclusion while stuff is still being updated.

The other thing is that the current version of part 1 doesn't really seem to serve the argument - I mean, it doesn't actually tell us why flash has the constraints that it does on a technical level... and that's not surprising because that information seems to be hard to find. Still, I'm going to see if I can find out anything about that, though, so far, I haven't had much luck.

Apart from this, I'll try to get up early tomorrow to look over what's there: make sure the argument works, etc.

Good luck with the writing,

Fedor

I think it has to do with how some transistors are made. Give me a second to research that... ~Andrew

From source 20, page 269 sec 4.6.4
Stress Induced Oxide leakage
...These devices are either erased ... orboth programmed and erased by a high field (Fowler-Nordheim) injection of electrons into a very thin dielectric film to charge and discharge the floating gate. Unfortunately, [passing] this current through the oxide degrades the quality of the oxide and eventually leads to breakdown. The wearout of tunnel oxide films during high-field stress has been correlated with the buildup of both positive or negative trapped charges."... there are still some things missing from this explanation... Give me a couple more seconds... and my brain won't process this for some reason. =_= What can you guys make of this? ~Andrew

And I said NOR Flash memory had fast fetch times. I found what I was looking for, just not too sure how to explain it in text. Silicon Oxide is used for electron injection methods, including the Fowler-Nordheim Tunneling method, where it seems to say that an electric current is passed across the oxide and it thins enough for a current to be passed through a barrier into the oxide conduction... band?

Kay. New resource [19], the massively complicated theories and science would put anyone to sleep, but the basic point is The Silicon oxide layer will degrade in quality to the point where charge trapping begins to occur, the more it's used, and the more electrons that tunnel through it over time. It's like a kneaded eraser scenario. The more you use it, the poorer quality it gets until it gets all flakey and cannot do its job anymore. The same happens with the SiO2 layer in the floating gate transistors ~Andrew

hmmm... alright, I think I understand that, could you maybe try your best and add it into the essay? If not I can try and whip something up to put in there. thanks --[[User:Gsmith6|Gsmith6]] 02:39, 15 October 2010 (UTC)

Wow, I'm sorry I made you go into that... that does explain the limited erase-cycles, though. Was it the use of these materials that made flash innovative? How does that explain the block-sized erasures?

Going to go write a conclusion now.

Fedor

While looking for what you requested, I found an entire section on the oxide breakdown... oops. :P
So really nothing changes, As electrons tunnel through the oxide layer, it deteriorates. When the damage becomes too great, the oxide abruptly loses its insulative properties. (damn firefox, that's a word) Looking for the erasures now ~Andrew

Hey guys,

I'm off to sleep, but I'll be up at 5am to see what else I can do.

Fedor

Aw, I just missed you.
It turns out that erasing entire blocks is necessary based on how the chip is built. For both NOR and NAND, pairs of word lines share an erase line, which is located near the floating gate. It uses FN tunneling to erase two lines of words at a time, making the sector the smallest size you can erase simply because that's how it's physically wired.
i.e []-mem byte |-erase line

#[]|[][]|[]
#[]|[][]|[]
#[]|[][]|[]
#[]|[][]|[] ... If that makes any sense... The erase line is separate from the byte select or word select lines

Numbers for formatting sake.

I'm prolly gonna take Fedor's advice and head to bed, but I'll likely pop back on before I go to sleep in case anyone needs me to look up anything else. :P

--[[User:Lmundt|Lmundt]] 03:23, 15 October 2010 (UTC) Interesting article compaing speeds
https://noc.sara.nl/nrg/publications/RoN2010-D1.1.pdf

Wow, I can't believe I completely forgot. Hybrid Drives... I can't remember for the life of me whether they exist yet or not, but there will eventually exist a hard drive that's both flash and traditional platter. It allocates files that are not rewritten often, like an OS (well except windows and all its lovely updates) to the flash memory on install, and the remainder of the files (Documents, programs etc) to the traditional platter. Because you can't reformat an OS 100,000 times (no matter HOW badly you screw it up with viruses and the like), the flash memory will retain usability, and all the hotspots are on the platters, so you don't need to worry about minimizing disk usage.
I'm out for the night. Cheers all! ~Andrew

Alright guys... I've done as much as I can for tonight. I'm exhausted and I'm having troubles concentrating anymore. I added a few questions for people to answer as thehttp://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_1_2010_Question_10y read. I'm going to bed, hopefully people will be up early enough to fix some things that I may have missed because I sure as hell won't be :P. Really wish I could keep going though. --[[User:Gsmith6|Gsmith6]] 07:38, 15 October 2010 (UTC)

Hey y'all,

So, I'm just rewriting part 3 as we speak, trying to understand the invalid/pre-valid stuff - what is the purpose of this with relation to what we are talking about? Does it achieve even writing, or just keeps the same block from being written to by two processes at once? Also, is an invalidated block one that has been written to too many times?

Fedor

Moving right along, I'm still re-writing P3 and posting as I go. Some things are not clear to me, however, I have written those in [BRACKETED CAPITALS], if any of your are online, it would be useful if you could post disambiguations here.

Fedor

OK, I've gone over part 3. Certain things still don't necessarily make perfect and consistent sense. If one of you who has a solid understanding of of log-systems and how they work could go over it and see if what I've written is consistent with your understanding, that would be great.

Fedor

OK, I emailed Anil and he said we could take a bit of extra time (couple of hours?) to get the essay to its best possible form.

Here are the things I still find problematic about it:

- as I said, I am not positive about the data in part 3... particularly the invalid/prevalid stuff. As I said, I'd like for someone who understands this stuff to read what I've written and tell me if any connections have been omitted or misrepresented.

- part 1 is a messy collection of data. My original idea had been for it to explain what was innovative about flash design and how that lead to its particular advantages and disadvantages... but I don't think that we've found this information... I am pretty sure it has to do with the materials and the cirquit: the materials give a shorter life-span, but do they confer an advantage?... the circuit structure is what allows for for the fast access and compactness but is also what forces block-sized erasures. Right?

- we could use a final sentence... but unless it is both inspired and organically related to what the paper talks about, its better to leave it out.

Fedor

PS, I took away the bracketed caps for cosmetics but my concerns remain.

...rewriting part 3 again to make it consistent with Paul's research... part 1 is still a shambles.
Fedor

...I've changed the thesis to accommodate the part 1 more or less as is - should be better now. Still looking for feedback vis-a-vis part 3. -fdr

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:19:08Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to implement an file system to work with it.

There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hot-spots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:18:38Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to impliment an file system to work with it.

There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:10:18Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to impliment an file system to work with it.

Originally conceived as an improved alternative to EPROMs[7], Flash memory is non-volatile(power-supply independent) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

1. Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
2. Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
3. Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?
4. What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:09:19Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==

Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to impliment an file system to work with it.

Originally conceived as an improved alternative to EPROMs[7], Flash memory is non-volatile(power-supply independent) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:09:04Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=

First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems. The particular limitations that lead to this necessity are flash-memory’s limited number of erase-cycles, the slowness of its writes and erasures, as well as its need to conduct erasures one entire block at a time. A typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes and wear-leveling, as well as its efficiency through the better management of writes and erasures.

==Flash Memory==Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to impliment an file system to work with it.

Originally conceived as an improved alternative to EPROMs[7], Flash memory is non-volatile(power-supply independent) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T14:01:08Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes, wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==Though Flash memory has many advantages it also has several key disadvantages which are of key relevance for anyone attempting to impliment an file system to work with it.

Originally conceived as an improved alternative to EPROMs[7], Flash memory is non-volatile(power-supply independent) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy. This means that NAND is the superior form of flash for long term storage, while NOR is preferred for computer embedded read-only functions. [2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

The extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown to be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives offer exponentially faster read-speeds than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times (or just 10000 if its NOR flash).[14] This poses a problem because of flash memory's second significant drawback: when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

Thus, though flash does offer much faster read-times, it comes with the heavy constraints on a limited number of erase cycles, slow write and erase times and the necessity to be erased a block at a time.

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T13:42:32Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through the log-structure of writes, wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T13:41:39Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write. This is useful for flash systems because it allows them to time the expensive write operation in such a way as to not waste resources and also because, by filling up full blocks at at time, the log-structured file system makes sure that space is used to its fullest capacity. This last fact in turn ensures that it will take the memory a comparatively long time to be filled and that erasures and over-writes will only occur when they absolutely have to. As has been discussed, this is of great benefit to flash memory.

A log structured file system also keeps a log of how many times each block of data has been erased and this is good for conducting wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system arise from its log-structure but also have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T13:27:28Z

Filitche:

COMP 3000 Essay 1 2010 Question 10

2010-10-15T13:26:24Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not share the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write.

This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. Wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.

On another technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T13:23:31Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log that keeps the data which must still be written to disk and then writes it all in one big write.

This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.

On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T12:03:38Z

Filitche:

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T12:03:01Z

Filitche:

Hey all,

I think we should write down our emails here so we can further discuss stuff without having to login here.
('''***Note that discussions over email can't be counted towards your participation grade!***'''--[[User:Soma|Anil]])

Geoff Smith (gsmith0413@gmail.com) - gsmith6

Andrew Bujáki (abujaki [at] Connect or Live.ca)
***I'm usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you're adding me. :)

I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#
Which briefly touches on the issues of Flash memory. Specifically, inability to update in place, and limited write/erase cycles.

Inability to update in place could refer to the way the flash disk is programmed, instead of bit-by-bit, it is programmed block-by-block. A block would have to be erased and completely reprogrammed in order to flip one bit after it's been set.
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure

Limited write/erase: Flash memory typically has a short lifespan if it's being used a lot. Writing and erasing the memory (Changing, updating, etc) Will wear it out. Flash memory has a finite amount of writes, (varying on manufacturer, models, etc), and once they've been used up, you'll get bad sectors, corrupt data, and generally be SOL.
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear

Filesystems would have to be changed to play nicely with these constraints, where it must use blocks efficiently and nicely, and minimize writing/erasing as much as possible.

I found a paper that talks about the performance, capabilities and limitations of NAND flash storage.

Abstract: "This presentation provides an in-depth examination of the
fundamental theoretical performance, capabilities, and
limitations of NAND Flash-based Solid State Storage (SSS). The
tutorial will explore the raw performance capabilities of NAND
Flash, and limitations to performance imposed by mitigation of
reliability issues, interfaces, protocols, and technology types.
Best practices for system integration of SSS will be discussed.
Performance achievements will be reviewed for various
products and applications. "

Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf

There's no Starting place like Wikipedia, even if you shouldn't source it.

http://en.wikipedia.org/wiki/Flash_Memory

http://en.wikipedia.org/wiki/LogFS

http://en.wikipedia.org/wiki/Hard_disk

http://en.wikipedia.org/wiki/Wear_leveling

http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29

http://en.wikipedia.org/wiki/Solid-state_drive

Hey Guys,

We really don't have much time to get this done. Lets meet tomorrow after class and get our bearings to do this properly.

Fedor

A few of us have Networking immediately after class. I know personally I won't be able to make anything set on Tuesday.
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others.
As well, for bibliographical citing, http://bibme.org is a wonderful resource for the popular formats (I.e. MLA). If it should come down to that.
~Andrew

===links===

Start Posting some stuff to source from:

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1
--"Introduction to flash memory"

http://portal.acm.org/citation.cfm?id=1244248
--"Wear Leveling" (it's about a proposed way of doing it, but explains a whole bunch of other things to do that)

http://portal.acm.org/citation.cfm?id=1731355
--"Online maintenance of very large random samples on flash storage" (ie dealing with the constraints of Flash Storage in a system that might actually be written to 100000 times)

http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf
--"An Efficient NAND Flash File System for Flash Memory Storage" discuses shortcomings of using hard disk based file systems and current flash based file systems

http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf
--"NAND vs NOR Flash Memory" (note: i didn't get this off of Google scholar but it seems to be written by someone from Toshiba. is that ok?)

Hi everybody,

So here are the latest news. Geoff, Andrew and myself had a meeting after class today and came up with a plan for writing this thing.

We decided to have 3 parts:

1. What flash storage is, why its good but also why it must have the problems that it does (the assumption is that it must have them, why would it otherwise?)
[don't know much about this just now... basics include that there is NOR (reads slightly faster)and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash with NAND being especially popular for storage (what's NOR good for?). Here, we'd ideally want to talk about why flash was invented (supposed as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we'd want to mention why this technical functionality was innovative and useful but also why it came with two serious set-backs: having a limited-number of re-write cycles and needing to erase a block at a time.]

Either way, Flash storage affords far faster fetch times than the traditional platter-based HDD, and stability of information in a sense. Where the data is not actually stored, but reprogrammed, in a sense, the data is more secure and is less likely to be erased easily. On that note, in order to flip a single bit, that entire block will need to be erased, then reprogrammed. In an 'old' HDD, let's say, When the HDD fails at the end of its life cycle, your data is gone. (unless you're willing to shell out $200/hr to have it recovered, yes I've seen companies in Ottawa that do this.) In a flash HDD, when it reaches the end of its life, it merely becomes read-only. Bugger for Databases, but useful for technical notes and archives, let's say.
With today's modern gaming computers, Flash memory can be good on quick load times, however with limited read-writes, it could afford better use to things that are not updated as frequently. I.e... Well I don't have a better example than a webserver hosting a company's CSS and scripts.
~Source: Years in the 'biz

Flash memory started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically. The first flash memory product came out in 1988 but it did not take off until the late 1990’s because it could not be reliable produced. NOR and NAND memory is named after the arrangement of the cells in the memory array. NOR based flash memory benefits from having very fast burst read times but slower write times. Due to the structure of NOR memory programs stored in NOR based memory can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively fast. NAND is more suited for storage while NOR memory is better suited for direct program execution such as in CMOS chips.
source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1 , http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf

2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match
[the obvious answer seems to be that traditional file-systems could just write to whatever memory was available but if they did this with a flash file-systems, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seeking times which means that they want to organize their data in such a way as to reduce those (by putting related things together?) - with Flash, this isn't really a problem and thus one constraint the less to be concerned with.]

3. How a log based file-system works and why this method of operation is so well suited to working with flash memory especially in light of the latter's inherent limitations
[...]

At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here.

Then, we will meet again on Thursday after class to decide how to actually write the essay.

PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research would on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing.

Fedor

-- I'm not that great (but absolutely horrid) at essays and I'm alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday so I can probably spend a bunch of time working on it just before it's due. -- ''Nick L''

-- Hay sorry I was unable to attend the meeting after class today. I am not too good at writing essays as well but I am pretty good at summarizing and researching. I am not too sure at what you would like me to do. Right now I'll assume you need me to research/summarizing articles for the 3 topics above. If you need me to do anything else post it here. I'll be checking the discussion regularly until this due. once again sorry for missing the meeting-- Paul Cox.

-- Hey i'm also supposed to be in on this. Sorry i couldn't contribute sooner because i was playing catchup in my other classes. Let me know what i can do and i'll be on it asap. - kirill (k.kashigin@gmail.com)
update: i'm gonna be helping Fedor with #2

PS, this article http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.

---same article as above but shorter link: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142

PPS, and this article looks really great for understanding how log based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&key2=3656986821&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973

Hey Luc (TA) here, Anandtech ran a series of articles on solid state drives that you guys might find useful. It mostly looked at hardware aspects but it gives some interesting insights on how to modify file systems to better support flash memory.

http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403

http://www.anandtech.com/storage/showdoc.aspx?i=3531&p=1

http://anandtech.com/storage/showdoc.aspx?i=3631

--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)

Hey Paul&Kirill,

If one of you guys could help me out with #2, that would be really great. I was going to work on that tomorrow, but I also have another large assignment to deal with and not having to do this research would greatly ease my life. Moreover, I do intend to work on writing&polishing the essay on Thursday as I have a lot of experience with that and it far more than research. Let me know if either one of you can help me with this.

The other person could probably read over what Luc posted for us and see if it fits into our framework. Just be sure to state who is going to do what.

Nick,

Honestly, we really hope to have the research done by Thursday. If that is the only day that you are free and you're not a writer, I'm honestly not sure what you could do. Perhaps someone else can think of something.

- Fedor

I'm gonna have something for #2 up tonight. -kirill

So I found this article on Reddit, posted from Linux Weekly News on pretty much exactly what we are looking at. It's entitled "Solid-state storage devices and the block layer"

http://lwn.net/SubscriberLink/408428/68fa8465da45967a/ --[[User:Gsmith6|Gsmith6]] 20:36, 13 October 2010 (UTC)

I wasn't exactly sure how much information i was supposed to present but here's what i got for #2:

Most conventional file systems are designed to me implemented on hard disk drives. This fact does not mean they cannot be implemented on a solid state drive (file storage that uses flash memory instead of magnetic discs). It would however, in many ways, defeat the purpose of using flash memory. The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically so there is no need to waste resources laying out the data in close proximity.
A traditional HDD file system will also attempt to defragment itself, moving blocks of data around for closer proximity on the magnetic disk. This process, although beneficial for HDD's, is harmful and inefficient for flash based storage. A flash optimal file system needs to reduce the amount of erase operations, since flash memory only has a limited amount of erase cycles as well as having very slow erase speeds.
When an HDD rewrites data to a physical location there is no need for it to erase the previously occupying data first, so a traditional disk based file system doesn't worry about erasing data from unused memory blocks. In contrast flash memory needs to first erase the data block before it can modify any of it contents. Since the erase procedure is extremely slow, its not practical to overwrite old data every time. It is also decremental to the life span of flash memory.
To maximize the potential of flash based memory the file system would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to erase unused blocks when the system is idle, which does not get implemented in conventional file systems since it is not needed.

--kirill

So Fedor and I were talking in the labs, and we came to the conclusion that we have been focusing on just the translation from a regular file system to a flash drive. We were under the impression that this was in fact the "Flash Optimized System", but pulling up some more articles, I'm finding that this is not necessarily the case.

This paper here http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf.
shows an example from Axis Communications where they developed a file system specifically designed to be used on flash drives.

Now I haven't completely read it, so it might just be an optimized translational system, but its at least a start.

At our meeting today, we've decided that it would be best if people could post a rough summary of their notes in the appropriate sections, and I will rewrite them into an essay, which Fedor will go through later tonight to edit and add some more information.

Paul: I missed your comment that you weren't that great at writing, and good at research. If you want some articles behind the pay-walls, I've saved a bunch of them and emailed them to myself. Just email me (at the address at the top of the page) and I'll be more than happy to send some your way.

PS. some more references

Design tradeoffs for SSD performance http://portal.acm.org/citation.cfm?id=1404014.1404019
A log buffer-based flash translation layer using fully-associative sector translation http://delivery.acm.org/10.1145/1280000/1275990/a18-lee.pdf?key1=1275990&key2=0709607821&coll=GUIDE&dl=GUIDE&CFID=105787273&CFTOKEN=74601780

--[[User:Gsmith6|Gsmith6]] 15:03, 14 October 2010 (UTC)

I don't have any notes on this computer. >: I will be adding more to my section later on tonight. Sorry. ~Andrew

Hello dudes,

Just a quick note, try to include citations in your paragraphs - each time that you make a claim which came from evidence, put a little number [X, pp. page-number (if applicable)] into your text. Then, put the same [X] at the bottom of the page with the bibliographical information about the source. The prof hasn't yet gotten back to me about his preferred citation format, so just stick with this one for now:

Authors. ''Title''. Web-page. Date of article. Web (the word). Date you accessed it.

Here's an example:

[1] Kawaguchi, Nishioka, Tamoda. ''A Flash Memory Based File System''.
http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw . 1995. Web. Oct. 14, 2010.

Fedor

PS, its a good idea to check this fairly frequently between now and tomorrow morning - you never know when something will come up.

Phew... for a while there I was starting to think that I had nothing about the actual "Log-Based System", but it turns out that the "Transitional Layer" is the same thing. It looks like some articles are calling it the Log system, while others are calling it the transitional layer. Pretty sure I'm going to have an experts knowledge about flash drives after reading all these articles :P
--[[User:Gsmith6|Gsmith6]] 18:13, 14 October 2010 (UTC)

Hay Geoff this is what i got so far after reading a couple of the pdfs. the double tabed points are just my annotation on how they relate to the question.

Ware leveling: p1126-chang.pdf

* Uneven wearing of flash memory due to storing data close together
* Garbage collection prefers that no blocks have pages that have data that is constantly becoming invalid
* data that remains the same for longs periods of time should be moved from block that have not be written to much and moved to blocks that haven been erased frequently.

Log file structure: 926-rosenblum.pdf

* LFS based on assumption that frequently read files will be stored in cash and that the hard disk traffic will be dominated by writes
* Writes all new info to disk in a sequential structure called a log
* Data is stored permanently in these logs no other data is stored on the hard drive
* Converts many small random synchronous writes to a large asynchronous sequential write
** Good for flash because it cuts down on writing (prolongs drive life)
** It also good because it writes to a bigger section then a page. This means it can fill a block at a time so it doesn’t fill up other blocks with random writes that would later need to be cleaned. Cuts down on cleaning.
* Inode is stored in the log on the disk while an inode map is maintained in memory which points to the inode in the hard disk. as
** This is good for flash drives because reading does not hurt the drives life and it is fast.
** This means the map will not have to be updated on the disk as frequently cutting down on the writes.
* Log systems weakness is that it is susceptible to becoming fragmented due to the larger writes.
** Since flash drives do not require fragmentation this is fine. Also since flash drives have very fast random access the system does not become bogged down when the logs are fragmented
* Log system implements a cleaning system that scans a segment in and sees if there is live data in it. If there is a certain percentage of invalid data which goes according to the cleaning policy it will be cleaned. All the live data will be copied out and the segment will be erased.
** This is a garbage collector but its built into the file system.
* Segments contain a number of blocks. Segments can contain logs or parts of logs.

I'm still sifting though the other pdfs you sent me. Would you like me to post more info or should i format this into a paragraph or 2?

--Paul

I'm sorting through my notes and putting them into a digital format. Difficult because It's hard to remember what I copied directly out of the notes for my own reference, so I'm trying to de-plagarize. I'll upload what I have to the section by 20:30, but expect more updates. ~Andrew

Hey Paul,

Thanks for helping out with the research
I'm thinking that we probably have enough information (at least for the log-based system), so maybe if you could "try" and putting them into paragraphs. I'm kind of in the middle of typing up the essay, and I'm uploading new information every 10-20 minutes or so. Even if you could get something into paragraph form, I can go through and edit, rearrange, and add stuff as I get there. --[[User:Gsmith6|Gsmith6]] 23:51, 14 October 2010 (UTC)

I'm just going to dump my entire wiki so far into my profile page, because everything's being updated so quickly.
[[User:Abujaki]] ~Andrew

I'm sitting at my computer working on much less immediately important homework if someone wants me to do something. I'm not a great writer but if there's something that needs doing I can do it. -- Nick [[User:Nlessard|Nlessard]] 00:55, 15 October 2010 (UTC)

Hey guys,

Keep up the good work!

Geoff, it would be good if you uploaded the stuff you were typing as you got each section finished. I rewrote part 2 and the intro some hours ago and I guess I could go out and write a conclusion while stuff is still being updated.

The other thing is that the current version of part 1 doesn't really seem to serve the argument - I mean, it doesn't actually tell us why flash has the constraints that it does on a technical level... and that's not surprising because that information seems to be hard to find. Still, I'm going to see if I can find out anything about that, though, so far, I haven't had much luck.

Apart from this, I'll try to get up early tomorrow to look over what's there: make sure the argument works, etc.

Good luck with the writing,

Fedor

I think it has to do with how some transistors are made. Give me a second to research that... ~Andrew

From source 20, page 269 sec 4.6.4
Stress Induced Oxide leakage
...These devices are either erased ... orboth programmed and erased by a high field (Fowler-Nordheim) injection of electrons into a very thin dielectric film to charge and discharge the floating gate. Unfortunately, [passing] this current through the oxide degrades the quality of the oxide and eventually leads to breakdown. The wearout of tunnel oxide films during high-field stress has been correlated with the buildup of both positive or negative trapped charges."... there are still some things missing from this explanation... Give me a couple more seconds... and my brain won't process this for some reason. =_= What can you guys make of this? ~Andrew

And I said NOR Flash memory had fast fetch times. I found what I was looking for, just not too sure how to explain it in text. Silicon Oxide is used for electron injection methods, including the Fowler-Nordheim Tunneling method, where it seems to say that an electric current is passed across the oxide and it thins enough for a current to be passed through a barrier into the oxide conduction... band?

Kay. New resource [19], the massively complicated theories and science would put anyone to sleep, but the basic point is The Silicon oxide layer will degrade in quality to the point where charge trapping begins to occur, the more it's used, and the more electrons that tunnel through it over time. It's like a kneaded eraser scenario. The more you use it, the poorer quality it gets until it gets all flakey and cannot do its job anymore. The same happens with the SiO2 layer in the floating gate transistors ~Andrew

hmmm... alright, I think I understand that, could you maybe try your best and add it into the essay? If not I can try and whip something up to put in there. thanks --[[User:Gsmith6|Gsmith6]] 02:39, 15 October 2010 (UTC)

Wow, I'm sorry I made you go into that... that does explain the limited erase-cycles, though. Was it the use of these materials that made flash innovative? How does that explain the block-sized erasures?

Going to go write a conclusion now.

Fedor

While looking for what you requested, I found an entire section on the oxide breakdown... oops. :P
So really nothing changes, As electrons tunnel through the oxide layer, it deteriorates. When the damage becomes too great, the oxide abruptly loses its insulative properties. (damn firefox, that's a word) Looking for the erasures now ~Andrew

Hey guys,

I'm off to sleep, but I'll be up at 5am to see what else I can do.

Fedor

Aw, I just missed you.
It turns out that erasing entire blocks is necessary based on how the chip is built. For both NOR and NAND, pairs of word lines share an erase line, which is located near the floating gate. It uses FN tunneling to erase two lines of words at a time, making the sector the smallest size you can erase simply because that's how it's physically wired.
i.e []-mem byte |-erase line

#[]|[][]|[]
#[]|[][]|[]
#[]|[][]|[]
#[]|[][]|[] ... If that makes any sense... The erase line is separate from the byte select or word select lines

Numbers for formatting sake.

I'm prolly gonna take Fedor's advice and head to bed, but I'll likely pop back on before I go to sleep in case anyone needs me to look up anything else. :P

--[[User:Lmundt|Lmundt]] 03:23, 15 October 2010 (UTC) Interesting article compaing speeds
https://noc.sara.nl/nrg/publications/RoN2010-D1.1.pdf

Wow, I can't believe I completely forgot. Hybrid Drives... I can't remember for the life of me whether they exist yet or not, but there will eventually exist a hard drive that's both flash and traditional platter. It allocates files that are not rewritten often, like an OS (well except windows and all its lovely updates) to the flash memory on install, and the remainder of the files (Documents, programs etc) to the traditional platter. Because you can't reformat an OS 100,000 times (no matter HOW badly you screw it up with viruses and the like), the flash memory will retain usability, and all the hotspots are on the platters, so you don't need to worry about minimizing disk usage.
I'm out for the night. Cheers all! ~Andrew

Alright guys... I've done as much as I can for tonight. I'm exhausted and I'm having troubles concentrating anymore. I added a few questions for people to answer as thehttp://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_1_2010_Question_10y read. I'm going to bed, hopefully people will be up early enough to fix some things that I may have missed because I sure as hell won't be :P. Really wish I could keep going though. --[[User:Gsmith6|Gsmith6]] 07:38, 15 October 2010 (UTC)

Hey y'all,

So, I'm just rewriting part 3 as we speak, trying to understand the invalid/pre-valid stuff - what is the purpose of this with relation to what we are talking about? Does it achieve even writing, or just keeps the same block from being written to by two processes at once? Also, is an invalidated block one that has been written to too many times?

Fedor

Moving right along, I'm still re-writing P3 and posting as I go. Some things are not clear to me, however, I have written those in [BRACKETED CAPITALS], if any of your are online, it would be useful if you could post disambiguations here.

Fedor

OK, I've gone over part 3. Certain things still don't necessarily make perfect and consistent sense. If one of you who has a solid understanding of of log-systems and how they work could go over it and see if what I've written is consistent with your understanding, that would be great.

Fedor

OK, I emailed Anil and he said we could take a bit of extra time (couple of hours?) to get the essay to its best possible form.

Here are the things I still find problematic about it:

- as I said, I am not positive about the data in part 3... particularly the invalid/prevalid stuff. As I said, I'd like for someone who understands this stuff to read what I've written and tell me if any connections have been omitted or misrepresented.
- part 1 is a messy collection of data. My original idea had been for it to explain what was innovative about flash design and how that lead to its particular advantages and disadvantages... but I don't think that we've found this information... I am pretty sure it has to do with the materials and the cirquit: the materials give a shorter life-span, but do they confer an advantage?... the circuit structure is what allows for for the fast access and compactness but is also what forces block-sized erasures. Right?
- we could use a final sentence... but unless it is both inspired and organically related to what the paper talks about, its better to leave it out.

Fedor

PS, I took away the bracketed caps for cosmetics but my concerns remain.

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:50:39Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.

On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:49:45Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are.

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:49:17Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid.

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:48:58Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:36:58Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its wear-leveling, banks and garbage collector, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:35:33Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to its innovative design. Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it is able to optimize a flash unit's lifespan through wear-leveling and to better manage writes and erasures through the use of such tools as banks and the garbage collector.

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:31:29Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its unique advantages as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==

It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7]

Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:27:31Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

# What particularities/deficiencies of flash memory does any file-system which implements it have to take into account? What are some ways of dealing with them?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:24:36Z

Filitche:

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:22:58Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T11:19:56Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===
At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a

garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old

segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes

invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY? ARE YOU CONFUSING BLOCKS AND BANKS?]. Due

to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===

Thus, the advantages of the log-based file system mostly have to do with wear leveling and the optimization of writes and erasures on the block level. The

former is accomplished by means of the log itself and its intelligent use when it comes to deciding where to write. The latter, through certain technical

tools such as allocation-technology, banks and, of course, the garbage collector.

===Systems Developed For Flash===
In 1999 a Swedish company by the name of Axis Communications, developed and released a file system that was designed specifically to be run on a flash drive. Instead of mapping each physical address space with an emulated block sector, the system creates nodes that store data. The system keeps a log of the nodes and when each node was last updated. The system also keeps track of inodes. These inodes keep a list of nodes that correspond to the relevant data. The nodes also contain information about which inode it belongs to. When the drive is mounted, the system scans all the nodes on the drive, and rebuilds the directory.
As the directory gets built, the nodes containing the data also map the physical location for each piece of data.[14]

When writing data to the drive, the nodes carrying that data get attached to the end of the log.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

COMP 3000 Essay 1 2010 Question 10

2010-10-15T10:47:35Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
On a technical level, the FTL also works with banks. A bank is essentially a group of sequential addresses which keeps track of when it was last updated using timestamps.When the FTL receives a request to write something to memory, it uses a list of these to determine which area of the drive should be used. Then, it only writes to that bank, until it is full before switching to another. Very importantly, the system also keeps a Cleaning Bank list which indicates which banks need to be emptied out. This feature makes certain that writes and erasures are never mixed in the same bank and is also integral to the whole erasure process, which is in turn one of the key feature of the log-based file system. [8]

===Cleaner===

At the heart of this process is the garbage collector: When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks. Furthermore, because the collector does not erase every block the moment it becomes invalidated but ony does it by the bank, time is saved when the expensive erase operation is called [WHY EXACTLY?]. Due to the slowness of the erasures, the kernel typically conducts the cleaning when the drive is idle and many resources are available for this process. [SOURCE]

===Why a Log File System is efficient for flash had drive===
The file system only writes a bank at a time. This means that the OS can save up the small random writes and write them all at the same time in a bank. This will cut down on the use of the expensive write command improving the overall performance of the hard drive.[9]

If a collision occurs when writing a new bank, the file system will sends the new data to an empty bank rather then erasing the existing bank and replacing it. This will cut down on using the erase function improving the life of the hard drive. Also since it only performs writes command and not a erase command and write command like a traditional file system this improves performance of the drive as well.[16]

===Systems Developed For Flash===
In 1999 a Swedish company by the name of Axis Communications, developed and released a file system that was designed specifically to be run on a flash drive. Instead of mapping each physical address space with an emulated block sector, the system creates nodes that store data. The system keeps a log of the nodes and when each node was last updated. The system also keeps track of inodes. These inodes keep a list of nodes that correspond to the relevant data. The nodes also contain information about which inode it belongs to. When the drive is mounted, the system scans all the nodes on the drive, and rebuilds the directory.
As the directory gets built, the nodes containing the data also map the physical location for each piece of data.[14]

When writing data to the drive, the nodes carrying that data get attached to the end of the log.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T10:45:49Z

Filitche:

COMP 3000 Essay 1 2010 Question 10

2010-10-15T10:28:32Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs. At the time EPROMs needed a UV photoemission to be erased while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

This extreme read speed makes flash drives a preferred method of storing games. This effectively makes loading times virtually non-existent. There is however a downside to this method. Games constantly save, modify, and change files wearing out the blocks much quicker. Flash drives have been shown be effective for the use in web-servers for running their CSS scripts or HTML pages.

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

The transistors that store the data are created with a thin strip of Silicon Oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons down the strip, wiping whatever bits the transistors are holding.

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Translational Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as ''wear leveling''. To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system. For the reasons given above, HDD systems are unsuitable for use with flash memory.

==Flash Optimized File Systems==

Log-based file systems, however, do not the disadvantages of HDD file systems. Also known as Flash Transitional Layers (FTL), their distinguishing feature is a log which keeps track of how many times each memory address has been erased. This allows for it to conduct wear leveling - a process the purpose of which is to prevent particular blocks from being erased more than others. As already discussed, the purpose of such a practice is to reduce hot-spots. For example, wear-leveling writes data that does not change to blocks that have been erased frequently thus decreasing the probability that those blocks will be erased even more. Similarly, it tries to write evenly to all the blocks so as to maximize the hard-drive's overall lifespan. [3]

The FTL is what makes it possible to combine the above features with a more traditional memory-management structure. It has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. The FTL is able to achieve this by giving each block a flag which keeps track of its state. When blocks are being written to, the FTL marks them as allocated.

When a block is being written to, the FTL marks the blocks needed as "allocated". This prevents other data from being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. After this is finished, the relevant block's state becmes "pre-valid". Next, the block is marked as valid. [WHAT IS THIS INVALIDATED BLOCK BUSINESS? WHAT IS AN INVALIDATED BLOCK?]

===Banks===
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially a bank is a group of sequential addresses, that keeps track of when it was last updated using timestamps. The FTL will only write to that bank, and once there is not enough space to write anymore, it switches out the current bank for the one with the most available space. When cleaning up the bank, the system puts it into what is called the Cleaning Bank List and removes it from the Bank List, thus avoiding any chance of some data being written to that bank while something is being erased. [8]

===Cleaner===
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks and by not erasing every block as soon as it becomes invalidated, it saves on the amount of times that the expensive erase operation is called. The kernel can preemptively clean the drive when the system is idle.

===Why a Log File System is efficient for flash had drive===
The file system only writes a bank at a time. This means that the OS can save up the small random writes and write them all at the same time in a bank. This will cut down on the use of the expensive write command improving the overall performance of the hard drive.[9]

If a collision occurs when writing a new bank, the file system will sends the new data to an empty bank rather then erasing the existing bank and replacing it. This will cut down on using the erase function improving the life of the hard drive. Also since it only performs writes command and not a erase command and write command like a traditional file system this improves performance of the drive as well.[16]

===Systems Developed For Flash===
In 1999 a Swedish company by the name of Axis Communications, developed and released a file system that was designed specifically to be run on a flash drive. Instead of mapping each physical address space with an emulated block sector, the system creates nodes that store data. The system keeps a log of the nodes and when each node was last updated. The system also keeps track of inodes. These inodes keep a list of nodes that correspond to the relevant data. The nodes also contain information about which inode it belongs to. When the drive is mounted, the system scans all the nodes on the drive, and rebuilds the directory.
As the directory gets built, the nodes containing the data also map the physical location for each piece of data.[14]

When writing data to the drive, the nodes carrying that data get attached to the end of the log.

==Conclusion==

In this way, thanks to its use of banks for organizing data, the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

# Even though flash drives are exponentially faster than traditional HDDs, why are HDDs still the main method of data storage?
# Writing and erasing data are costly operations for a flash based storage drive. Why does modifying data (even a single bit) take the most amount of time?
# Why is the Flash Translational Layer so important to a flash drive's functionality? Why can you not use the traditional interface to deal with the block layer?

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T10:27:23Z

Filitche:

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T03:13:28Z

Filitche:

COMP 3000 Essay 1 2010 Question 10

2010-10-15T03:11:59Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

'''Possibly more on this later'''

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Transition Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as "wear leveling". To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system.

'''More on this later'''

==Flash Optimized File Systems==

The process of "wear leveling" [WHAT IS THIS?] is achieved through a Log-based File System, often referred to as the Flash Transitional Layer(FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translational layer has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag which keeps track of its state. When a block is being written to, the FTL marks the blocks needed as ''allocated''. This prevents other data being written to the block that has already been allocated. The FTL then goes on to write the data in the allocated blocks. Once it completes the transaction, the system updates the allocated blocks to ''pre-valid''. Once that is completed, the drive marks the invalidated blocks to ''invalid'', while marking the newly written block as ''valid''. This entire flagging process is to ensure that the newly allocated blocks are never mixed up with the invalidated blocks.

===Banks===
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially a bank is a group of sequential addresses, that keeps track of when it was last updated using timestamps. The FTL will only write to that bank, and once there is not enough space to write anymore, it switches out the current bank for the one with the most available space. When cleaning up the bank, the system puts it into what is called the Cleaning Bank List and removes it from the Bank List, thus avoiding any chance of some data being written to that bank while something is being erased.

===Cleaner===
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks and by not erasing every block as soon as it becomes invalidated, it saves on the amount of times that the expensive erase operation is called. The kernel can preemptively clean the drive when the system is idle.

=Conclusion=

In this way, thanks to its [WHATEVER MAKES LOG FSs ACTUALLY GOOD AT DEALING WITH FLASH], the log-based file-system is far better suited to working with flash memory than a traditional HDD file system. The latter is utterly unfit for this task due to its placing primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses, the limited number of erasure cycles, the necessity to erase by the block and the relative slowness of the erasures themselves, all have to do with erasing. A good flash memory file system must therefore be built with the aim of making the best of these weaknesses and this is precisely the reason why older disk-based file systems are not suitable for flash memory while log-based file systems are. [INSPIRATIONAL LAST WORDS]

=Questions=

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

=old=

Hey guys,

This is what I've got so far... mostly based on wikipedia:

Flash memory has two limitations: it can only be erased in blocks and and it wears out after a certain number of erase cycles. Furthermore, a particular kind of Flash memory (NAND) is not able to provide random access.
As a result of these Flash based file-systems cannot be handled in the same way as disk-based file systems. Here are a few of the key differences:

- Because memory must be erased in blocks, its erasure tends to take up time. Consequently, it is necessary to time the erasures in a way so as not to interfere with the efficiency of the system’s other operations. This is is not a real concern with disk-based file-systems.
- A disk file-system needs to minimize the seeking time, but Flash file-system does not concern itself with this as it doesn’t have a disk.
- A flash system tries to distribute memory in such a way so as not to make a particular block of memory subject to a disproportionally large number of erasures. The purpose of this is to keep the block from wearing out prematurely. The result of it is that memory needs to be distributed differently than in a disk based file-system.
Log-sturctured file systems are thus best suited to dealing with flash memory (they apparently do all of the above things).

For the essay form, I'm thinking of doing a section about traditional hard-disk systems, another about flash-memory and a third about flash systems. At this point, I am imagining the thesis as something like, "Flash systems require a fundamentally different system architecture than disk-based systems due to their need to adapt to the constraints inherent in flash memory: specifically, due to that memory's limited life-span and block-based erasures." The argument would then talk about how these two differences directly lead to a new FS approach.

That's how I see it at the moment. Honestly, I don't like doing research about this kind of stuff, so my data isn't very deep. That said, if you guys could find more info and summarize it, I'm pretty sure that I could synthesize it all into a coherent essay.

Fedor

COMP 3000 Essay 1 2010 Question 10

2010-10-15T02:56:01Z

Filitche:

=Question=

How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.

=Answer=
First introduced in the late 80s, Flash-memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different system architecture than disk-based file-systems: these systems need to be designed in light of flash-memory’s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file-system is not suitable for working with flash memory as it erases far too frequently and indiscriminately while being simultaneously optimized for other constraints that do not affect flash memory. This means that a different solution is necessary and that solution is the log-based file-system which is far better suited to working with flash memory because it optimizes erasures by [WHAT?].

==Flash Memory==
Flash memory is non-volatile(meaning digital storage that does not require power to retain its memory) storage space that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND. Each type has its advantages and disadvantages. NOR has the fastest read times, but is much slower at writing. NAND on the other hand has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]

More and more people use flash memory, with many sizes of drives, ranging from a few hundred megabyte USB key, to a few terabyte internal solid-state drive(SSD). Two main reasons for this movement are because of flash's extremely fast read times, and its falling price. A typical flash drive has read speeds of up to 14 times faster than a hard disk drive (HDD).[17]

Although flash drives are exponentially faster than HDDs, they still have not become the main source of data management. The reason for this is because HDDs are simply much cheaper, and flash drives still have many faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even if its a single bit, the entire block must be erased, and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than an HDD.[8]

'''Possibly more on this later'''

HDDs use a block system, in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to a physical memory address. It does through what is called a "Transition Layer".

==Traditionally Optimized File Systems==

Since the kernel asks for a block number, a conventional hard disk drive (HDD) file-system is not optimized to work with flash memory. The reason for this is that conventional hard-disks have different constraints from those of flash memory - their primary problem is to reduce seeking time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.

The most consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close-by on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentaion, a procedure used by HDDs to put files into more convenient configurations and thus minimize seeking times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.

This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase-cycles. Because of these, a flash optimal file system needs to minimize its erase operations and also to spread out its erasures in such a way as to avoid the formation of hot-spots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as "wear leveling". To minimize hotspots, a system using flash memory would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to conduct necessary erasure operations while the system is idle. It makes better sense to do these at this time because of the slow nature of erasures in the flash memory context. Of course, there is no such feature in a traditional HDD file-system.

'''More on this later'''

==Flash Optimized File Systems==

The process of "wear leveling" [WHAT IS THIS?] is achieved through a Log-based File System, often referred to as the Flash Transitional Layer(FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translational layer has a translation table, where each physical memory address is associated with an emulated block sectors. This allows a traditional file system that uses block sectors to be used on the flash drive.

===Banks===
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially a bank is a group of sequential addresses, that keeps track of when it was last updated using timestamps. The FTL will only write to that bank, and once there is not enough space to write anymore, it switches out the current bank for the one with the most available space.

=Conclusion=

=Questions=

=References=

[1] Kim, Han-joon; Lee, Sang-goo. ''A New Flash Memory Management for Flash Storage System''. ''IEEExplore''. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>

[2] Smith, Lance. ''NAND Flash Solid State Storage Performance and Capability''. ''Flash Memory Summit''. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>

[3] Chang, LiPin. ''On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems''. ''Association for Computing Machinery (ACM)''. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>

[4] Nath, Suman; Gibbons, Phillip. ''Online maintenance of very large random samples on flash storage''. ''Association for Computing Machinery (ACM)''. The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>

[5] Lim, Seung-Ho; Park; Kyu-Ho. ''An Efficient NAND Flash File System for Flash Memory Storage''. ''CORE Laboratory''. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>

[6] ''NAND vs. NOR Flash Memory Technology Overview''. ''RMG and Associates''. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>

[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. ''Introduction to Flash Memory''. ''IEEExplore''. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>

[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. ''A Flash-Memory Based File System''. ''CiteSeerX'' Advanced Research laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>

[9] Rosenblum, Mendel; Ouserhout, John. ''The Design and Implementation of a Log-structured File System''. ''Association for Computing Machinery (ACM)''. University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>

[10] Shimpi, Anand. ''Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives''. ''AnAndTech''. AnAndTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>

[11] Shimpi, Anand. ''The SSD Relapse: Understanding and Choosing the Best SSD''. ''AnAndTech''. AnAndTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>

[12] Shimpi, Anand. ''The SSD Anthology: Understanding SSDs and New Drives from OCZ''. ''AnAndTech''. AnAndTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>

[13] Corbet, Jonathan. ''Solid-State Storage Devices and the Block Layer''. ''Linux Weekly News''. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>

[14] Woodhouse, David. ''JFFS : The Journalling Flash File System''. ''CiteSeerX''. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>

[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark. Panigrahy, Rina. ''Design Tradeoffs for SSD Performance''. ''Association for Computing Machinery (ACM)'', USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>

[16] Lee, Sang-Won, et al. ''A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation''. ''Association for Computing Machinery (ACM)''. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>

[17] ''Reach New Heights in Computing Performance''. ''Micron Technology Inc''. Micro Technology Inc, Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>

[18] ''Flash Memories.'' 1 ed. New York: Springer, 1999. Print.

[19] ''Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 2008. Print.

[20] ''Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices''. ''IEEE Press Series on Microelectronic Systems''. New York: Wiley-Ieee Press, 1997. Print.

=External links=

Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Sold State Drive].

=old=

Hey guys,

This is what I've got so far... mostly based on wikipedia:

Flash memory has two limitations: it can only be erased in blocks and and it wears out after a certain number of erase cycles. Furthermore, a particular kind of Flash memory (NAND) is not able to provide random access.
As a result of these Flash based file-systems cannot be handled in the same way as disk-based file systems. Here are a few of the key differences:

- Because memory must be erased in blocks, its erasure tends to take up time. Consequently, it is necessary to time the erasures in a way so as not to interfere with the efficiency of the system’s other operations. This is is not a real concern with disk-based file-systems.
- A disk file-system needs to minimize the seeking time, but Flash file-system does not concern itself with this as it doesn’t have a disk.
- A flash system tries to distribute memory in such a way so as not to make a particular block of memory subject to a disproportionally large number of erasures. The purpose of this is to keep the block from wearing out prematurely. The result of it is that memory needs to be distributed differently than in a disk based file-system.
Log-sturctured file systems are thus best suited to dealing with flash memory (they apparently do all of the above things).

For the essay form, I'm thinking of doing a section about traditional hard-disk systems, another about flash-memory and a third about flash systems. At this point, I am imagining the thesis as something like, "Flash systems require a fundamentally different system architecture than disk-based systems due to their need to adapt to the constraints inherent in flash memory: specifically, due to that memory's limited life-span and block-based erasures." The argument would then talk about how these two differences directly lead to a new FS approach.

That's how I see it at the moment. Honestly, I don't like doing research about this kind of stuff, so my data isn't very deep. That said, if you guys could find more info and summarize it, I'm pretty sure that I could synthesize it all into a coherent essay.

Fedor

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T02:52:27Z

Filitche:

Talk:COMP 3000 Essay 1 2010 Question 10

2010-10-15T01:46:47Z

Filitche: