COMP 3000 Essay 1 2010 Question 10
Revision as of 07:20, 15 October 2010
Question
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.
Answer
First introduced in the late 1980s, flash memory is a light, compact, shock-resistant, non-volatile and efficiently readable type of storage. It started out as a replacement for EPROMs: at the time, EPROMs had to be exposed to ultraviolet light to be erased, while flash memory could be erased electrically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they must be designed around flash memory's limited number of erase cycles and its need to erase one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages with regard to [ TO WHAT?] as both are due to [TO WHAT?] . Thus, a typical disk-based file system is not suitable for flash memory: it erases far too frequently and indiscriminately, while being optimized for constraints that do not affect flash memory. A different solution is necessary, and that solution is the log-based file system, which is far better suited to flash memory because it optimizes erasures by writing new data to empty blocks and deferring the erasure of stale blocks to a cleaner.
Flash Memory
Flash memory is non-volatile storage (meaning digital storage that does not require power to retain its contents) that has become more popular recently due to its fast read times. There are two basic forms of flash storage, NOR and NAND, each with its advantages and disadvantages. NOR has the fastest read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, a lower cost, and a much longer life expectancy.[2]
Flash memory is increasingly widely used, in drives ranging from USB keys of a few hundred megabytes to internal solid-state drives (SSDs) of a few terabytes. The two main reasons for this shift are flash's extremely fast read times and its falling price. A typical flash drive can read up to 14 times faster than a hard disk drive (HDD).[17]
This read speed makes flash drives a preferred medium for storing games, since it makes loading times virtually non-existent. There is, however, a downside: games constantly save and modify files, which wears out the blocks much more quickly. Flash drives have also been shown to be effective in web servers for serving CSS files and HTML pages.
Although flash drives read much faster than HDDs, they have still not become the dominant storage medium. HDDs are simply much cheaper, and flash drives still have significant faults. The most critical is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because modifying a file, even by a single bit, requires the entire block to be erased and rewritten. This erase/rewrite cycle slows down the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]
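The cost of updating in place can be illustrated with a small sketch. The Block class, its 4096-byte size and the one-update-per-second workload are assumptions made for illustration only; the 100,000 figure is the approximate erase limit quoted above.

```python
ERASE_LIMIT = 100_000  # approximate per-block erase budget quoted in the text


class Block:
    """Toy erase block (hypothetical model, not a real device interface)."""

    def __init__(self, size=4096):  # block size is an assumed value
        self.data = bytearray(b"\xff" * size)  # modelled erased state
        self.erase_count = 0

    def erase(self):
        """Wipe the whole block and count the erasure."""
        self.data = bytearray(b"\xff" * len(self.data))
        self.erase_count += 1

    def update_in_place(self, offset, value):
        """Change one byte the way an in-place file system would."""
        snapshot = bytearray(self.data)  # 1. read the whole block
        snapshot[offset] = value         # 2. modify a single byte
        self.erase()                     # 3. erase the whole block
        self.data = snapshot             # 4. write everything back


block = Block()
for i in range(10):
    block.update_in_place(0, i)

# If a single block absorbed one such update per second, its erase
# budget would be used up after this many hours.
hours_until_worn_out = ERASE_LIMIT / 3600

print(block.erase_count)               # 10 erasures for 10 one-byte updates
print(round(hours_until_worn_out, 1))  # 27.8
```

Under these assumptions, ten one-byte changes cost ten full-block erasures, and a block that is rewritten once per second would exhaust its budget in a little over a day.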
Each transistor that stores data is insulated by a thin layer of silicon oxide. When the erase operation is called on the block where the transistors are located, electrons are forced through this layer, wiping whatever bits the transistors are holding. Repeated erasures gradually degrade the oxide, which is why the number of erase cycles is limited.
HDDs use a block system, in which the kernel specifies which blocks to read and write. On a flash drive, these blocks are emulated and mapped to physical memory addresses. This is done through what is called a "translation layer".
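A minimal sketch of such a mapping is given below, assuming a simple dictionary from emulated sector numbers to physical pages. The TranslationLayer class and all of its fields are hypothetical names for illustration, not the interface of any real drive.

```python
class TranslationLayer:
    """Toy mapping from emulated block sectors to physical pages."""

    def __init__(self, physical_pages):
        self.table = {}                           # emulated sector -> physical page
        self.free = list(range(physical_pages))   # pages never written so far
        self.storage = {}                         # physical page -> stored data
        self.stale = set()                        # pages holding superseded data

    def write(self, sector, data):
        page = self.free.pop(0)       # always write to a fresh page
        self.storage[page] = data
        old = self.table.get(sector)
        if old is not None:
            self.stale.add(old)       # the previous copy is now out of date
        self.table[sector] = page     # remap the sector to its new location

    def read(self, sector):
        return self.storage[self.table[sector]]


layer = TranslationLayer(physical_pages=4)
layer.write(7, b"first")
layer.write(7, b"second")  # same sector, different physical page
print(layer.read(7), layer.table[7], layer.stale)
```

The kernel keeps asking for sector 7 throughout, while the layer quietly moves the data to a new physical page on each write and remembers that the old page is stale.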
Traditionally Optimized File Systems
Because it is built around kernel requests for block numbers, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. Conventional hard disks have different constraints from those of flash memory: their primary problem is to reduce seek time, while the primary problem when working with flash memory is to erase in a minimal and balanced way.
The most time-consuming operation for an HDD is seeking data by repositioning the read head and waiting for the magnetic disk to spin into place. A traditional file system optimizes the way it stores data by placing related blocks close together on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure used on HDDs to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures that it entails are both inefficient and harmful for a flash memory unit.
This comes directly out of flash memory's aforementioned constraints: the slow block-sized erasures and the limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hotspots: sections of memory which have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as wear leveling. To minimize hotspots, a system using flash memory has to write new data to empty memory blocks. This method also calls for some sort of garbage collection to conduct the necessary erasure operations while the system is idle, which makes sense given how slow erasures are on flash memory. There is no such feature in a traditional HDD file system.
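The effect of spreading erasures can be shown with a small simulation. This is a sketch under simplifying assumptions: every write reuses a whole block and therefore costs exactly one erasure, and the function names choose_block and simulate are made up for this example.

```python
def choose_block(erase_counts, candidates):
    """Return the candidate block with the fewest erasures so far."""
    return min(candidates, key=lambda b: (erase_counts[b], b))


def simulate(num_blocks, num_writes, leveled):
    """Count erasures per block for a stream of whole-block rewrites."""
    erase_counts = [0] * num_blocks
    for _ in range(num_writes):
        if leveled:
            target = choose_block(erase_counts, range(num_blocks))
        else:
            target = 0  # naive policy: always rewrite the same block
        erase_counts[target] += 1  # reusing a block costs one erasure
    return erase_counts


naive = simulate(4, 1000, leveled=False)
leveled = simulate(4, 1000, leveled=True)
print(naive)    # [1000, 0, 0, 0]  one hotspot
print(leveled)  # [250, 250, 250, 250]
```

With the naive policy a single block absorbs every erasure, while the leveled policy divides the same workload evenly across all four blocks.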
Flash Optimized File Systems
The process of wear leveling ensures that the drive does not keep erasing and writing to the same block over and over. This is achieved by writing data that doesn't change often to blocks that have been erased frequently. Wear leveling tries to make all the blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is achieved through a log-based file system layer, often referred to as the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer has a translation table, in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive.
Each block has a flag which keeps track of its state. When data is being written, the FTL marks the blocks it needs as allocated, which prevents other data from being written to them. The FTL then writes the data to the allocated blocks. Once the write completes, the system updates the allocated blocks to pre-valid. Finally, the drive marks the superseded blocks as invalid and the newly written blocks as valid. This flagging process ensures that newly allocated blocks are never mixed up with invalidated ones.
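The flag transitions described above can be sketched as a small state machine. The state names allocated, pre-valid, valid and invalid come from the text; the FREE state, the rewrite function and the dictionaries used to hold flags and data are assumptions introduced for illustration.

```python
from enum import Enum


class State(Enum):
    FREE = "free"            # assumed name for a block that is not yet in use
    ALLOCATED = "allocated"
    PRE_VALID = "pre-valid"
    VALID = "valid"
    INVALID = "invalid"


def rewrite(states, data, old, new, payload):
    """Replace the contents of block `old` by writing `payload` to block `new`."""
    if states[new] is not State.FREE:
        raise ValueError("target block is already in use")
    states[new] = State.ALLOCATED   # reserve the block so nothing else uses it
    data[new] = payload             # write the data
    states[new] = State.PRE_VALID   # write finished, not yet the live copy
    states[old] = State.INVALID     # retire the superseded copy
    states[new] = State.VALID       # the new copy becomes the live one


states = {0: State.VALID, 1: State.FREE}
data = {0: b"old"}
rewrite(states, data, old=0, new=1, payload=b"new")
print(states[0].value, states[1].value)
```

Because the old block is only marked invalid after the new one has reached pre-valid, an interruption at any step leaves the flags in a combination that identifies which copy to trust.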
Banks
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially, a bank is a group of sequential addresses that keeps track of when it was last updated using timestamps. The FTL writes only to the current bank, and once that bank no longer has enough space, it switches to the bank with the most available space. When a bank is being cleaned, the system moves it from the Bank List to what is called the Cleaning Bank List, which prevents data from being written to that bank while it is being erased.
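The bank bookkeeping can be sketched as follows. The Bank and BankManager classes, the logical clock used as a timestamp, and the shortcut of resetting a cleaned bank to empty are all assumptions for illustration; the sketch also assumes that at least one bank always remains in the bank list.

```python
class Bank:
    """A run of sequential addresses with a last-update timestamp."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.used = 0
        self.last_update = 0

    @property
    def free(self):
        return self.capacity - self.used


class BankManager:
    def __init__(self, banks):
        self.bank_list = list(banks)  # banks that may receive writes
        self.cleaning_list = []       # banks currently being cleaned
        self.current = self.bank_list[0]
        self.clock = 0                # logical timestamp (assumption)

    def _switch(self):
        # Pick the bank with the most available space.
        self.current = max(self.bank_list, key=lambda b: b.free)

    def write(self, size):
        if self.current.free < size:
            self._switch()
            if self.current.free < size:
                raise RuntimeError("no bank has enough free space")
        self.current.used += size
        self.clock += 1
        self.current.last_update = self.clock
        return self.current.name

    def start_cleaning(self, bank):
        # A bank under cleaning must not be chosen for new writes.
        self.bank_list.remove(bank)
        self.cleaning_list.append(bank)
        if bank is self.current:
            self._switch()

    def finish_cleaning(self, bank):
        bank.used = 0  # simplification: assume everything was reclaimed
        self.cleaning_list.remove(bank)
        self.bank_list.append(bank)


a, b, c = Bank("a", 4), Bank("b", 8), Bank("c", 6)
manager = BankManager([a, b, c])
first = manager.write(3)   # fits in bank "a"
second = manager.write(3)  # "a" is too full, so the emptiest bank is chosen
manager.start_cleaning(a)
print(first, second, [bank.name for bank in manager.cleaning_list])
```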
Cleaner
When the FTL finds that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of its valid data into a new segment, then erases the old segment. This frees up the otherwise useless invalidated blocks, and because blocks are not erased as soon as they become invalid, it reduces the number of times the expensive erase operation is called. The kernel can also clean the drive preemptively when the system is idle.
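A minimal sketch of one cleaning pass follows. The text does not say how the segment to clean is selected, so picking the segment with the most invalid pages is an assumption, as are the function name clean and the representation of a page as a (payload, is_valid) pair.

```python
def clean(segments, erase_counts, spare):
    """Run one cleaning pass and return the id of the erased segment.

    segments:     dict mapping segment id -> list of (payload, is_valid) pages
    erase_counts: dict mapping segment id -> number of erasures so far
    spare:        id of an empty segment that receives the surviving data
    """
    # Assumed policy: clean the segment holding the most invalid pages.
    victim = max(
        (seg for seg in segments if seg != spare),
        key=lambda seg: sum(1 for _, valid in segments[seg] if not valid),
    )
    # Copy only the valid pages into the spare segment.
    segments[spare] = [page for page in segments[victim] if page[1]]
    # Erase the old segment in a single operation.
    segments[victim] = []
    erase_counts[victim] += 1
    return victim


segments = {
    0: [("a", True), ("b", False), ("c", False)],
    1: [("d", True), ("e", True)],
    2: [],
}
erase_counts = {0: 0, 1: 0, 2: 0}
victim = clean(segments, erase_counts, spare=2)
print(victim, segments, erase_counts)
```

In this example two invalidated pages are reclaimed with a single erasure, rather than one erasure each at the moment they became invalid.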
Why a Log File System is Efficient for Flash Drives
The file system only writes one bank at a time. This means that the OS can accumulate small random writes and write them all together into a bank. This cuts down on the use of the expensive write command, improving the overall performance of the drive.[9]
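The batching idea can be sketched as a buffer that is flushed once a bank's worth of records has accumulated. The LogWriter class and a bank size of four records are assumptions chosen for illustration.

```python
class LogWriter:
    """Collect small writes and emit them one full bank at a time."""

    def __init__(self, bank_size):
        self.bank_size = bank_size  # records per bank (assumed unit)
        self.pending = []           # small writes waiting in memory
        self.log = []               # each entry stands for one bank-sized write

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) == self.bank_size:
            self.flush()

    def flush(self):
        """Write out whatever is pending as a single operation."""
        if self.pending:
            self.log.append(tuple(self.pending))
            self.pending = []


writer = LogWriter(bank_size=4)
for record in range(10):
    writer.write(record)
print(len(writer.log), writer.pending)  # 2 bank writes so far, 2 records pending
```

Ten small writes result in only two write operations on the drive, with the remaining two records held back until the next flush.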
If a collision occurs when writing a new bank, the file system sends the new data to an empty bank rather than erasing the existing bank and replacing it. This cuts down on the use of the erase function, improving the life of the drive. Since it performs only a write command, rather than an erase followed by a write as a traditional file system would, this improves the performance of the drive as well.[16]
Systems Developed For Flash
In 1999, a Swedish company by the name of Axis Communications developed and released a file system designed specifically to run on flash: the Journalling Flash File System (JFFS). Instead of mapping each physical address to an emulated block sector, the system creates nodes that store data. The system keeps a log of the nodes and of when each node was last updated. The system also keeps track of inodes, each of which keeps a list of the nodes that hold its data. Each node also records which inode it belongs to. When the drive is mounted, the system scans all the nodes on the drive and rebuilds the directory. As the directory gets built, the nodes containing the data are also used to map the physical location of each piece of data.[14]
When data is written to the drive, the nodes carrying that data are appended to the end of the log.
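The node log and the mount-time scan can be sketched as follows. This is a simplified model rather than the actual on-flash format: the Node fields, the use of a per-inode version number to record update order, and the function names append_node and mount are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    inode: int     # which inode this node belongs to
    version: int   # assumed counter recording update order within an inode
    data: bytes


def append_node(log, inode, data):
    """Write new data by appending a node to the end of the log."""
    latest = max((n.version for n in log if n.inode == inode), default=0)
    log.append(Node(inode, latest + 1, data))


def mount(log):
    """Scan every node and rebuild the inode -> node positions mapping."""
    index = defaultdict(list)
    for position, node in enumerate(log):
        index[node.inode].append(position)  # position stands for physical location
    return dict(index)


log = []
append_node(log, 1, b"a")
append_node(log, 2, b"x")
append_node(log, 1, b"b")  # an update to inode 1 goes to the end of the log
index = mount(log)
print(index)                      # {1: [0, 2], 2: [1]}
print(log[index[1][-1]].data)     # b'b', the most recent data for inode 1
```

Nothing is overwritten in this model: the older node for inode 1 stays in the log as stale data until a cleaner reclaims it, and the scan recovers the full mapping from the nodes alone.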
Conclusion
In this way, thanks to its practice of appending new data to empty blocks and deferring erasures to a cleaner, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for this task because it prioritizes the minimization of seeks rather than the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type's particular weaknesses all have to do with erasing: the limited number of erase cycles, the necessity to erase by the block, and the relative slowness of the erasures themselves. A good flash memory file system must therefore be built to work around these weaknesses, and this is precisely why older disk-based file systems are not suitable for flash memory while log-based file systems are.
Questions
References
[1] Kim, Han-joon; Lee, Sang-goo. A New Flash Memory Management for Flash Storage System. IEEE Xplore. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1#>
[2] Smith, Lance. NAND Flash Solid State Storage Performance and Capability. Flash Memory Summit. SNIA Education Committee, 18 Aug 2009. <http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf>
[3] Chang, LiPin. On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems. Association for Computing Machinery (ACM). Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. <http://portal.acm.org/citation.cfm?id=1244248>
[4] Nath, Suman; Gibbons, Phillip. Online maintenance of very large random samples on flash storage. Association for Computing Machinery (ACM). The VLDB Journal, 27 Jul 2007. <http://portal.acm.org/citation.cfm?id=1731355>
[5] Lim, Seung-Ho; Park, Kyu-Ho. An Efficient NAND Flash File System for Flash Memory Storage. CORE Laboratory. IEEE Transactions On Computers, Jul 2006. <http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf>
[6] NAND vs. NOR Flash Memory Technology Overview. RMG and Associates. Toshiba America, accessed 14 Oct 2010. <http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf>
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. Introduction to Flash Memory. IEEE Xplore. STMicroelectronics, 21 May 2003. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1>
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda, Hiroshi. A Flash-Memory Based File System. CiteSeerX. Advanced Research Laboratory, Hitachi, Ltd., 1995. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142>
[9] Rosenblum, Mendel; Ousterhout, John. The Design and Implementation of a Log-structured File System. Association for Computing Machinery (ACM). University of California at Berkeley, Feb 1992. <http://portal.acm.org/citation.cfm?id=146943&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973&ret=1#Fulltext>
[10] Shimpi, Anand. Intel X25-M SSD: Intel Delivers One of the World's Fastest Drives. AnandTech. AnandTech, 8 Sep 2008. <http://www.anandtech.com/show/2614>
[11] Shimpi, Anand. The SSD Relapse: Understanding and Choosing the Best SSD. AnandTech. AnandTech, 30 Aug 2009. <http://www.anandtech.com/show/2829>
[12] Shimpi, Anand. The SSD Anthology: Understanding SSDs and New Drives from OCZ. AnandTech. AnandTech, 18 Mar 2009. <http://www.anandtech.com/show/2738>
[13] Corbet, Jonathan. Solid-State Storage Devices and the Block Layer. Linux Weekly News. Linux Weekly News, 4 Oct 2010. <http://lwn.net/Articles/408428/>
[14] Woodhouse, David. JFFS : The Journalling Flash File System. CiteSeerX. Red Hat, Inc, Accessed 14 Oct 2010. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf>
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. Design Tradeoffs for SSD Performance. Association for Computing Machinery (ACM), USENIX 2008 Annual Technical Conference, 2008. <http://portal.acm.org/citation.cfm?id=1404014.1404019>
[16] Lee, Sang-Won, et al. A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation. Association for Computing Machinery (ACM). ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. <http://portal.acm.org/citation.cfm?id=1275990>
[17] Reach New Heights in Computing Performance. Micron Technology Inc. Micron Technology Inc., Accessed 14 Oct 2010. <http://www.micron.com/products/solid_state_storage/client_ssd.html>
[18] Flash Memories. 1 ed. New York: Springer, 1999. Print.
[19] Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices. IEEE Press Series on Microelectronic Systems. New York: Wiley-IEEE Press, 2008. Print.
[20] Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices. IEEE Press Series on Microelectronic Systems. New York: Wiley-IEEE Press, 1997. Print.
External links
Relevant Wikipedia articles: Flash Memory, LogFS, Hard Disk Drives, Wear Leveling, Hot Spots, Solid State Drive.