Talk:COMP 3000 Essay 1 2010 Question 10
Hey all,
I think we should write down our emails here so we can further discuss stuff without having to login here. (***Note that discussions over email can't be counted towards your participation grade!***--Anil)
Geoff Smith (gsmith0413@gmail.com) - gsmith6
Andrew Bujáki (abujaki [at] Connect or Live.ca)
- I'm usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you're adding me. :)
I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&tag=1# which briefly touches on the issues of flash memory: specifically, the inability to update in place and the limited write/erase cycles.
Inability to update in place could refer to the way flash is programmed: it can't be rewritten bit-by-bit, only erased block-by-block. A whole block has to be erased and reprogrammed in order to flip one bit back after it's been set. http://en.wikipedia.org/wiki/Flash_memory#Block_erasure
Limited write/erase: Flash memory typically has a short lifespan if it's being used a lot. Writing and erasing the memory (changing, updating, etc.) will wear it out. Flash memory has a finite number of write/erase cycles (varying by manufacturer, model, etc.), and once they've been used up, you'll get bad sectors, corrupt data, and generally be SOL. http://en.wikipedia.org/wiki/Flash_memory#Memory_wear
Filesystems would have to be changed to play nicely with these constraints: they must use blocks efficiently and minimize writing/erasing as much as possible.
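To make the two constraints above concrete, here is a rough Python sketch of a single erase block. It is a toy model only: the class name `FlashBlock`, the helper `update_in_place`, the page count and the tiny `max_erases` limit are all made up for illustration and do not describe any real chip.

```python
class FlashBlock:
    """Toy model of one flash erase block (sizes and limits are illustrative)."""

    def __init__(self, pages=4, max_erases=3):
        self.pages = [0xFF] * pages   # erased flash reads back as all 1 bits
        self.erase_count = 0
        self.max_erases = max_erases  # hypothetical endurance limit

    def program(self, page, value):
        # Programming can only clear bits (1 -> 0); it can never set them back.
        if value & ~self.pages[page] & 0xFF:
            raise ValueError("needs a 0 -> 1 flip; erase the whole block first")
        self.pages[page] = value

    def erase(self):
        # Erasing resets every page in the block and uses up one erase cycle.
        if self.erase_count >= self.max_erases:
            raise RuntimeError("block worn out")
        self.pages = [0xFF] * len(self.pages)
        self.erase_count += 1


def update_in_place(block, page, value):
    """Rewrite one page the expensive way: save, erase, reprogram everything."""
    saved = list(block.pages)
    saved[page] = value
    block.erase()
    for i, v in enumerate(saved):
        block.program(i, v)
```

Every call to `update_in_place` costs a full block erase, which is why a file system that overwrites the same spot repeatedly would burn through the block's erase budget quickly.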
I found a paper that talks about the performance, capabilities and limitations of NAND flash storage.
Abstract: "This presentation provides an in-depth examination of the fundamental theoretical performance, capabilities, and limitations of NAND Flash-based Solid State Storage (SSS). The tutorial will explore the raw performance capabilities of NAND Flash, and limitations to performance imposed by mitigation of reliability issues, interfaces, protocols, and technology types. Best practices for system integration of SSS will be discussed. Performance achievements will be reviewed for various products and applications. "
Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf
There's no starting place like Wikipedia, even if you shouldn't source it.
http://en.wikipedia.org/wiki/Flash_Memory
http://en.wikipedia.org/wiki/LogFS
http://en.wikipedia.org/wiki/Hard_disk
http://en.wikipedia.org/wiki/Wear_leveling
http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29
http://en.wikipedia.org/wiki/Solid-state_drive
Hey Guys,
We really don't have much time to get this done. Let's meet tomorrow after class and get our bearings to do this properly.
Fedor
A few of us have Networking immediately after class. I know personally I won't be able to make anything set on Tuesday.
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others.
As well, for bibliographical citing, http://bibme.org is a wonderful resource for the popular formats (I.e. MLA). If it should come down to that.
~Andrew
links
Start Posting some stuff to source from:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1 --"Introduction to flash memory"
http://portal.acm.org/citation.cfm?id=1244248 --"Wear Leveling" (it's about a proposed way of doing it, but explains a whole bunch of other things to do that)
http://portal.acm.org/citation.cfm?id=1731355 --"Online maintenance of very large random samples on flash storage" (ie dealing with the constraints of Flash Storage in a system that might actually be written to 100000 times)
http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf --"An Efficient NAND Flash File System for Flash Memory Storage" discusses shortcomings of using hard disk based file systems and current flash based file systems
http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf --"NAND vs NOR Flash Memory" (note: i didn't get this off of Google scholar but it seems to be written by someone from Toshiba. is that ok?)
Hi everybody,
So here are the latest news. Geoff, Andrew and myself had a meeting after class today and came up with a plan for writing this thing.
We decided to have 3 parts:
1. What flash storage is, why it's good, but also why it must have the problems that it does (the assumption is that it must have them, why would it otherwise?) [don't know much about this just now... basics include that there is NOR (reads slightly faster) and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash, with NAND being especially popular for storage (what's NOR good for?). Here, we'd ideally want to talk about why flash was invented (supposedly as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we'd want to mention why this technical functionality was innovative and useful but also why it came with two serious setbacks: having a limited number of re-write cycles and needing to erase a block at a time.]
Either way, flash storage affords far faster fetch times than the traditional platter-based HDD, and stability of information in a sense. Because the data is not actually stored, but reprogrammed, in a sense, it is more secure and less likely to be erased easily. On that note, in order to flip a single bit, that entire block will need to be erased, then reprogrammed. In an 'old' HDD, let's say, when the HDD fails at the end of its life cycle, your data is gone (unless you're willing to shell out $200/hr to have it recovered; yes, I've seen companies in Ottawa that do this). In a flash HDD, when it reaches the end of its life, it merely becomes read-only. Bugger for databases, but useful for technical notes and archives, let's say. With today's modern gaming computers, flash memory can be good for quick load times; however, with limited write cycles, it is better used for things that are not updated as frequently. I.e... well, I don't have a better example than a webserver hosting a company's CSS and scripts. ~Source: Years in the 'biz
Flash memory started out as a replacement for EPROMs. At the time, EPROMs needed UV light exposure to be erased, while flash memory could be erased electronically. The first flash memory product came out in 1988, but it did not take off until the late 1990s because it could not be reliably produced. NOR and NAND memory are named after the arrangement of the cells in the memory array. NOR based flash memory benefits from having very fast burst read times but slower write times. Due to the structure of NOR memory, programs stored in NOR based memory can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively fast. NAND is more suited for storage, while NOR memory is better suited for direct program execution such as in CMOS chips. source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&tag=1 , http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf
2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match [the obvious answer seems to be that traditional file-systems could just write to whatever memory was available, but if they did this with flash, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seek times, which means that they want to organize their data in such a way as to reduce those (by putting related things together?) - with flash, this isn't really a problem and thus there is one less constraint to be concerned with.]
3. How a log based file-system works and why this method of operation is so well suited to working with flash memory especially in light of the latter's inherent limitations [...]
At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here.
Then, we will meet again on Thursday after class to decide how to actually write the essay.
PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing.
Fedor
-- I'm not that great (but absolutely horrid) at essays and I'm alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday so I can probably spend a bunch of time working on it just before it's due. -- Nick L
-- Hey, sorry I was unable to attend the meeting after class today. I am not too good at writing essays either, but I am pretty good at summarizing and researching. I am not too sure what you would like me to do. Right now I'll assume you need me to research/summarize articles for the 3 topics above. If you need me to do anything else, post it here. I'll be checking the discussion regularly until this is due. Once again, sorry for missing the meeting -- Paul Cox.
-- Hey i'm also supposed to be in on this. Sorry i couldn't contribute sooner because i was playing catchup in my other classes. Let me know what i can do and i'll be on it asap. - kirill (k.kashigin@gmail.com) update: i'm gonna be helping Fedor with #2
PS, this article http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.
---same article as above but shorter link: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142
PPS, and this article looks really great for understanding how log based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&key2=3656986821&coll=GUIDE&dl=GUIDE&CFID=108397378&CFTOKEN=72657973
Hey Luc (TA) here, Anandtech ran a series of articles on solid state drives that you guys might find useful. It mostly looked at hardware aspects but it gives some interesting insights on how to modify file systems to better support flash memory.
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403
http://www.anandtech.com/storage/showdoc.aspx?i=3531&p=1
http://anandtech.com/storage/showdoc.aspx?i=3631
--3maisons 19:44, 12 October 2010 (UTC)
Hey Paul&Kirill,
If one of you guys could help me out with #2, that would be really great. I was going to work on that tomorrow, but I also have another large assignment to deal with and not having to do this research would greatly ease my life. Moreover, I do intend to work on writing&polishing the essay on Thursday as I have a lot of experience with that, far more than with research. Let me know if either one of you can help me with this.
The other person could probably read over what Luc posted for us and see if it fits into our framework. Just be sure to state who is going to do what.
Nick,
Honestly, we really hope to have the research done by Thursday. If that is the only day that you are free and you're not a writer, I'm honestly not sure what you could do. Perhaps someone else can think of something.
- Fedor
I'm gonna have something for #2 up tonight. -kirill
So I found this article on Reddit, posted from Linux Weekly News on pretty much exactly what we are looking at. It's entitled "Solid-state storage devices and the block layer"
http://lwn.net/SubscriberLink/408428/68fa8465da45967a/ --Gsmith6 20:36, 13 October 2010 (UTC)
I wasn't exactly sure how much information i was supposed to present but here's what i got for #2:
Most conventional file systems are designed to be implemented on hard disk drives. This fact does not mean they cannot be implemented on a solid state drive (file storage that uses flash memory instead of magnetic discs). It would, however, in many ways defeat the purpose of using flash memory. The most time-consuming process for an HDD is seeking data by relocating the read-head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close together on the disk to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically, so there is no need to waste resources laying out the data in close proximity. A traditional HDD file system will also attempt to defragment itself, moving blocks of data around for closer proximity on the magnetic disk. This process, although beneficial for HDDs, is harmful and inefficient for flash based storage. A flash-optimal file system needs to reduce the number of erase operations, since flash memory has only a limited number of erase cycles as well as very slow erase speeds. When an HDD rewrites data to a physical location, there is no need for it to erase the previously occupying data first, so a traditional disk based file system doesn't worry about erasing data from unused memory blocks. In contrast, flash memory needs to first erase the data block before it can modify any of its contents. Since the erase procedure is extremely slow, it's not practical to overwrite old data every time. It is also detrimental to the life span of flash memory. To maximize the potential of flash based memory, the file system would have to write new data to empty memory blocks. This method would also call for some sort of garbage collection to erase unused blocks when the system is idle, which does not get implemented in conventional file systems since it is not needed.
--kirill
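The "write new data to empty blocks, garbage-collect later" idea in the paragraph above can be sketched roughly as follows. This is a toy Python model under stated assumptions, not any real file system: the class name `OutOfPlaceStore`, the block and page counts, and the rule "only erase blocks that are entirely stale" are all hypothetical simplifications. A real collector would also copy live pages out of mostly-stale blocks.

```python
class OutOfPlaceStore:
    """Toy out-of-place writer: updates go to erased pages, old copies go stale."""

    def __init__(self, num_blocks=4, pages_per_block=2):
        self.pages_per_block = pages_per_block
        # Each page is None (erased), "invalid" (stale) or a (key, data) tuple.
        self.blocks = [[None] * pages_per_block for _ in range(num_blocks)]
        self.where = {}                 # key -> (block index, page index)
        self.erases = [0] * num_blocks  # erase count per block

    def write(self, key, data):
        for b, block in enumerate(self.blocks):
            for p, slot in enumerate(block):
                if slot is None:
                    if key in self.where:
                        # Mark the old copy stale instead of erasing it now.
                        old_b, old_p = self.where[key]
                        self.blocks[old_b][old_p] = "invalid"
                    block[p] = (key, data)
                    self.where[key] = (b, p)
                    return
        raise RuntimeError("no erased pages left; run collect_garbage()")

    def read(self, key):
        b, p = self.where[key]
        return self.blocks[b][p][1]

    def collect_garbage(self):
        """Erase blocks holding only stale pages; return how many were erased."""
        freed = 0
        for b, block in enumerate(self.blocks):
            if all(slot == "invalid" for slot in block):
                self.blocks[b] = [None] * self.pages_per_block
                self.erases[b] += 1
                freed += 1
        return freed
```

The point of the sketch is that three updates to the same key cost one erase (done later, when idle) rather than three erases done in the write path.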
So Fedor and I were talking in the labs, and we came to the conclusion that we have been focusing on just the translation from a regular file system to a flash drive. We were under the impression that this was in fact the "Flash Optimized System", but pulling up some more articles, I'm finding that this is not necessarily the case.
This paper here http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&rep=rep1&type=pdf. shows an example from Axis Communications where they developed a file system specifically designed to be used on flash drives.
Now I haven't completely read it, so it might just be an optimized translational system, but it's at least a start.
At our meeting today, we've decided that it would be best if people could post a rough summary of their notes in the appropriate sections, and I will rewrite them into an essay, which Fedor will go through later tonight to edit and add some more information.
Paul: I missed your comment that you weren't that great at writing, and good at research. If you want some articles behind the pay-walls, I've saved a bunch of them and emailed them to myself. Just email me (at the address at the top of the page) and I'll be more than happy to send some your way.
PS. some more references
Design tradeoffs for SSD performance http://portal.acm.org/citation.cfm?id=1404014.1404019
A log buffer-based flash translation layer using fully-associative sector translation http://delivery.acm.org/10.1145/1280000/1275990/a18-lee.pdf?key1=1275990&key2=0709607821&coll=GUIDE&dl=GUIDE&CFID=105787273&CFTOKEN=74601780
--Gsmith6 15:03, 14 October 2010 (UTC)
I don't have any notes on this computer. >: I will be adding more to my section later on tonight. Sorry. ~Andrew
Hello dudes,
Just a quick note, try to include citations in your paragraphs - each time that you make a claim which came from evidence, put a little number [X, pp. page-number (if applicable)] into your text. Then, put the same [X] at the bottom of the page with the bibliographical information about the source. The prof hasn't yet gotten back to me about his preferred citation format, so just stick with this one for now:
Authors. Title. Web-page. Date of article. Web (the word). Date you accessed it.
Here's an example:
[1] Kawaguchi, Nishioka, Tamoda. A Flash Memory Based File System. http://docs.google.com/viewer?a=v&q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&hl=en&gl=ca&pid=bl&srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw . 1995. Web. Oct. 14, 2010.
Fedor
PS, it's a good idea to check this fairly frequently between now and tomorrow morning - you never know when something will come up.
Phew... for a while there I was starting to think that I had nothing about the actual "Log-Based System", but it turns out that the "Translation Layer" is the same thing. It looks like some articles are calling it the log system, while others are calling it the translation layer. Pretty sure I'm going to have an expert's knowledge about flash drives after reading all these articles :P
--Gsmith6 18:13, 14 October 2010 (UTC)
Hey Geoff, this is what I've got so far after reading a couple of the pdfs. The double-tabbed points are just my annotations on how they relate to the question.
Wear leveling: p1126-chang.pdf
- Uneven wearing of flash memory due to storing data close together
- Garbage collection prefers that no blocks have pages with data that is constantly becoming invalid
- Data that remains the same for long periods of time should be moved out of blocks that have not been written to much and into blocks that have been erased frequently.
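The wear-leveling points above can be illustrated with a small Python sketch. It is only an assumption-laden illustration, not the algorithm from the cited paper: the function names `pick_block_to_write` and `cold_data_move`, the per-block erase counters, and the `threshold` value are all made up here.

```python
def pick_block_to_write(erase_counts, free_blocks):
    """Dynamic wear leveling: reuse the free block erased the fewest times."""
    return min(free_blocks, key=lambda b: erase_counts[b])


def cold_data_move(erase_counts, cold_blocks, free_blocks, threshold=100):
    """Static wear leveling: suggest a (source, destination) move, or None.

    Cold (rarely changing) data sitting in a lightly worn block is moved to
    the most worn free block, so the lightly worn block can take hot writes.
    """
    if not cold_blocks or not free_blocks:
        return None
    src = min(cold_blocks, key=lambda b: erase_counts[b])
    dst = max(free_blocks, key=lambda b: erase_counts[b])
    # Only bother when the wear gap is large (threshold is a hypothetical knob).
    if erase_counts[dst] - erase_counts[src] >= threshold:
        return (src, dst)
    return None
```

For example, with erase counts `[5, 500, 20, 480]`, free blocks `[1, 3]` and cold data in blocks `[0, 2]`, new writes would go to block 3, and the cold data in block 0 would be a candidate to move into block 1.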
Log file structure: p26-rosenblum.pdf
- LFS is based on the assumption that frequently read files will be stored in cache and that hard disk traffic will be dominated by writes
- Writes all new info to disk in a sequential structure called a log
- Data is stored permanently in these logs; no other data is stored on the hard drive
- Converts many small random synchronous writes into one large asynchronous sequential write
    - Good for flash because it cuts down on writing (prolongs drive life)
    - It's also good because it writes a bigger section than a page. This means it can fill a block at a time, so it doesn't fill up other blocks with random writes that would later need to be cleaned. Cuts down on cleaning.
- The inode is stored in the log on the disk, while an inode map is maintained in memory which points to the inode on the hard disk
    - This is good for flash drives because reading does not hurt the drive's life and it is fast.
    - This means the map will not have to be updated on the disk as frequently, cutting down on the writes.
- The log system's weakness is that it is susceptible to becoming fragmented due to the larger writes.
    - Since flash drives do not require defragmentation, this is fine. Also, since flash drives have very fast random access, the system does not become bogged down when the logs are fragmented
- The log system implements a cleaning system that scans a segment and sees if there is live data in it. If there is a certain percentage of invalid data, as set by the cleaning policy, it will be cleaned. All the live data will be copied out and the segment will be erased.
    - This is a garbage collector, but it's built into the file system.
- Segments contain a number of blocks. Segments can contain logs or parts of logs.
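The notes above (buffered writes, sequential log, in-memory inode map, segment cleaner) can be tied together in a rough Python sketch. It is a toy under stated assumptions, not the LFS design from the paper: the class name `TinyLog`, the tiny `segment_size`, the `threshold` cleaning policy and the choice never to reuse an erased segment are all simplifications made up for illustration.

```python
class TinyLog:
    """Toy log-structured store: append-only segments plus an in-memory map."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.segments = [[]]   # each segment is a list of (name, data) records
        self.inode_map = {}    # name -> (segment index, offset), kept in memory
        self.buffer = []       # small writes waiting to be flushed together

    def write(self, name, data):
        """Buffer a small write; nothing is appended to the log yet."""
        self.buffer.append((name, data))

    def _append(self, name, data):
        if len(self.segments[-1]) == self.segment_size:
            self.segments.append([])   # tail is full, start a new segment
        seg = len(self.segments) - 1
        self.segments[seg].append((name, data))
        self.inode_map[name] = (seg, len(self.segments[seg]) - 1)

    def flush(self):
        """Turn many buffered writes into one sequential append."""
        for name, data in self.buffer:
            self._append(name, data)
        self.buffer = []

    def read(self, name):
        seg, off = self.inode_map[name]
        return self.segments[seg][off][1]

    def live_fraction(self, seg):
        records = self.segments[seg]
        if not records:
            return 0.0
        live = sum(1 for off, (name, _) in enumerate(records)
                   if self.inode_map.get(name) == (seg, off))
        return live / len(records)

    def clean(self, seg, threshold=0.5):
        """Copy live records out of a mostly-dead segment, then erase it."""
        if seg == len(self.segments) - 1:
            return False               # never clean the segment being written
        if self.live_fraction(seg) > threshold:
            return False               # still mostly live, not worth an erase
        live = [(name, data) for off, (name, data) in enumerate(self.segments[seg])
                if self.inode_map.get(name) == (seg, off)]
        self.segments[seg] = []        # the single erase for this segment
        for name, data in live:
            self._append(name, data)
        return True
```

Reads go through `inode_map`, so stale copies left behind in older segments are simply ignored until the cleaner reclaims them.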
I'm still sifting through the other pdfs you sent me. Would you like me to post more info or should I format this into a paragraph or 2?
--Paul
I'm sorting through my notes and putting them into a digital format. Difficult because it's hard to remember what I copied directly out of the notes for my own reference, so I'm trying to de-plagiarize. I'll upload what I have to the section by 20:30, but expect more updates. ~Andrew
Hey Paul,
Thanks for helping out with the research. I'm thinking that we probably have enough information (at least for the log-based system), so maybe you could "try" putting them into paragraphs. I'm kind of in the middle of typing up the essay, and I'm uploading new information every 10-20 minutes or so. Even if you could get something into paragraph form, I can go through and edit, rearrange, and add stuff as I get there. --Gsmith6 23:51, 14 October 2010 (UTC)
I'm just going to dump my entire wiki so far into my profile page, because everything's being updated so quickly. User:Abujaki ~Andrew
I'm sitting at my computer working on much less immediately important homework if someone wants me to do something. I'm not a great writer but if there's something that needs doing I can do it. -- Nick Nlessard 00:55, 15 October 2010 (UTC)
Hey guys,
Keep up the good work!
Geoff, it would be good if you uploaded the stuff you were typing as you got each section finished. I rewrote part 2 and the intro some hours ago and I guess I could go out and write a conclusion while stuff is still being updated.
The other thing is that the current version of part 1 doesn't really seem to serve the argument - I mean, it doesn't actually tell us why flash has the constraints that it does on a technical level... and that's not surprising because that information seems to be hard to find. Still, I'm going to see if I can find out anything about that, though, so far, I haven't had much luck.
Apart from this, I'll try to get up early tomorrow to look over what's there: make sure the argument works, etc.
Good luck with the writing,
Fedor
I think it has to do with how some transistors are made. Give me a second to research that... ~Andrew
From source 20, page 269, sec 4.6.4 Stress Induced Oxide Leakage: "...These devices are either erased ... or both programmed and erased by a high field (Fowler-Nordheim) injection of electrons into a very thin dielectric film to charge and discharge the floating gate. Unfortunately, [passing] this current through the oxide degrades the quality of the oxide and eventually leads to breakdown. The wearout of tunnel oxide films during high-field stress has been correlated with the buildup of both positive or negative trapped charges."... there are still some things missing from this explanation... Give me a couple more seconds... and my brain won't process this for some reason. =_= What can you guys make of this? ~Andrew
And I said NOR Flash memory had fast fetch times. I found what I was looking for, just not too sure how to explain it in text. Silicon Oxide is used for electron injection methods, including the Fowler-Nordheim Tunneling method, where it seems to say that an electric current is passed across the oxide and it thins enough for a current to be passed through a barrier into the oxide conduction... band?
Kay. New resource [19], the massively complicated theories and science would put anyone to sleep, but the basic point is The Silicon oxide layer will degrade in quality to the point where charge trapping begins to occur, the more it's used, and the more electrons that tunnel through it over time. It's like a kneaded eraser scenario. The more you use it, the poorer quality it gets until it gets all flakey and cannot do its job anymore. The same happens with the SiO2 layer in the floating gate transistors ~Andrew
hmmm... alright, I think I understand that, could you maybe try your best and add it into the essay? If not I can try and whip something up to put in there. thanks --Gsmith6 02:39, 15 October 2010 (UTC)
Wow, I'm sorry I made you go into that... that does explain the limited erase-cycles, though. Was it the use of these materials that made flash innovative? How does that explain the block-sized erasures?
Going to go write a conclusion now.
Fedor
While looking for what you requested, I found an entire section on the oxide breakdown... oops. :P
So really nothing changes, As electrons tunnel through the oxide layer, it deteriorates. When the damage becomes too great, the oxide abruptly loses its insulative properties. (damn firefox, that's a word) Looking for the erasures now ~Andrew
Hey guys,
I'm off to sleep, but I'll be up at 5am to see what else I can do.
Fedor
Aw, I just missed you.
It turns out that erasing entire blocks is necessary based on how the chip is built. For both NOR and NAND, pairs of word lines share an erase line, which is located near the floating gate. It uses FN tunneling to erase two lines of words at a time, making the sector the smallest size you can erase simply because that's how it's physically wired.
i.e []-mem byte |-erase line
- []|[][]|[]
- []|[][]|[]
- []|[][]|[]
- []|[][]|[] ... If that makes any sense... The erase line is separate from the byte select or word select lines
Numbers for formatting sake.
I'm prolly gonna take Fedor's advice and head to bed, but I'll likely pop back on before I go to sleep in case anyone needs me to look up anything else. :P
--Lmundt 03:23, 15 October 2010 (UTC) Interesting article comparing speeds https://noc.sara.nl/nrg/publications/RoN2010-D1.1.pdf
Wow, I can't believe I completely forgot. Hybrid Drives... I can't remember for the life of me whether they exist yet or not, but there will eventually exist a hard drive that's both flash and traditional platter. It allocates files that are not rewritten often, like an OS (well except windows and all its lovely updates) to the flash memory on install, and the remainder of the files (Documents, programs etc) to the traditional platter. Because you can't reformat an OS 100,000 times (no matter HOW badly you screw it up with viruses and the like), the flash memory will retain usability, and all the hotspots are on the platters, so you don't need to worry about minimizing disk usage. I'm out for the night. Cheers all! ~Andrew
Alright guys... I've done as much as I can for tonight. I'm exhausted and I'm having trouble concentrating anymore. I added a few questions for people to answer as they read (http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_1_2010_Question_10). I'm going to bed, hopefully people will be up early enough to fix some things that I may have missed because I sure as hell won't be :P. Really wish I could keep going though. --Gsmith6 07:38, 15 October 2010 (UTC)
Hey y'all,
So, I'm just rewriting part 3 as we speak, trying to understand the invalid/pre-valid stuff - what is the purpose of this with relation to what we are talking about? Does it achieve even writing, or just keeps the same block from being written to by two processes at once? Also, is an invalidated block one that has been written to too many times?
Fedor
Moving right along, I'm still re-writing P3 and posting as I go. Some things are not clear to me, however; I have written those in [BRACKETED CAPITALS]. If any of you are online, it would be useful if you could post disambiguations here.
Fedor
OK, I've gone over part 3. Certain things still don't necessarily make perfect and consistent sense. If one of you who has a solid understanding of log-systems and how they work could go over it and see if what I've written is consistent with your understanding, that would be great.
Fedor
OK, I emailed Anil and he said we could take a bit of extra time (couple of hours?) to get the essay to its best possible form.
Here are the things I still find problematic about it:
- as I said, I am not positive about the data in part 3... particularly the invalid/prevalid stuff. As I said, I'd like for someone who understands this stuff to read what I've written and tell me if any connections have been omitted or misrepresented.
- part 1 is a messy collection of data. My original idea had been for it to explain what was innovative about flash design and how that led to its particular advantages and disadvantages... but I don't think that we've found this information... I am pretty sure it has to do with the materials and the circuit: the materials give a shorter life-span, but do they confer an advantage?... the circuit structure is what allows for the fast access and compactness but is also what forces block-sized erasures. Right?
- we could use a final sentence... but unless it is both inspired and organically related to what the paper talks about, it's better to leave it out.
Fedor
PS, I took away the bracketed caps for cosmetics but my concerns remain.