<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Pcox</id>
	<title>Soma-notes - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Pcox"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php/Special:Contributions/Pcox"/>
	<updated>2026-05-03T14:03:49Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5560</id>
		<title>Talk:COMP 3000 Essay 2 2010 Question 4</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5560"/>
		<updated>2010-11-25T05:43:22Z</updated>

		<summary type="html">&lt;p&gt;Pcox: /* Critique */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Group Essay 2 =&lt;br /&gt;
&lt;br /&gt;
Hello Group. Please post your information here. I assume everybody read the email at your Connect account. Does anyone specifically want to send him the email with the group members inside? If not, I will just go ahead tomorrow at about 13:00 and send the email with the group members who wrote their contact information in here. - [[User:Sschnei1|Sschnei1]] 03:25, 15 November 2010 (UTC)&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sebastian Schneider sschnei1@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Matthew Chou mchou2@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Mark Walts mwalts@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Henry Irving hirving@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Jean-Benoit Aubin jbaubin@connect.carleton.ca &lt;br /&gt;
&lt;br /&gt;
Pradhan Nishant npradhan npradhan@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Only Paul Cox didn&#039;t answer the email I sent this morning. &lt;br /&gt;
&lt;br /&gt;
Cox     Paul    pcox&lt;br /&gt;
&lt;br /&gt;
And I just sent an email to the teacher. &lt;br /&gt;
&lt;br /&gt;
--Jean-Benoit&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Paper==&lt;br /&gt;
&lt;br /&gt;
 the paper&#039;s title, authors, and their affiliations. Include a link to the paper and any particularly helpful supplementary information.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Title:&#039;&#039;&#039; Accountable Virtual Machines&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Authors:&#039;&#039;&#039; Andreas Haeberlen, Paarijaat Aditya, Rodrigo Rodrigues, Peter Druschel&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Affiliates:&#039;&#039;&#039;&lt;br /&gt;
University of Pennsylvania, Max Planck Institute for Software Systems (MPI-SWS)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Link to Paper:&#039;&#039;&#039; [http://www.usenix.org/events/osdi10/tech/full_papers/Haeberlen.pdf Accountable Virtual Machines]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Supplementary Information:&#039;&#039;&#039; [http://research.microsoft.com/en-us/people/sriram/druschel.pptx Accountable distributed systems and the accountable cloud] - background of similar AVM implementation for distributed systems.&lt;br /&gt;
&lt;br /&gt;
==Background Concepts==&lt;br /&gt;
&lt;br /&gt;
 Explain briefly the background concepts and ideas that your fellow classmates will need to know first in order to understand your assigned paper.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountable Virtual Machine (AVM)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Deterministic Replay&#039;&#039;&#039;: A machine can record its execution to a log file so that it can later be replayed, allowing an observer to follow exactly what was happening on the machine. Remus [[#References | [1]]] contributed a highly efficient snapshotting mechanism for these replays.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountability:&#039;&#039;&#039; Accountability in the context of this paper means that every action performed on the virtual machine is recorded, and the record can be used to verify the correctness of the application. The AVM is responsible for its actions and must answer for them to an auditor. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Remote Fault Detection:&#039;&#039;&#039; Programs like GridCop[[#References | [2]]] can be used to monitor the progress and execution of a remotely executing program by requesting a beacon packet. When the remote computer sends the packets, the receiving/logging computer must be trusted (hardware, software, OS) so that the receipt of packets remains consistent. To detect a fault in a remote system, every packet must arrive safely, and any interrupts during logging must be handled, or the inconsistencies will produce an inaccurate outcome. The AVM does not require trusted hardware and can be used over wide-area networks.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Cheat Detection:&#039;&#039;&#039; Cheating in games or any specific modification in a program can be either scanned[[#References | [3][4]]] for or prevented[[#References | [5][6]]] by certain programs. The issue with these scanning and preventative software is the knowledge/awareness of specific cheats or situations that the software can handle. An AVM is designed to counter any kind of general cheat.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Integrity Violations:&#039;&#039;&#039; An integrity violation occurs when the observed behaviour of an execution does not match that of the host/reference (trusted) execution.&lt;br /&gt;
&lt;br /&gt;
- The word &amp;quot;node&amp;quot; is used to refer to a computer or server in order to represent the interactions between one computer and another, or a computer and a server.&lt;br /&gt;
&lt;br /&gt;
==Research problem== &lt;br /&gt;
&lt;br /&gt;
 What is the research problem being addressed by the paper? How does this problem relate to past related work?&lt;br /&gt;
**Possible alternative  for the first part : &lt;br /&gt;
&lt;br /&gt;
The research presented in this paper tackles a problem that has haunted computer scientists for a long time: how can you be sure that the software running on a remote machine is working correctly, or as intended? Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a trust relationship between users and a host. When a node (user or computer) expects some result or feedback from another node, it would hope that the interaction is independent of the particular node and depends only on the intended software. Say node A interacts with node B running execution exe1, and node A also interacts with node C, which should be running exe1 but has been modified and responds with exe2. The responses of B and C will then differ. Being able to prove beyond doubt that node C has been modified is the purpose of this paper. &lt;br /&gt;
***Let me know what you think about it. I removed the redundant part, and I think made it clearer and more concise. [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
** looks good to me, we&#039;ll put this part into the final essay instead of mine below --[[User:Mchou2|Mchou2]] 20:03, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
/// omit&lt;br /&gt;
&lt;br /&gt;
Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a system of trust between users and a host. These examples require a certain amount of trust in the interactions between one user and another, as well as between a user and a host. When a node (user or computer) expects some result or feedback from another node, it would hope that an interaction behaves the same regardless of which node it is done with. Say node A interacts with node B via execution exe1; when A and B then interact with node C, both expect execution exe1, but if node C behaves differently and executes exe2, it would be beneficial to be notified of this difference. Some examples make this concrete. Node A plays a game with node B, and the game executed on B is the same as on A; when A plays with node C, however, C executes the same operations as A plus a cheating program. Or: node A buys products from node B&#039;s server, which processes the order and then deletes A&#039;s sensitive information (execution 1); when A buys from node C&#039;s server, the order is processed, but the sensitive information A provided is also rerouted to another server where it can be used without permission. These are only a few cases in which the operations in an execution need to be logged and verified. The problem being addressed is to create a procedure by which a node can be held accountable, logging the operations in an execution to provide evidence of faults committed by that node. &lt;br /&gt;
&lt;br /&gt;
////&lt;br /&gt;
&lt;br /&gt;
Previous work on preventing or detecting integrity violations can be separated into several categories. The first is cheat detection: in many games there are cheats that users run to gain benefits not intended by the original game.[[#References |[4]]] These detectors are not dynamic, in the sense that they do not actually detect whether a cheat is being used; rather, they check whether a previously catalogued cheating operation is running on the user&#039;s system. For example, if a known cheating program named aimbot.exe can run in the background of a game such as CounterStrike, and the PunkBuster system installed on the user&#039;s machine already has aimbot.exe catalogued as a cheat by the developers, PunkBuster might notify the current game servers or even prevent the user from playing until the aimbot.exe process is no longer running. &lt;br /&gt;
&lt;br /&gt;
Accountability is another important problem that many have already worked on. The main goal of an accountable system is to be able to determine, without a doubt, that a node is faulty, and to prove it with solid evidence. It can also be used to defend a node against false accusations. Numerous systems already employ accountability, but they were mostly tied to specific applications, where a point of reference must be used for comparison. For example, PeerReview[[#References |[7]]], a system closely related to the research team&#039;s work, must be implemented inside the application, which makes it less portable and harder to deploy than an AVM. PeerReview verifies the inbound and outbound packets to see whether the software is running as intended. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Another related problem is remote fault detection in a distributed system: how can we determine whether a remote node is running the code correctly, or whether the machine itself is working as intended? Inspecting network activity is a common solution, since a node&#039;s inbound and outbound traffic reveals how the software is operating, or, in the case of an AVM, how the whole virtual machine is behaving. GridCop[[#References |[8]]] is an example that inspects a small number of packets periodically. Another way of detecting faults remotely is to use a trusted node, which can tell immediately if a fault occurs or a modification is made where it should not have been. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
-and anything else you would like to add or modify, or leave a note in the discussion section if you want me to relook at or change something. --[[User:Mchou2|Mchou2]] 20:10, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
The problem of logging and auditing the processes executed on a specific node (computer) depends heavily on prior work on deterministic replay. Deterministic replay programs create a log file that can be used to replay the operations of an execution on a node. Replaying the operations shows what the node was doing, and this would seem sufficient for finding out whether a node caused integrity violations. The issue with deterministic replay is not the snapshotting/recording of operations; it is that the data written to the replay log may be tampered with by the node itself so that the replay shows ideal results. By faking the results of its operations, the node can make the auditing computer falsely believe that it is running all operations normally. The logging done by these recording programs is directly related to the work needed to detect integrity violations.&lt;br /&gt;
&lt;br /&gt;
==Contribution==&lt;br /&gt;
&lt;br /&gt;
 What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The most useful contribution of the accountable virtual machine (AVM) proposed in this paper is the implementation of the accountable virtual machine monitor (AVMM), which enables fault checking of virtual machines in a cloud computing environment. The AVMM can be broken down into several parts: the virtual machine monitor (VMM), the tamper-evident log, and the auditing mechanisms. The VMM is based on the VMM found in VMWare Workstation 6.5.1[[#References |[9]]], the tamper-evident log was adapted from code in PeerReview[[#References |[7]]], and the audit tools were built from scratch. &lt;br /&gt;
&lt;br /&gt;
The accountable virtual machine monitor relies on four assumptions:&lt;br /&gt;
&lt;br /&gt;
1. All transmitted messages are received, if retransmitted sufficiently often.&lt;br /&gt;
&lt;br /&gt;
2. Machines and users have access to a hash function that is pre-image resistant, second pre-image resistant, and collision resistant.&lt;br /&gt;
&lt;br /&gt;
3. All parties have a certified keypair that can be used to sign messages.&lt;br /&gt;
&lt;br /&gt;
4. To audit a log, the user has a reference copy of the VM used.&lt;br /&gt;
The job of the AVMM is to record, in a tamper-evident log, all incoming and outgoing messages, along with enough information about the execution to enable deterministic replay. &lt;br /&gt;
&lt;br /&gt;
The AVMM must record nondeterministic inputs (such as hardware interrupts). Because such input is asynchronous, its exact timing must also be recorded so that the inputs can be injected at the same points during replay. Wall-clock time is not accurate enough for this, so the AVMM must use a combination of the instruction pointer, a branch counter, and possibly additional registers. Not all inputs have to be recorded this way; software interrupts, for example, are requests sent to the AVM that will be issued again during replay.&lt;br /&gt;
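As a toy illustration of why the timing of asynchronous inputs must be logged, the sketch below (an illustration only, not the paper's implementation; the step index stands in for the instruction-pointer/branch-counter position) records nondeterministic inputs together with their arrival points, then replays them by injecting each input at the same point:

```python
import random

def run(steps, inputs=None, log=None):
    """Toy deterministic-replay loop.

    Record mode (log given): nondeterministic inputs arrive at random steps
    and are written to the log keyed by their arrival step.
    Replay mode (inputs given): logged inputs are injected at the same steps.
    """
    state = 0
    for step in range(steps):
        if inputs is not None and step in inputs:
            value = inputs[step]                 # replay: inject logged input
        elif log is not None and random.random() < 0.3:
            value = random.randint(1, 100)       # record: nondeterministic event
            log[step] = value
        else:
            value = 0
        state = state * 31 + value               # deterministic computation
    return state

log = {}
original = run(50, log=log)            # record an execution
replayed = run(50, inputs=dict(log))   # replay it with inputs at the same points
assert original == replayed
```

Injecting the same values at different steps would generally produce a different final state, which is why wall-clock time alone is not precise enough.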
&lt;br /&gt;
Two parallel streams appear in the tamper-evident log: message exchanges and nondeterministic inputs. &lt;br /&gt;
It is important for the AVMM to detect inconsistencies between the user&#039;s log and the machine&#039;s log (in case of foul play), so the AVMM cross-references messages and inputs during replay, easily detecting any discrepancies.&lt;br /&gt;
&lt;br /&gt;
The AVMM periodically takes snapshots of the AVM&#039;s current state. This facilitates fine-grained audits for the user, but it also increases overhead. The overhead is reduced somewhat by making the snapshots incremental (only the state that has changed since the last snapshot is saved). The user can authenticate a snapshot using a hash tree of the state, generated by the AVMM, which updates the hash tree after each snapshot.&lt;br /&gt;
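The snapshot authentication can be sketched as follows; this is a minimal illustration, not the AVMM's actual code, and the SHA-256 choice and fixed chunking are assumptions. Hashing the state chunks into a tree means any modified chunk changes the root:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(chunks):
    """Root of a hash tree over the snapshot's state chunks."""
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

snapshot = [b"page0", b"page1", b"page2"]
root = merkle_root(snapshot)
assert root != merkle_root([b"page0", b"pageX", b"page2"])  # a changed chunk changes the root
```

A verifier who holds only the root (recorded in the log) can therefore detect any change to the snapshot state.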
&lt;br /&gt;
&#039;&#039;&#039;Tamper-Evident Log&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The log is made up of hash-chained entries.&lt;br /&gt;
Each log entry has the form e = (s, t, c, h), where&lt;br /&gt;
s = monotonically increasing sequence number&lt;br /&gt;
t = entry type&lt;br /&gt;
c = data associated with the type&lt;br /&gt;
h = hash value&lt;br /&gt;
&lt;br /&gt;
The hash value is calculated as h = H(h_(i-1) || s || t || H(c)),&lt;br /&gt;
where H() is a hash function, h_(i-1) is the hash value of the previous log entry,&lt;br /&gt;
and || denotes concatenation.&lt;br /&gt;
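A minimal sketch of this hash chain (illustrative only; SHA-256 and the byte encodings are assumptions, not taken from the paper) shows how editing any entry breaks the chain from that point on:

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder for h_(i-1) of the very first entry

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def append_entry(log, s: int, t: bytes, c: bytes):
    """Append entry e = (s, t, c, h) with h = H(h_(i-1) || s || t || H(c))."""
    prev = log[-1][3] if log else GENESIS
    h = H(prev + s.to_bytes(8, "big") + t + H(c))
    log.append((s, t, c, h))

def chain_ok(log) -> bool:
    """Recompute every hash; an edited entry no longer matches the chain."""
    prev = GENESIS
    for s, t, c, h in log:
        if h != H(prev + s.to_bytes(8, "big") + t + H(c)):
            return False
        prev = h
    return True
```

Because each hash covers the previous entry's hash, modifying or deleting any logged entry is detectable by anyone who can recompute the chain.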
&lt;br /&gt;
Each message is signed with the sender&#039;s private key; the AVMM logs messages with the signature attached but removes the signature before delivering a message to the AVM. To ensure nonrepudiation, an authenticator is attached to each outgoing message.&lt;br /&gt;
&lt;br /&gt;
To detect dropped messages, each party sends an acknowledgement for every message it receives. If an acknowledgement is not received, the message is resent a few times; if the user stops receiving messages entirely, the machine is presumed to have failed.&lt;br /&gt;
&lt;br /&gt;
To perform a log check, the user retrieves a pair of authenticators, then challenges the machine to produce the log segment between the two. The log is computationally infeasible to edit without breaking the hash chain; thus, if the log has been tampered with, the hash chain will not verify and the user will be notified of the tampering.&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Auditing Mechanism&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
From the VMM&#039;s perspective, the execution is deterministic.&lt;br /&gt;
&lt;br /&gt;
To perform an audit, the user:&lt;br /&gt;
&lt;br /&gt;
1. obtains a segment of the machine&#039;s log and the authenticators&lt;br /&gt;
&lt;br /&gt;
2. downloads a snapshot of the AVM at the beginning of the segment&lt;br /&gt;
&lt;br /&gt;
3. replays the entire segment, starting from the snapshot, to verify that the events in the log correspond to a correct execution of the software.&lt;br /&gt;
&lt;br /&gt;
The user can verify the execution of the software through three checks: verifying the log, the snapshot, and the execution.&lt;br /&gt;
&lt;br /&gt;
To verify a log segment, the user retrieves from the machine the authenticators whose sequence numbers fall in the range of the segment, then downloads the log segment itself, starting at the most recent snapshot before the beginning of the segment and ending at the most recent snapshot before the end of the segment. The user then checks the authenticators for tampering. If this step succeeds, the user can assume the log segment is authentic. If the machine is faulty, the segment will be unavailable for download, or the machine may return a corrupted log segment; either outcome can be used to convince a third party of the fault.&lt;br /&gt;
&lt;br /&gt;
To verify a snapshot, the user obtains the snapshot of the AVM&#039;s state at the beginning of the log segment. The user downloads the snapshot from the machine, and the hash tree is recomputed. The new hash tree is compared to the hash tree recorded in the original log segment. If any discrepancies are detected, the user can use them to convince a third party of the machine&#039;s fault.&lt;br /&gt;
&lt;br /&gt;
In order to verify the execution of a log segment, the user needs three inputs: the log segment, the snapshot, and the public keys of the machine and any users of the machine. The auditing tool performs two checks on the log segment: a syntactic check (is the log well-formed?) and a semantic check (does the information in the log correspond to a correct execution of the machine?).&lt;br /&gt;
&lt;br /&gt;
The syntactic check verifies that all log entries are in the proper format, that the signatures on each message and acknowledgement are valid, that each message was acknowledged, and that the sequence of sent and received messages matches the sequence of messages entering and exiting the AVM.&lt;br /&gt;
&lt;br /&gt;
The semantic check creates a local VM that executes the machine&#039;s log segment; the VM is initialized from a snapshot of the machine where possible. The local VM then replays the log segment while its behaviour is recorded. The auditing tool checks the log entries, inputs, outputs, and snapshot hashes of the replayed execution against the original log. Any discrepancy is reported as a fault and can be used as evidence of the fault.&lt;br /&gt;
&lt;br /&gt;
Why is it better?&lt;br /&gt;
[To Do]&lt;br /&gt;
&lt;br /&gt;
==Critique==&lt;br /&gt;
&lt;br /&gt;
 What is good and not-so-good about this paper? You may discuss both the style and content; be sure to ground your discussion with specific references. Simple assertions that something is good or bad is not enough - you must explain why.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
// first part of my writing; this is just part1 [[User:Sschnei1|Sschnei1]] 00:35, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
For the reader&#039;s comprehension, it is important for a paper/article/essay to have a good overview and layout. The introduction clearly describes what the reader should expect in the following pages, especially which problems are addressed and how they are solved. &lt;br /&gt;
&lt;br /&gt;
This paper gives multiple examples of the advantages and disadvantages of an AVM. A good example is &amp;quot;cheat detection&amp;quot;. Cheaters use programs that circumvent the original game code to gain a major advantage over other players. Since an AVM is generic in its cheat detection, it can detect a wider range of cheats than most other cheat-detection algorithms. The logs also give the game a replay function: players using an AVM can see how other players play by replaying the game from a player&#039;s log.&lt;br /&gt;
&lt;br /&gt;
The negative side is that the player may suffer under the AVM. Everything is logged and stored on the hard drive, which takes a large amount of space; in the paper&#039;s example it is 148 MB per hour after compression. This reduces the frame rate, and the connection through the AVM increases the ping time to the server. &lt;br /&gt;
&lt;br /&gt;
The test case for the AVM was detecting cheats in the popular online game Counter-Strike, using “Dell Precision T1500 workstations, with 8 GB of memory and 2.8 GHz Intel Core i7 860 CPUs” [pg 10]. These machines are considerably more powerful than Counter-Strike&#039;s system requirements of “500 MHz processor, 96 MB RAM” [10]; a 10-year-old game [10] should use few resources on a Dell Precision T1500 workstation. In comparison, newer games consume far more resources than Counter-Strike, leaving less room to run the AVM. A 13% slowdown [pg 12] in a game where you are only getting 30 to 40 fps is quite noticeable, and is detrimental to game play given that over 60 fps is considered optimal performance.&lt;br /&gt;
&lt;br /&gt;
In the paper the authors state that the AVM will only generate an extra 5 ms of latency. While this does not seem like a lot, the measurement was taken over a LAN with all the computers connected to the same switch [pg. 12]. This sample does not accurately represent real-life situations and therefore lacks external validity: many of these online games are played over the Internet, with participants sometimes not even on the same continent, so the latency overhead of the AVM would certainly increase with the added distance. [12]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&lt;br /&gt;
 You will almost certainly have to refer to other resources; please cite these resources in the style of citation of the papers assigned (inlined numbered references). Place your bibliographic entries in this section.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
[1] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and&lt;br /&gt;
A. Warfield. Remus: High availability via asynchronous virtual&lt;br /&gt;
machine replication. In Proceedings of the USENIX Symposium&lt;br /&gt;
on Networked Systems Design and Implementation (NSDI), Apr.&lt;br /&gt;
2008.&lt;br /&gt;
&lt;br /&gt;
[2] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[3] G. Hoglund. 4.5 million copies of EULA-compliant spyware.&lt;br /&gt;
http://www.rootkit.com/blog.php?newsid=358.&lt;br /&gt;
&lt;br /&gt;
[4] PunkBuster web site. http://www.evenbalance.com/.&lt;br /&gt;
&lt;br /&gt;
[5] N. E. Baughman, M. Liberatore, and B. N. Levine. Cheat-proof&lt;br /&gt;
playout for centralized and peer-to-peer gaming. IEEE/ACM&lt;br /&gt;
Transactions on Networking (ToN), 15(1):1–13, Feb. 2007.&lt;br /&gt;
&lt;br /&gt;
[6] C. Mönch, G. Grimen, and R. Midtstraum. Protecting online&lt;br /&gt;
games against cheating. In Proceedings of the Workshop on Network&lt;br /&gt;
and Systems Support for Games (NetGames), Oct. 2006.&lt;br /&gt;
&lt;br /&gt;
[7] A. Haeberlen, P. Kuznetsov, and P. Druschel. PeerReview: Practical&lt;br /&gt;
accountability for distributed systems. In Proceedings of&lt;br /&gt;
the ACM Symposium on Operating Systems Principles (SOSP),Oct. 2007.&lt;br /&gt;
&lt;br /&gt;
[8] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[9] VMWare Workstation 6.5.1 web site. http://www.vmware.com/products/workstation/&lt;br /&gt;
&lt;br /&gt;
[10] Counter-Strike http://store.steampowered.com/app/10/&lt;br /&gt;
&lt;br /&gt;
[12] Larry L. Peterson and Bruce S. Davie. Computer Networks: A Systems Approach, 2007&lt;br /&gt;
&lt;br /&gt;
=Discussion=&lt;br /&gt;
 We can use this area to discuss or leave notes on general ideas or whatever you want to write here.&lt;br /&gt;
&lt;br /&gt;
-The current due date posted on the site for this essay is November 25th  --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-I think that since we are given the headings to this article, we can easily choose what parts each member would like to work on, obviously since there are more members than parts, multiple members will have to work on the same parts or can work on all parts, I guess it&#039;s really up to you. I know that most people have a lot of projects coming up so let&#039;s try to get this done asap, or at least bit by bit so it&#039;s not something we have to worry too much about. --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I would like to do the Contribution or Critique. -- [[User:Sschnei1|Sschnei1]] 02:40, 20 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I can either work on Background Concepts, or Research problem. -[[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I&#039;m not sure whether the background concepts should be in point form or a paragraph, and whether it needs to be very long or not, but I shall work on both background concepts and research problem with you Jbaubin. --[[User:Mchou2|Mchou2]] 18:11, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Sounds good, and as I was going to post what I had for research problem, I just saw you posted a big chunk of it. I&#039;ll be out for a while, but tonight I&#039;ll take a serious look at what you wrote and add what I had written. - [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
- Sorry I didn&#039;t write anything yet to Critique. I&#039;m making my notes and will post something tonight or tomorrow. -- [[User:Sschnei1|Sschnei1]] 14:50, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I have started work on the contribution section. I&#039;ll have something up today or tomorrow. --[[User:Hirving|Hirving]] 19:55, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-if anyone has information that they are working on they can just post it up and at least others can look at it and maybe build up stuff on it, and I&#039;m sure everyone is aware of the extension that we got also, but let&#039;s try to finish this in the next few days --[[User:Mchou2|Mchou2]] 20:43, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I agree with finishing it in the next few days. Then we have more time to focus on other courses like 3004. I will post something later that night. -- [[User:Sschnei1|Sschnei1]] 21:29, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- Just added my contribution section, can someone proof read and sign it before I move it over to the essay. I didn&#039;t do the &amp;quot;why is it better&amp;quot; part because I found the implementation took a lot of writing. For anyone that wants to do the other part, I&#039;d suggest comparing AVMs to PunkBuster and/or VAC, and a cloud computing service (focusing on the auditing). Cheers --[[User:Hirving|Hirving]] 19:44, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I started that what is better/worse part in the Critique section. I will add the comparison with AVMs to Punkbuster and/or VAC soon. I personally feel like there is not that much to write for the Critique section. -- [[User:Sschnei1|Sschnei1]] 20:39, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Hey. I&#039;ve got a bit to add to your Critique section. It&#039;s mostly expanding on your last paragraph and a bit on how the tests were performed. I&#039;ll post my stuff later tonight; I just need to find some sources for my argument.--[[User:Pcox|Pcox]] 01:06, 25 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5559</id>
		<title>Talk:COMP 3000 Essay 2 2010 Question 4</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5559"/>
		<updated>2010-11-25T05:36:08Z</updated>

		<summary type="html">&lt;p&gt;Pcox: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Group Essay 2 =&lt;br /&gt;
&lt;br /&gt;
Hello Group. Please post your information here. I assume everybody read the email at your Connect account. Does anyone specifically want to send him the email with the group members inside? If not, I will just go ahead tomorrow at about 13:00 and send the email with the group members who wrote their contact information in here. - [[User:Sschnei1|Sschnei1]] 03:25, 15 November 2010 (UTC)&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sebastian Schneider sschnei1@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Matthew Chou mchou2@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Mark Walts mwalts@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Henry Irving hirving@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Jean-Benoit Aubin jbaubin@connect.carleton.ca &lt;br /&gt;
&lt;br /&gt;
Pradhan Nishant npradhan npradhan@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Only Paul Cox didn&#039;t answer the email I sent this morning. &lt;br /&gt;
&lt;br /&gt;
Cox     Paul    pcox&lt;br /&gt;
&lt;br /&gt;
And I just sent an email to the teacher. &lt;br /&gt;
&lt;br /&gt;
--Jean-Benoit&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Paper==&lt;br /&gt;
&lt;br /&gt;
 the paper&#039;s title, authors, and their affiliations. Include a link to the paper and any particularly helpful supplementary information.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Title:&#039;&#039;&#039; Accountable Virtual Machines&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Authors:&#039;&#039;&#039; Andreas Haeberlen, Paarijaat Aditya, Rodrigo Rodrigues, Peter Druschel&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Affiliates:&#039;&#039;&#039;&lt;br /&gt;
University of Pennsylvania, Max Planck Institute for Software Systems (MPI-SWS)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Link to Paper:&#039;&#039;&#039; [http://www.usenix.org/events/osdi10/tech/full_papers/Haeberlen.pdf Accountable Virtual Machines]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Supplementary Information:&#039;&#039;&#039; [http://research.microsoft.com/en-us/people/sriram/druschel.pptx Accountable distributed systems and the accountable cloud] - background of similar AVM implementation for distributed systems.&lt;br /&gt;
&lt;br /&gt;
==Background Concepts==&lt;br /&gt;
&lt;br /&gt;
 Explain briefly the background concepts and ideas that your fellow classmates will need to know first in order to understand your assigned paper.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountable Virtual Machine (AVM)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Deterministic Replay&#039;&#039;&#039;: A machine can record its execution to a file so that the execution can later be replayed, allowing an observer to follow exactly what was happening on the machine. Remus [[#References | [1]]] has contributed a highly efficient snapshotting mechanism that such replays can build on.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountability:&#039;&#039;&#039; Accountability in the context of this paper means that every action performed on the virtual machine is recorded and can be used to verify the correctness of the application. The AVM is responsible for its actions and must answer for them to an auditor. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Remote Fault Detection:&#039;&#039;&#039; Programs like GridCop [[#References | [2]]] can monitor the progress and execution of a remotely executing program by requesting beacon packets. When the remote computer sends these packets, the receiving/logging computer must be trusted (hardware, software, and OS) so that the reception of packets remains consistent. To detect a fault in a remote system, every packet must arrive safely, and any interruptions during logging must be handled, or the inconsistencies will produce an inaccurate outcome. An AVM, by contrast, does not require trusted hardware and can be used over wide-area networks.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Cheat Detection:&#039;&#039;&#039; Cheating in games, or any specific modification of a program, can either be scanned for [[#References | [3][4]]] or prevented [[#References | [5][6]]] by certain programs. The issue with such scanning and prevention software is that it must already know the specific cheats or situations it can handle. An AVM is designed to counter cheats in general.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Integrity Violations:&#039;&#039;&#039; This refers to a situation where the observed operations of an execution do not match those of the trusted host/reference execution; hence a violation has occurred.&lt;br /&gt;
&lt;br /&gt;
- The word &amp;quot;node&amp;quot; is used to refer to a computer or server in order to represent the interactions between one computer and another, or a computer and a server.&lt;br /&gt;
&lt;br /&gt;
==Research problem== &lt;br /&gt;
&lt;br /&gt;
 What is the research problem being addressed by the paper? How does this problem relate to past related work?&lt;br /&gt;
**Possible alternative  for the first part : &lt;br /&gt;
&lt;br /&gt;
The research presented in this paper tries to tackle a problem that has haunted computer scientists for a long time: how can you be sure that the software running on a remote machine is working correctly, or as intended? Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a trust relationship between users and a host. When a node (user or computer) expects some result or feedback from another node, it would hope that the interaction is independent of the particular node and depends only on the intended software. Say that node A interacts with node B running execution exe1, and node A also interacts with node C, which should run exe1 but has been modified and responds with exe2. The responses of B and C will then differ. Being able to prove beyond doubt that node C has been modified is the purpose of this paper. &lt;br /&gt;
***Let me know what you think about it. I removed the redundant part, and I think made it clearer and more concise. [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
** looks good to me, we&#039;ll put this part into the final essay instead of mine below --[[User:Mchou2|Mchou2]] 20:03, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
/// omit&lt;br /&gt;
&lt;br /&gt;
Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a system of trust between users and a host. There must be a certain amount of trust in the interactions between one user and another, as well as between a user and a host. When a node (user or computer) expects some result or feedback from another node, it would hope that an interaction with node A behaves the same as the same interaction with node B. Say node A interacts with node B via execution exe1; when A and B then interact with node C, they both expect to interact with exe1, but if node C behaves differently and executes exe2, it would be beneficial to be notified of this difference. Some examples make this concrete. Node A plays a game with node B, and the game executed on B is the same as on A; when A plays with node C, however, C executes the same operations as A plus a cheating program. Or node A buys products from node B&#039;s server, which processes the order and then deletes A&#039;s sensitive information (execution 1); when A buys from node C&#039;s server, the order is processed, but the sensitive information A provided is also rerouted to another server so that it can be used without permission. These are only a few cases where the operations in an execution need to be logged and verified. The problem being addressed is to create a procedure by which a node can be held accountable, logging the operations in an execution to provide evidence of faults committed by a node. &lt;br /&gt;
&lt;br /&gt;
////&lt;br /&gt;
&lt;br /&gt;
Previous work on preventing or detecting integrity violations can be separated into different categories. The first is cheat detection: in many games there are cheats that users employ to gain benefits not intended by the original game.[[#References |[4]]] These detectors are not dynamic, in the sense that they do not actually detect whether an arbitrary cheat is being used; rather, they check whether a previously catalogued cheating operation is running on the user&#039;s system. For example, if a known cheating program named aimbot.exe can run in the background of a game such as CounterStrike, and the PunkBuster system installed on the user&#039;s machine already has aimbot.exe logged as a cheating program by its developers, PunkBuster might notify the current game servers or even prevent the user from playing until the aimbot.exe process is no longer running. &lt;br /&gt;
&lt;br /&gt;
Accountability is another important problem that many have already worked on. The main goal of an accountable system is to determine beyond doubt that a node is faulty, and to prove it with solid evidence. It can also be used to defend a node against false accusations. Numerous systems already use accountability, but they are mostly tied to specific applications, where a point of reference must be used for comparison. For example PeerReview[[#References |[7]]], a system closely related to this work, must be implemented inside the application, which makes it less portable and harder to deploy than an AVM. PeerReview verifies the inbound and outbound packets and can determine whether the software is running as intended. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Another related problem is remote fault detection in a distributed system: how can we determine whether a remote node is running the code correctly, or whether the machine itself is working as intended? Observing network activity is a common solution, looking at the inbound and outbound traffic of the node. This reveals how the software is operating, or in the case of an AVM how the whole virtual machine is behaving. GridCop[[#References |[8]]], for example, inspects a small number of packets periodically. Another way of detecting faults remotely is to use a trusted node, which can tell immediately when a fault occurs or when a modification is made where it should not have been. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- and anything else you would like to add or modify; or leave a note in the discussion section if you want me to take another look or change something. --[[User:Mchou2|Mchou2]] 20:10, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
The problem of logging and auditing the execution of a specific node (computer) depends heavily on prior work on deterministic replay. Deterministic replay programs create a log file that can be used to replay the operations of an execution that occurred on a node. Replaying those operations shows what the node was doing, which would seem sufficient for finding out whether a node caused integrity violations. The issue with deterministic replay is not the snapshotting/recording of operations; it is that the data written to the replay log may be tampered with by the node itself so that the replay shows ideal results. By faking the results of its operations, the audited computer can make the auditing computer falsely believe that it is running all operations normally. The logging done by these recording programs is directly related to the work needed to detect integrity violations.&lt;br /&gt;
&lt;br /&gt;
==Contribution==&lt;br /&gt;
&lt;br /&gt;
 What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The most useful contribution of the accountable virtual machine (AVM) proposed in this paper is the implementation of the accountable virtual machine monitor (AVMM). It is what allows fault checking of virtual machines in a cloud-computing environment. The AVMM can be broken down into parts: the virtual machine monitor (VMM), the tamper-evident log, and the auditing mechanisms. The VMM is based on the VMM found in VMWare Workstation 6.5.1[[#References |[9]]], the tamper-evident log was adapted from code in PeerReview[[#References |[7]]], and the audit tools were built from scratch. &lt;br /&gt;
&lt;br /&gt;
The accountable virtual machine monitor relies on four assumptions:&lt;br /&gt;
&lt;br /&gt;
1. All transmitted messages are received, if retransmitted sufficiently often.&lt;br /&gt;
&lt;br /&gt;
2. Machines and users have access to a hash function that is pre-image resistant, second pre-image resistant, and collision resistant.&lt;br /&gt;
&lt;br /&gt;
3. All parties have a certified keypair that can be used to sign messages.&lt;br /&gt;
&lt;br /&gt;
4. To audit a log, the user has a reference copy of the VM used.&lt;br /&gt;
The job of the AVMM is to record all incoming and outgoing messages in a tamper-evident log, along with enough information about the execution to enable deterministic replay. &lt;br /&gt;
&lt;br /&gt;
The AVMM must record nondeterministic inputs (such as hardware interrupts). Because such input is asynchronous, the exact timing of each input must also be recorded, so that the inputs can be injected at the same points during replay. Wall-clock time is not accurate enough for this, so the AVMM uses a combination of the instruction pointer, a branch counter, and possibly additional registers. Not all inputs have to be recorded this way: software interrupts, for example, send requests to the AVM that will be issued again during replay.&lt;br /&gt;
&lt;br /&gt;
Two parallel streams appear in the tamper-evident log: message exchanges and nondeterministic inputs. &lt;br /&gt;
It is important for the AVMM to detect inconsistencies between the user&#039;s log and the machine&#039;s log (in case of foul play), so the AVMM simply cross-references messages and inputs during replay, easily detecting any discrepancies.&lt;br /&gt;
&lt;br /&gt;
The AVMM periodically takes snapshots of the AVM&#039;s current state. This facilitates fine-grained audits for the user, but it also increases overhead. The overhead is lowered slightly by making the snapshots incremental (only the state that has changed since the last snapshot is saved). The user can authenticate a snapshot using a hash tree of the state (generated by the AVMM); the AVMM updates the hash tree after each snapshot.&lt;br /&gt;
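To illustrate how a hash tree lets the user authenticate a snapshot, here is a minimal sketch. The chunking of VM state into pages, the function name merkle_root, and the choice of SHA-256 are assumptions for illustration; the paper only requires a collision-resistant hash.&lt;br /&gt;

```python
import hashlib

def H(b: bytes) -> bytes:
    # Stand-in for the collision-resistant hash function the paper assumes.
    return hashlib.sha256(b).digest()

def merkle_root(chunks):
    """Hash each chunk of VM state, then combine hashes pairwise upward
    until a single root remains. The root commits to the whole snapshot:
    changing any chunk changes the root, so a forged snapshot is detectable."""
    level = [H(c) for c in chunks]
    if not level:
        return H(b"")
    while len(level) > 1:
        if len(level) % 2:            # odd count: carry the last hash up
            level.append(level[-1])
        level = [H(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

An incremental snapshot would only rehash the changed chunks and the paths from those chunks to the root; this sketch recomputes everything for simplicity.&lt;br /&gt;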
&lt;br /&gt;
&#039;&#039;&#039;Tamper-Evident Log&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The log is made up of hash-chained entries.&lt;br /&gt;
Each log entry has the form e = (s, t, c, h), where:&lt;br /&gt;
s = a monotonically increasing sequence number&lt;br /&gt;
t = the entry type&lt;br /&gt;
c = the data for that type&lt;br /&gt;
h = a hash value&lt;br /&gt;
&lt;br /&gt;
The hash value is calculated as h = H(h_{i-1} || s || t || H(c)), where&lt;br /&gt;
H() is a hash function, h_{i-1} is the hash of the previous entry,&lt;br /&gt;
and || stands for concatenation.&lt;br /&gt;
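A minimal sketch of this hash chain is given below. The field encodings (8-byte big-endian sequence number, byte-string type tags), the class name TamperEvidentLog, and the all-zero genesis hash are illustrative assumptions; the paper only requires a collision-resistant hash function.&lt;br /&gt;

```python
import hashlib

def H(data: bytes) -> bytes:
    # Stand-in for the collision-resistant hash function the paper assumes.
    return hashlib.sha256(data).digest()

class TamperEvidentLog:
    """Sketch of the hash-chained log: each entry is e = (s, t, c, h) with
    h = H(h_prev || s || t || H(c))."""

    def __init__(self):
        self.entries = []
        self.h_prev = b"\x00" * 32   # genesis value (an assumption)

    def append(self, t: bytes, c: bytes):
        s = len(self.entries)        # monotonically increasing sequence number
        h = H(self.h_prev + s.to_bytes(8, "big") + t + H(c))
        self.entries.append((s, t, c, h))
        self.h_prev = h
        return h

    def verify(self) -> bool:
        """Recompute the chain; editing any entry breaks every later hash."""
        h_prev = b"\x00" * 32
        for s, t, c, h in self.entries:
            if H(h_prev + s.to_bytes(8, "big") + t + H(c)) != h:
                return False
            h_prev = h
        return True
```

Because each hash covers the previous one, an auditor holding only the latest authenticated hash can detect any modification to earlier entries.&lt;br /&gt;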
&lt;br /&gt;
Each message is signed with the sender&#039;s private key; the AVMM logs the message with the signature attached but removes the signature before passing the message to the AVM. To ensure nonrepudiation, an authenticator is attached to each outgoing message.&lt;br /&gt;
&lt;br /&gt;
To detect when a message is dropped, each party sends an acknowledgement for every message it receives. If an acknowledgement is not received, the message is resent a few times; if the user stops receiving messages altogether, the machine is presumed to have failed.&lt;br /&gt;
&lt;br /&gt;
To perform a log check, the user retrieves a pair of authenticators and then challenges the machine to produce the log segment between the two. The log is computationally infeasible to edit without breaking the hash chain; thus, if the log has been tampered with, the hash chain will differ and the user will be notified of the tampering.&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Auditing Mechanism&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
From the VM&#039;s perspective, execution is deterministic once the recorded nondeterministic inputs are injected during replay.&lt;br /&gt;
&lt;br /&gt;
To perform an audit, the user:&lt;br /&gt;
&lt;br /&gt;
1. obtains a segment of the machine&#039;s log and the authenticators&lt;br /&gt;
&lt;br /&gt;
2. downloads a snapshot of the AVM at the beginning of the segment&lt;br /&gt;
&lt;br /&gt;
3. replays the entire segment, starting from the snapshot, to verify that the events in the log correspond to a correct execution of the software.&lt;br /&gt;
&lt;br /&gt;
The user can verify the execution of the software through three checks: verifying the log, verifying the snapshot, and verifying the execution.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify a log segment, the user retrieves from the machine the authenticators whose sequence numbers fall within the range of the segment. The user then downloads the log segment from the machine, starting with the most recent snapshot before the beginning of the segment and ending with the most recent snapshot before the end of the segment, and checks the authenticators for tampering. If this step succeeds, the user can assume the log segment is intact. If the machine is faulty, the segment will be unavailable to download, or the machine may return a corrupted log segment; either outcome can be used to convince a third party of the fault.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify the snapshot, the user obtains a snapshot of the AVM&#039;s state at the beginning of the log segment. The user downloads the snapshot from the machine, and the AVMM recomputes the hash tree. The new hash tree is compared to the hash tree contained in the original log segment. If any discrepancies are detected, the user can use them to convince a third party of the machine&#039;s fault.&lt;br /&gt;
&lt;br /&gt;
In order to verify the execution of a log segment, the user needs three inputs: the log segment, the snapshot, and the public keys of the machine and any users of the machine. The auditing tool performs two checks on the log segment: a syntactic check (is the log well-formed?) and a semantic check (does the information in the log correspond to a correct execution of the machine?).&lt;br /&gt;
&lt;br /&gt;
The syntactic check verifies that all log entries are in the proper format, that the signatures on each message and acknowledgement are valid, that each message was acknowledged, and that the sequence of sent and received messages matches the sequence of messages entering and exiting the AVM.&lt;br /&gt;
&lt;br /&gt;
The semantic check creates a local VM that executes the machine&#039;s log segment; the VM is initialized with a snapshot from the machine if possible. The local VM then runs the log segment, and the resulting data is recorded. The auditing tool checks the log entries, inputs, outputs, and snapshot hashes of the replayed execution against the original log. If any discrepancies are detected, the fault is reported and the discrepancy can be used as evidence of the fault.&lt;br /&gt;
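The core of the semantic check can be sketched as follows. Here audit_segment and reference_step are hypothetical names, and the toy state machine in the usage stands in for deterministic replay of the reference VM image; none of these appear in the paper.&lt;br /&gt;

```python
def audit_segment(snapshot, logged_inputs, logged_outputs, reference_step):
    """Sketch of the semantic check: starting from the machine's snapshot,
    replay each logged input through a reference copy of the software
    (reference_step) and compare each replayed output with the output the
    machine actually logged. Any divergence is evidence of a fault.

    reference_step(state, inp) -> (new_state, output) models one
    deterministic step of the reference software."""
    state = snapshot
    for inp, expected_out in zip(logged_inputs, logged_outputs):
        state, out = reference_step(state, inp)
        if out != expected_out:
            return False   # replay diverged from the log: integrity violation
    return True
```

Because the reference software is deterministic given the logged inputs, an honest machine always passes this check, while a machine that forged even one output in its log is caught.&lt;br /&gt;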
&lt;br /&gt;
Why is it better?&lt;br /&gt;
[To Do]&lt;br /&gt;
&lt;br /&gt;
==Critique==&lt;br /&gt;
&lt;br /&gt;
 What is good and not-so-good about this paper? You may discuss both the style and content; be sure to ground your discussion with specific references. Simple assertions that something is good or bad is not enough - you must explain why.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
// first part of my writing; this is just part1 [[User:Sschnei1|Sschnei1]] 00:35, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
For the reader&#039;s comprehension, it is important for a paper/article/essay to have a good overview and layout. The introduction clearly describes what the reader should expect in the following pages, especially what problems are addressed and how they are solved. &lt;br /&gt;
&lt;br /&gt;
This paper gives multiple examples of the advantages and disadvantages of an AVM. A good example is &amp;quot;cheat detection&amp;quot;. Cheaters use programs that go around the original game code to gain a major advantage over other players. Since an AVM detects cheats generically, it supports a wider range of cheats than most other cheat-detection algorithms. The logs also give the game a replay function: players using an AVM can see how other players played by replaying the game from the player&#039;s log.&lt;br /&gt;
&lt;br /&gt;
The negative side is that the player may suffer from the AVM. Everything is logged and stored on the hard drive, which takes a large amount of space; in the paper&#039;s example it is 148 MB per hour after compression. The logging also reduces the frame rate (fps), and the connection to the AVM increases the ping time to the server. &lt;br /&gt;
&lt;br /&gt;
The test case for the AVM was using it to detect people using cheats in the popular online game Counter-Strike. The authors used “Dell Precision T1500 workstations, with 8 GB of memory and 2.8 GHz Intel Core i7 860 CPUs”[pg 10]. These machines are considerably more powerful than Counter-Strike&#039;s system requirements of “500 MHz processor, 96 MB RAM”[reference?]. A 10-year-old game [reference?] should use few resources on a Dell Precision T1500 workstation; in comparison, newer games consume far more resources than Counter-Strike, leaving less room to run the AVM. A 13% slowdown [pg 12] in a game where you are only getting 30 to 40 fps is quite noticeable. This is detrimental to the gameplay, because 60 fps or more is considered optimal performance.&lt;br /&gt;
&lt;br /&gt;
In the paper the authors state that the AVM only generates an extra 5 ms of latency. While this does not seem like much, the measurement was taken over a LAN with all the computers connected to the same switch [pg 12]. This sample does not accurately represent real-life conditions and therefore lacks external validity: many of these online games are played over the Internet, with participants sometimes not even on the same continent, so the latency overhead of the AVM would certainly increase with the added distance. [networking textbook pg.41-42]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&lt;br /&gt;
 You will almost certainly have to refer to other resources; please cite these resources in the style of citation of the papers assigned (inlined numbered references). Place your bibliographic entries in this section.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
[1] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and&lt;br /&gt;
A. Warfield. Remus: High availability via asynchronous virtual&lt;br /&gt;
machine replication. In Proceedings of the USENIX Symposium&lt;br /&gt;
on Networked Systems Design and Implementation (NSDI), Apr.&lt;br /&gt;
2008.&lt;br /&gt;
&lt;br /&gt;
[2] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[3] G. Hoglund. 4.5 million copies of EULA-compliant spyware.&lt;br /&gt;
http://www.rootkit.com/blog.php?newsid=358.&lt;br /&gt;
&lt;br /&gt;
[4] PunkBuster web site. http://www.evenbalance.com/.&lt;br /&gt;
&lt;br /&gt;
[5] N. E. Baughman, M. Liberatore, and B. N. Levine. Cheat-proof&lt;br /&gt;
playout for centralized and peer-to-peer gaming. IEEE/ACM&lt;br /&gt;
Transactions on Networking (ToN), 15(1):1–13, Feb. 2007.&lt;br /&gt;
&lt;br /&gt;
[6] C. Mönch, G. Grimen, and R. Midtstraum. Protecting online&lt;br /&gt;
games against cheating. In Proceedings of the Workshop on Network&lt;br /&gt;
and Systems Support for Games (NetGames), Oct. 2006.&lt;br /&gt;
&lt;br /&gt;
[7] A. Haeberlen, P. Kuznetsov, and P. Druschel. PeerReview: Practical&lt;br /&gt;
accountability for distributed systems. In Proceedings of&lt;br /&gt;
the ACM Symposium on Operating Systems Principles (SOSP), Oct. 2007.&lt;br /&gt;
&lt;br /&gt;
[8] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[9] VMWare Workstation 6.5.1 web site. http://www.vmware.com/products/workstation/&lt;br /&gt;
&lt;br /&gt;
[10] Counter-Strike http://store.steampowered.com/app/10/&lt;br /&gt;
&lt;br /&gt;
[12] Larry L. Peterson and Bruce S. Davie. Computer Networks: A Systems Approach, 2007&lt;br /&gt;
&lt;br /&gt;
=Discussion=&lt;br /&gt;
 We can use this area to discuss or leave notes on general ideas or whatever you want to write here.&lt;br /&gt;
&lt;br /&gt;
-The current due date posted on the site for this essay is November 25th  --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-I think that since we are given the headings to this article, we can easily choose what parts each member would like to work on, obviously since there are more members than parts, multiple members will have to work on the same parts or can work on all parts, I guess it&#039;s really up to you. I know that most people have a lot of projects coming up so let&#039;s try to get this done asap, or at least bit by bit so it&#039;s not something we have to worry too much about. --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I would like to do the Contribution or Critique. -- [[User:Sschnei1|Sschnei1]] 02:40, 20 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I can either work on Background Concepts, or Research problem. -[[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I&#039;m not sure whether the background concepts should be in point form or a paragraph, and whether it needs to be very long or not, but I shall work on both background concepts and research problem with you Jbaubin. --[[User:Mchou2|Mchou2]] 18:11, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Sounds good, and as I was going to post what I had for the research problem, I saw you posted a big chunk of it. I&#039;ll be out for a while, but tonight I&#039;ll take a serious look at what you wrote and add what I had written. - [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
- Sorry I didn&#039;t write anything yet to Critique. I&#039;m making my notes and will post something tonight or tomorrow. -- [[User:Sschnei1|Sschnei1]] 14:50, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I have started work on the contribution section. I&#039;ll have something up today or tomorrow. --[[User:Hirving|Hirving]] 19:55, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-if anyone has information that they are working on they can just post it up and at least others can look at it and maybe build up stuff on it, and I&#039;m sure everyone is aware of the extension that we got also, but let&#039;s try to finish this in the next few days --[[User:Mchou2|Mchou2]] 20:43, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I agree with finishing it in the next few days. Then we have more time to focus on other courses like 3004. I will post something later tonight. -- [[User:Sschnei1|Sschnei1]] 21:29, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- Just added my contribution section; can someone proofread and sign it before I move it over to the essay? I didn&#039;t do the &amp;quot;why is it better&amp;quot; part because I found the implementation took a lot of writing. For anyone who wants to do that part, I&#039;d suggest comparing AVMs to PunkBuster and/or VAC, and to a cloud computing service (focusing on the auditing). Cheers --[[User:Hirving|Hirving]] 19:44, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I started the &#039;what is better/worse&#039; part in the Critique section. I will add the comparison of AVMs to PunkBuster and/or VAC soon. I personally feel there is not that much to write for the Critique section. -- [[User:Sschnei1|Sschnei1]] 20:39, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Hey, I&#039;ve got a bit to add to your Critique section. It&#039;s mostly expanding on your last paragraph and a bit on how the tests were performed. I&#039;ll post my stuff later tonight; I just need to find some sources for my argument.--[[User:Pcox|Pcox]] 01:06, 25 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5558</id>
		<title>Talk:COMP 3000 Essay 2 2010 Question 4</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5558"/>
		<updated>2010-11-25T05:22:57Z</updated>

		<summary type="html">&lt;p&gt;Pcox: /* Critique */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Group Essay 2 =&lt;br /&gt;
&lt;br /&gt;
Hello Group. Please post your information here. I assume everybody read the email at your connect account. Anyone specific wants to send him the email with the group members inside? If not, I just go ahead tomorrow at about 13:00 and send the email with the group members who wrote their contact information in here. - [[User:Sschnei1|Sschnei1]] 03:25, 15 November 2010 (UTC)&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sebastian Schneider sschnei1@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Matthew Chou mchou2@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Mark Walts mwalts@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Henry Irving hirving@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Jean-Benoit Aubin jbaubin@connect.carleton.ca &lt;br /&gt;
&lt;br /&gt;
Pradhan Nishant npradhan npradhan@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Only Paul Cox didn&#039;t answer the email I sent this morning. &lt;br /&gt;
&lt;br /&gt;
Cox     Paul    pcox&lt;br /&gt;
&lt;br /&gt;
And I just sent an email to the teacher. &lt;br /&gt;
&lt;br /&gt;
--Jean-Benoit&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Paper==&lt;br /&gt;
&lt;br /&gt;
 the paper&#039;s title, authors, and their affiliations. Include a link to the paper and any particularly helpful supplementary information.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Title:&#039;&#039;&#039; Accountable Virtual Machines&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Authors:&#039;&#039;&#039; Andreas Haeberlen, Paarijaat Aditya, Rodrigo Rodrigues, Peter Druschel&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Affiliates:&#039;&#039;&#039;&lt;br /&gt;
University of Pennsylvania, Max Planck Institute for Software Systems (MPI-SWS)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Link to Paper:&#039;&#039;&#039; [http://www.usenix.org/events/osdi10/tech/full_papers/Haeberlen.pdf Accountable Virtual Machines]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Supplementary Information:&#039;&#039;&#039; [http://research.microsoft.com/en-us/people/sriram/druschel.pptx Accountable distributed systems and the accountable cloud] - background of similar AVM implementation for distributed systems.&lt;br /&gt;
&lt;br /&gt;
==Background Concepts==&lt;br /&gt;
&lt;br /&gt;
 Explain briefly the background concepts and ideas that your fellow classmates will need to know first in order to understand your assigned paper.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountable Virtual Machine (AVM)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Deterministic Replay&#039;&#039;&#039;: A machine can record its execution to a file so that the execution can later be replayed, allowing an observer to follow exactly what was happening on the machine. Remus [[#References | [1]]] has contributed a highly efficient snapshotting mechanism that such replays can build on.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountability:&#039;&#039;&#039; Accountability in the context of this paper means that every action performed on the virtual machine is recorded and can be used to verify the correctness of the application. The AVM is responsible for its actions and must answer for them to an auditor. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Remote Fault Detection:&#039;&#039;&#039; Programs like GridCop[[#References | [2]]] can monitor the progress and execution of a remotely executing program by requesting beacon packets. While the remote computer is sending these packets, the receiving/logging computer must be trusted (hardware, software, and OS) so that the reception of packets remains consistent. To detect a fault in a remote system, every packet must arrive safely, and any interruptions during logging must be handled, or the inconsistencies will produce an inaccurate outcome. An AVM, by contrast, does not require trusted hardware and can be used over wide-area networks.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Cheat Detection:&#039;&#039;&#039; Cheating in games, or any specific modification of a program, can be either scanned for[[#References | [3][4]]] or prevented[[#References | [5][6]]] by certain programs. The issue with such scanning and preventative software is that it can only handle the specific cheats or situations it already knows about. An AVM is designed to counter cheats in general.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Integrity Violations:&#039;&#039;&#039; An integrity violation occurs when the observed behaviour of an execution differs from that of a trusted host/reference execution.&lt;br /&gt;
&lt;br /&gt;
- The word &amp;quot;node&amp;quot; is used to refer to a computer or server in order to represent the interactions between one computer and another, or a computer and a server.&lt;br /&gt;
&lt;br /&gt;
==Research problem== &lt;br /&gt;
&lt;br /&gt;
 What is the research problem being addressed by the paper? How does this problem relate to past related work?&lt;br /&gt;
**Possible alternative  for the first part : &lt;br /&gt;
&lt;br /&gt;
The research presented in this paper tries to tackle a problem that has haunted computer scientists for a long time: how can you be sure that the software running on a remote machine is working correctly, or as intended? Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a trust relationship between users and a host. When a node (a user or computer) expects some result or feedback from another node, it would hope that the interaction depends only on the intended software, independently of the particular node. Say that node A interacts with node B running execution exe1, and node A also interacts with node C, which should be running exe1 but has been modified and responds with exe2. The responses of B and C will then differ. Being able to prove beyond doubt that node C has been modified is the purpose of this paper. &lt;br /&gt;
***Let me know what you think about it. I removed the redundant part, and I think made it clearer and more concise. [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
** looks good to me, we&#039;ll put this part into the final essay instead of mine below --[[User:Mchou2|Mchou2]] 20:03, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
/// omit&lt;br /&gt;
&lt;br /&gt;
Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a system of trust between users and a host. These examples require a certain amount of trust in the interactions between one user and another, as well as between a user and a host. When a node (user or computer) expects some result or feedback from another node, it would hope that the interaction with node A is the same as it would be with another node, node B. Say, for example, that node A interacts with node B via execution exe1; when A and B later interact with node C, they would both expect to interact via exe1, but if node C behaves differently and executes exe2, it would be beneficial to be notified of the difference. Some concrete examples: node A plays a game with node B, and the game executed on B is the same as on A; when A plays with node C, however, C executes the same operations as A plus a cheating program. Or: node A buys products from node B&#039;s server, which processes the order and then deletes A&#039;s sensitive information (execution 1); when A buys from node C&#039;s server, the order is processed, but the sensitive information A provided is also rerouted to another server where it can be used without permission. These are only a few examples where the operations of an execution need to be logged and verified. The problem being addressed is to create a procedure by which a node can be held accountable, and to log the operations of an execution so as to provide evidence of faults committed by that node. &lt;br /&gt;
&lt;br /&gt;
////&lt;br /&gt;
&lt;br /&gt;
Previous work on preventing or detecting integrity violations can be separated into different categories. The first is cheat detection: in many games there are cheats that users run to gain benefits not intended by the original game.[[#References |[4]]] These detectors are not dynamic, in the sense that they do not actually detect whether an arbitrary cheat is being used; rather, they check whether a previously catalogued cheating operation is running on the user&#039;s system. For example, if a known cheating program named aimbot.exe can be run in the background of a game such as CounterStrike, and the PunkBuster system on the user&#039;s machine already has aimbot.exe catalogued as a cheating program by its developers, then PunkBuster might notify the current game servers or even prevent the user from playing until the aimbot.exe process is no longer running. &lt;br /&gt;
&lt;br /&gt;
Accountability is another important problem that many have already worked on. The main goal of an accountable system is to be able to determine without a doubt that a node is faulty, and to prove it with solid evidence. It can also be used to defend a node against false accusations. Numerous systems already provide accountability, but they are mostly tied to specific applications, where a point of reference must be used for comparison. For example PeerReview[[#References |[7]]], a system closely related to the authors&#039; work, must be built into the application itself, which makes it less portable and harder to deploy than an AVM. PeerReview verifies inbound and outbound packets and can check whether the software is running as intended. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Another problem related to the paper is remote fault detection in a distributed system: how can we determine whether a remote node is running the code correctly, and whether the machine itself is working as intended? Observing network activity is a common solution, looking at the inbound and outbound traffic of the node. This can reveal how the software is operating, or in the case of an AVM, how the whole virtual machine is working. GridCop[[#References |[8]]], for example, periodically inspects a small number of packets. Another way of detecting faults remotely is to use a trusted node, which can tell immediately if a fault occurs or if a modification is made where it should not have been. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
-Add anything else you would like to add or modify, or leave a note in the discussion section if you want me to take another look or change something. --[[User:Mchou2|Mchou2]] 20:10, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
The problem of logging and auditing the execution of a specific node (computer) depends greatly on prior work on deterministic replay. Deterministic replay programs create a log file that can be used to replay the operations of an execution that occurred on a node. Replaying those operations shows what the node was doing, which would seem sufficient for finding out whether a node caused integrity violations. The problem with deterministic replay is not the concept of snapshotting/recording the operations; it is that the data written to the replay log may be tampered with by the node itself so that the replay shows only the expected results. By faking the results of its operations, the audited computer could make the auditing computer falsely believe that everything ran normally. The logging done by these recording programs is thus directly relevant to the work needed to detect integrity violations.&lt;br /&gt;
&lt;br /&gt;
==Contribution==&lt;br /&gt;
&lt;br /&gt;
 What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The most useful contribution of the accountable virtual machine (AVM) design proposed in this paper is the implementation of the accountable virtual machine monitor (AVMM). It is what allows virtual machines to be checked for faults in a cloud computing environment. The AVMM can be broken down into several parts: the virtual machine monitor (VMM), the tamper-evident log, and the auditing mechanisms. The VMM is based on the VMM found in VMWare Workstation 6.5.1[[#References |[9]]], the tamper-evident log was adapted from code in PeerReview[[#References |[7]]], and the audit tools were built from scratch. &lt;br /&gt;
&lt;br /&gt;
The accountable virtual machine monitor relies on four assumptions:&lt;br /&gt;
&lt;br /&gt;
1. All transmitted messages are received, if retransmitted sufficiently often.&lt;br /&gt;
&lt;br /&gt;
2. Machines and users have access to a hash function that is pre-image resistant, second pre-image resistant, and collision resistant.&lt;br /&gt;
&lt;br /&gt;
3. All parties have a certified keypair that can be used to sign messages.&lt;br /&gt;
&lt;br /&gt;
4. To audit a log, the user has a reference copy of the VM used.&lt;br /&gt;
&lt;br /&gt;
The job of the AVMM is to record all incoming and outgoing messages to a tamper-evident log, along with enough information about the execution to enable deterministic replay. &lt;br /&gt;
&lt;br /&gt;
The AVMM must record nondeterministic inputs such as hardware interrupts. Because such input is asynchronous, its exact timing must be recorded so that it can be injected at the same points during replay. Wall-clock time is not precise enough for this, so the AVMM must use a combination of the instruction pointer, a branch counter, and possibly additional registers. Not all inputs have to be recorded this way: software interrupts, for instance, are requests issued to the AVM that will simply be issued again during replay.&lt;br /&gt;
&lt;br /&gt;
Two parallel streams appear in the tamper-evident log: message exchanges and nondeterministic inputs. &lt;br /&gt;
It is important for the AVMM to detect inconsistencies between the user&#039;s log and the machine&#039;s log (in case of foul play), so the AVMM simply cross-references messages and inputs during replay, easily detecting any discrepancies.&lt;br /&gt;
&lt;br /&gt;
The AVMM periodically takes snapshots of the AVM&#039;s current state. This facilitates fine-grained audits for the user, but it also increases overhead. The overhead is lowered slightly by making the snapshots incremental (only state that has changed since the last snapshot is saved). The user can authenticate a snapshot using a hash tree of the state (generated by the AVMM); the AVMM updates the hash tree after each snapshot.&lt;br /&gt;
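&lt;br /&gt;
The snapshot hash tree can be sketched as follows (a minimal Merkle-tree illustration of our own; the page layout, the choice of SHA-256, and the odd-level duplication rule are assumptions, not details from the paper):&lt;br /&gt;

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(pages):
    """Hash each memory page, then pair-wise hash upward to a single root."""
    level = [h(p) for p in pages]
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

pages = [b"page-0", b"page-1", b"page-2", b"page-3"]
root = merkle_root(pages)

# Changing any page changes the root, so a downloaded snapshot can be
# checked against the root recorded in the tamper-evident log.
tampered = list(pages)
tampered[2] = b"page-2-modified"
assert merkle_root(tampered) != root
```

Because the tree is built bottom-up from page hashes, an incremental snapshot only needs to rehash the changed pages and the path from each of them to the root.&lt;br /&gt;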
&lt;br /&gt;
&#039;&#039;&#039;Tamper-Evident Log&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The log is made up of hash-chained entries.&lt;br /&gt;
Each log entry has the form e_i = (s_i, t_i, c_i, h_i):&lt;br /&gt;
s_i = monotonically increasing sequence number&lt;br /&gt;
t_i = entry type&lt;br /&gt;
c_i = data of that type&lt;br /&gt;
h_i = hash value&lt;br /&gt;
&lt;br /&gt;
The hash value is calculated by: h_i = H(h_{i-1} || s_i || t_i || H(c_i))&lt;br /&gt;
H() is a hash function.&lt;br /&gt;
|| stands for concatenation.&lt;br /&gt;
&lt;br /&gt;
Each message sent is signed with a private key; the AVMM logs the message with the signature attached but removes the signature before passing the message to the AVM. To ensure nonrepudiation, an authenticator is attached to each outgoing message.&lt;br /&gt;
&lt;br /&gt;
To detect when a message is dropped, each party sends an acknowledgement for each message it receives. If an acknowledgement is not received, the message is resent a few times; if the user stops receiving messages altogether, the machine is presumed to have failed.&lt;br /&gt;
&lt;br /&gt;
To perform a log check, the user retrieves a pair of authenticators and then challenges the machine to produce the log segment between the two. The log is computationally infeasible to edit without breaking the hash chain; thus, if the log has been tampered with, the hash chain will not verify and the user will be notified of the tampering.&lt;br /&gt;
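&lt;br /&gt;
A minimal sketch of such a hash chain (our own illustration; the byte encoding, the all-zero seed for h_0, and SHA-256 are assumptions, not the paper&#039;s exact format):&lt;br /&gt;

```python
import hashlib

# Each entry e_i = (s_i, t_i, c_i, h_i) with h_i = H(h_{i-1} || s || t || H(c)).

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def append_entry(log, s: int, t: str, c: bytes):
    prev = log[-1][3] if log else b"\x00" * 32        # assumed all-zero seed
    h = H(prev + s.to_bytes(8, "big") + t.encode() + H(c))
    log.append((s, t, c, h))

def chain_intact(log) -> bool:
    """Recompute every hash; any edited entry breaks the chain."""
    prev = b"\x00" * 32
    for s, t, c, h in log:
        if h != H(prev + s.to_bytes(8, "big") + t.encode() + H(c)):
            return False
        prev = h
    return True

log = []
append_entry(log, 1, "send", b"hello")
append_entry(log, 2, "recv", b"ack")
assert chain_intact(log)

# Retroactively editing an entry without recomputing later hashes is detected.
s, t, c, h = log[0]
log[0] = (s, t, b"tampered", h)
assert not chain_intact(log)
```

Because every h_i commits to h_{i-1}, rewriting any entry forces the machine to recompute all later hashes, which the signed authenticators it already released would then contradict.&lt;br /&gt;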
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Auditing Mechanism&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
From the VMM&#039;s perspective, everything is deterministic.&lt;br /&gt;
&lt;br /&gt;
To perform an audit, the user:&lt;br /&gt;
&lt;br /&gt;
1. obtains a segment of the machine&#039;s log and the authenticators&lt;br /&gt;
&lt;br /&gt;
2. downloads a snapshot of the AVM at the beginning of the segment&lt;br /&gt;
&lt;br /&gt;
3. replays the entire segment, starting from the snapshot, to verify that the events in the log correspond to a correct execution of the software.&lt;br /&gt;
&lt;br /&gt;
The user can verify the execution of the software through three checks: verifying the log, the snapshot, and the execution itself.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify a log segment, the user retrieves from the machine the authenticators whose sequence numbers fall in the range of the segment. The user then downloads the log segment itself, starting with the most recent snapshot before the beginning of the segment and ending with the most recent snapshot before its end, and checks the authenticators for tampering. If this step succeeds, the user can assume the log segment is genuine. If the machine is faulty, the segment will be unavailable for download, or a corrupted log segment will be returned; either outcome can be used to convince a third party of the fault.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify the snapshot, the user obtains a snapshot of the AVM&#039;s state at the beginning of the log segment. The user downloads the snapshot from the machine, and the AVMM recomputes the hash tree. The new hash tree is compared to the hash tree contained in the original log segment; if any discrepancies are detected, the user can use them to convince a third party of the machine&#039;s fault.&lt;br /&gt;
&lt;br /&gt;
To verify the execution of a log segment, the user needs three inputs: the log segment, the snapshot, and the public keys of the machine and any users of the machine. The auditing tool performs two checks on the log segment: a syntactic check (is the log well-formed?) and a semantic check (does the information in the log correspond to a correct execution of the machine?).&lt;br /&gt;
&lt;br /&gt;
The syntactic check verifies that all log entries are in the proper format, that the signatures on each message and acknowledgement are valid, that each message was acknowledged, and that the sequence of sent and received messages matches the sequence of messages entering and exiting the AVM.&lt;br /&gt;
&lt;br /&gt;
The semantic check creates a local VM that will execute the machine&#039;s log segment; where possible, the VM is initialized with a snapshot from the machine. The local VM then runs the log segment, and the resulting data is recorded. The auditing tool checks the log segments, inputs, outputs, and snapshot hashes of the replayed execution against the original log. Any discrepancy is reported as a fault and can be used as evidence of that fault.&lt;br /&gt;
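&lt;br /&gt;
The essence of the semantic check can be sketched like this (a toy illustration of our own; the real check replays a whole VM from a snapshot, not a single stand-in function):&lt;br /&gt;

```python
# Replay the logged inputs through a reference copy of the software and
# compare the outputs it produces against the outputs recorded in the log.

def reference_software(x):
    return x * 2                      # stand-in for the known-good behaviour

def semantic_check(log_segment):
    """log_segment: list of (input, logged_output) pairs.
    Returns the indices of entries whose logged output does not match a
    faithful replay -- evidence of an integrity violation."""
    return [i for i, (x, out) in enumerate(log_segment)
            if reference_software(x) != out]

honest = [(1, 2), (4, 8)]
cheating = [(1, 2), (4, 9)]           # second output was falsified
assert semantic_check(honest) == []
assert semantic_check(cheating) == [1]
```

Because the reference copy is deterministic given the logged inputs, any behaviour the machine faked (a cheat, a modified binary) shows up as a mismatch at a specific log position.&lt;br /&gt;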
&lt;br /&gt;
Why is it better?&lt;br /&gt;
[To Do]&lt;br /&gt;
&lt;br /&gt;
==Critique==&lt;br /&gt;
&lt;br /&gt;
 What is good and not-so-good about this paper? You may discuss both the style and content; be sure to ground your discussion with specific references. Simple assertions that something is good or bad is not enough - you must explain why.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
// first part of my writing; this is just part1 [[User:Sschnei1|Sschnei1]] 00:35, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
For the reader&#039;s comprehension, it is important for a paper/article/essay to have a good overview and layout. The introduction clearly describes what the reader should expect in the following pages, especially which problems are addressed and how they are solved. &lt;br /&gt;
&lt;br /&gt;
This paper gives multiple examples of advantages and disadvantages of an AVM. A good example is &amp;quot;Cheat Detection&amp;quot;. Cheaters use programs that work around the original game code to gain a major advantage over other players. Since an AVM detects cheats generically, it covers a wider range of cheats than most other cheat detection algorithms. The logs also give the game a replay function: players using an AVM can see how other players play by replaying the game from the player&#039;s log.&lt;br /&gt;
&lt;br /&gt;
The negative side is that the player may suffer under the AVM. Everything is logged and stored on the hard drive, which takes a large amount of space; in the example in the paper it is 148 MB per hour after compression. This reduces the frame rate, and the connection through the AVM additionally increases the ping time to the server. &lt;br /&gt;
&lt;br /&gt;
The test case for the AVM was using it to detect cheats in the popular online game Counter-Strike. The authors used &amp;quot;Dell Precision T1500 workstations, with 8 GB of memory and 2.8 GHz Intel Core i7 860 CPUs&amp;quot;[pg 10]. These machines are considerably more powerful than the system requirements of Counter-Strike, which are a &amp;quot;500 MHz processor, 96 MB RAM&amp;quot;[reference?]. A 10-year-old game [reference?] should use few resources on a Dell Precision T1500 workstation. In comparison, newer games consume far more resources than Counter-Strike, leaving less room to run the AVM. A 13% slowdown [pg 12] in a game where you are only getting 30 to 40 fps is quite noticeable. This is very detrimental to the game play, because 60 fps or more is considered optimal performance.&lt;br /&gt;
&lt;br /&gt;
In the paper the authors state that the AVM will only generate an extra 5 ms of latency. While this does not seem like a lot, the measurement was taken over a LAN with all the computers connected to the same switch [pg 12]. This sample does not accurately represent real-life situations and therefore lacks external validity: many of these online games are played over the Internet, with the participants sometimes not even on the same continent, so the latency overhead of the AVM would certainly increase with the added distance. [networking textbook pg.41-42]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&lt;br /&gt;
 You will almost certainly have to refer to other resources; please cite these resources in the style of citation of the papers assigned (inlined numbered references). Place your bibliographic entries in this section.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
[1] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and&lt;br /&gt;
A. Warfield. Remus: High availability via asynchronous virtual&lt;br /&gt;
machine replication. In Proceedings of the USENIX Symposium&lt;br /&gt;
on Networked Systems Design and Implementation (NSDI), Apr.&lt;br /&gt;
2008.&lt;br /&gt;
&lt;br /&gt;
[2] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[3] G. Hoglund. 4.5 million copies of EULA-compliant spyware.&lt;br /&gt;
http://www.rootkit.com/blog.php?newsid=358.&lt;br /&gt;
&lt;br /&gt;
[4] PunkBuster web site. http://www.evenbalance.com/.&lt;br /&gt;
&lt;br /&gt;
[5] N. E. Baughman, M. Liberatore, and B. N. Levine. Cheat-proof&lt;br /&gt;
playout for centralized and peer-to-peer gaming. IEEE/ACM&lt;br /&gt;
Transactions on Networking (ToN), 15(1):1–13, Feb. 2007.&lt;br /&gt;
&lt;br /&gt;
[6] C. Mönch, G. Grimen, and R. Midtstraum. Protecting online&lt;br /&gt;
games against cheating. In Proceedings of the Workshop on Network&lt;br /&gt;
and Systems Support for Games (NetGames), Oct. 2006.&lt;br /&gt;
&lt;br /&gt;
[7] A. Haeberlen, P. Kuznetsov, and P. Druschel. PeerReview: Practical&lt;br /&gt;
accountability for distributed systems. In Proceedings of&lt;br /&gt;
the ACM Symposium on Operating Systems Principles (SOSP),Oct. 2007.&lt;br /&gt;
&lt;br /&gt;
[8] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[9] VMWare Workstation 6.5.1 web site. http://www.vmware.com/products/workstation/&lt;br /&gt;
&lt;br /&gt;
=Discussion=&lt;br /&gt;
 We can use this area to discuss or leave notes on general ideas or whatever you want to write here.&lt;br /&gt;
&lt;br /&gt;
-The current due date posted on the site for this essay is November 25th  --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-I think that since we are given the headings for this article, we can easily choose which parts each member would like to work on. Since there are more members than parts, multiple members will have to work on the same parts, or can work on all parts; it&#039;s really up to you. I know most people have a lot of projects coming up, so let&#039;s try to get this done ASAP, or at least bit by bit so it&#039;s not something we have to worry too much about. --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I would like to do the Contribution or Critique. -- [[User:Sschnei1|Sschnei1]] 02:40, 20 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I can either work on Background Concepts, or Research problem. -[[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I&#039;m not sure whether the background concepts should be in point form or a paragraph, and whether it needs to be very long or not, but I shall work on both background concepts and research problem with you Jbaubin. --[[User:Mchou2|Mchou2]] 18:11, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Sounds good, and as I was going to post what I had for the research problem, I saw you had just posted a big chunk of it. I&#039;ll be out for a while, but tonight I&#039;ll take a serious look at what you wrote and add what I had written. - [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
- Sorry I didn&#039;t write anything yet to Critique. I&#039;m making my notes and will post something tonight or tomorrow. -- [[User:Sschnei1|Sschnei1]] 14:50, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I have started work on the contribution section. I&#039;ll have something up today or tomorrow. --[[User:Hirving|Hirving]] 19:55, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-If anyone has information they are working on, they can just post it up so that others can look at it and maybe build on it. I&#039;m sure everyone is aware of the extension we got, but let&#039;s try to finish this in the next few days --[[User:Mchou2|Mchou2]] 20:43, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I agree with finishing it in the next few days. Then we have more time to focus on other courses like 3004. I will post something later tonight. -- [[User:Sschnei1|Sschnei1]] 21:29, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- Just added my contribution section; can someone proofread and sign it before I move it over to the essay? I didn&#039;t do the &amp;quot;why is it better&amp;quot; part because I found the implementation took a lot of writing. For anyone who wants to do that part, I&#039;d suggest comparing AVMs to PunkBuster and/or VAC, and a cloud computing service (focusing on the auditing). Cheers --[[User:Hirving|Hirving]] 19:44, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I started that what is better/worse part in the Critique section. I will add the comparison with AVMs to Punkbuster and/or VAC soon. I personally feel like there is not that much to write for the Critique section. -- [[User:Sschnei1|Sschnei1]] 20:39, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Hey, I&#039;ve got a bit to add to your Critique section. It&#039;s mostly expanding on your last paragraph and a bit on how the tests were performed. I&#039;ll post my stuff later tonight; I just need to find some sources for my argument.--[[User:Pcox|Pcox]] 01:06, 25 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5551</id>
		<title>Talk:COMP 3000 Essay 2 2010 Question 4</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_2_2010_Question_4&amp;diff=5551"/>
		<updated>2010-11-25T01:06:33Z</updated>

		<summary type="html">&lt;p&gt;Pcox: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Group Essay 2 =&lt;br /&gt;
&lt;br /&gt;
Hello Group. Please post your information here. I assume everybody read the email at your connect account. Anyone specific wants to send him the email with the group members inside? If not, I just go ahead tomorrow at about 13:00 and send the email with the group members who wrote their contact information in here. - [[User:Sschnei1|Sschnei1]] 03:25, 15 November 2010 (UTC)&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sebastian Schneider sschnei1@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Matthew Chou mchou2@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Mark Walts mwalts@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Henry Irving hirving@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
Jean-Benoit Aubin jbaubin@connect.carleton.ca &lt;br /&gt;
&lt;br /&gt;
Pradhan Nishant npradhan npradhan@connect.carleton.ca&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Only Paul Cox didn&#039;t answer the email I sent this morning. &lt;br /&gt;
&lt;br /&gt;
Cox     Paul    pcox&lt;br /&gt;
&lt;br /&gt;
And I just sent an email to the teacher. &lt;br /&gt;
&lt;br /&gt;
--Jean-Benoit&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Paper==&lt;br /&gt;
&lt;br /&gt;
 the paper&#039;s title, authors, and their affiliations. Include a link to the paper and any particularly helpful supplementary information.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Title:&#039;&#039;&#039; Accountable Virtual Machines&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Authors:&#039;&#039;&#039; Andreas Haeberlen, Paarijaat Aditya, Rodrigo Rodrigues, Peter Druschel&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Affiliates:&#039;&#039;&#039;&lt;br /&gt;
University of Pennsylvania, Max Planck Institute for Software Systems (MPI-SWS)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Link to Paper:&#039;&#039;&#039; [http://www.usenix.org/events/osdi10/tech/full_papers/Haeberlen.pdf Accountable Virtual Machines]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Supplementary Information:&#039;&#039;&#039; [http://research.microsoft.com/en-us/people/sriram/druschel.pptx Accountable distributed systems and the accountable cloud] - background of similar AVM implementation for distributed systems.&lt;br /&gt;
&lt;br /&gt;
==Background Concepts==&lt;br /&gt;
&lt;br /&gt;
 Explain briefly the background concepts and ideas that your fellow classmates will need to know first in order to understand your assigned paper.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountable Virtual Machine (AVM)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Deterministic Replay&#039;&#039;&#039;: A machine can record the nondeterministic events of an execution to a log file so that the execution can later be replayed exactly, allowing an observer to follow what was happening on the machine. Remus [[#References | [1]]] contributed a highly efficient snapshotting mechanism for such replays.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Accountability:&#039;&#039;&#039; Accountability in the context of this paper means that every action taken on the virtual machine is recorded and can be used to verify the correctness of the application. The AVM is responsible for its actions and must answer for them to an auditor. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Remote Fault Detection:&#039;&#039;&#039; Programs like GridCop[[#References | [2]]] can monitor the progress and execution of a remotely executing program by requesting beacon packets. While the remote computer is sending these packets, the receiving/logging computer must be trusted (hardware, software, and OS) so that the reception of packets remains consistent. To detect a fault in a remote system, every packet must arrive safely, and any interruptions during logging must be handled, or the inconsistencies will produce an inaccurate outcome. An AVM, by contrast, does not require trusted hardware and can be used over wide-area networks.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Cheat Detection:&#039;&#039;&#039; Cheating in games, or any specific modification of a program, can be either scanned for[[#References | [3][4]]] or prevented[[#References | [5][6]]] by certain programs. The issue with such scanning and preventative software is that it can only handle the specific cheats or situations it already knows about. An AVM is designed to counter cheats in general.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Integrity Violations:&#039;&#039;&#039; An integrity violation occurs when the observed behaviour of an execution differs from that of a trusted host/reference execution.&lt;br /&gt;
&lt;br /&gt;
- The word &amp;quot;node&amp;quot; is used to refer to a computer or server in order to represent the interactions between one computer and another, or a computer and a server.&lt;br /&gt;
&lt;br /&gt;
==Research problem== &lt;br /&gt;
&lt;br /&gt;
 What is the research problem being addressed by the paper? How does this problem relate to past related work?&lt;br /&gt;
**Possible alternative  for the first part : &lt;br /&gt;
&lt;br /&gt;
The research presented in this paper tries to tackle a problem that has haunted computer scientists for a long time: how can you be sure that the software running on a remote machine is working correctly, or as intended? Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a trust relationship between users and a host. When a node (a user or computer) expects some result or feedback from another node, it would hope that the interaction depends only on the intended software, independently of the particular node. Say that node A interacts with node B running execution exe1, and node A also interacts with node C, which should be running exe1 but has been modified and responds with exe2. The responses of B and C will then differ. Being able to prove beyond doubt that node C has been modified is the purpose of this paper. &lt;br /&gt;
***Let me know what you think about it. I removed the redundant part, and I think made it clearer and more concise. [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
** looks good to me, we&#039;ll put this part into the final essay instead of mine below --[[User:Mchou2|Mchou2]] 20:03, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
/// omit&lt;br /&gt;
&lt;br /&gt;
Cloud computing, online multi-player games, and other online services such as auctions are only a few examples that rely on a system of trust between users and a host. These services require a certain amount of trust between one user and another, as well as between a user and the host. When a node (user or computer) expects some result or feedback from another node, it would hope that an interaction done with node A is the same as it would be with another node, node B. Say, for example, that node A interacts with node B via execution exe1; when nodes A and B interact with node C, they would both expect to interact with execution exe1, but if node C behaves differently and executes exe2, it would be beneficial to be notified of this difference. Some concrete examples: node A is playing a game with node B, and the game executed on node B is the same as on A; when node A plays with node C, node C executes the same operations as node A plus a cheating program. Or: when node A buys products from node B&#039;s server, the server processes the order and then deletes node A&#039;s sensitive information (execution 1); when node A buys from node C&#039;s server, the order is processed, but node A&#039;s sensitive information is also rerouted to another server where it can be used without permission. These are only a few cases where the operations in an execution need to be logged and verified. The problem being addressed is to create a procedure by which a node can be held accountable, logging the operations in an execution to provide evidence of faults committed by that node. &lt;br /&gt;
&lt;br /&gt;
////&lt;br /&gt;
&lt;br /&gt;
Previous work on preventing or detecting integrity violations can be separated into different categories. The first is cheat detection: in many games there are cheats that users run to gain benefits not intended by the original game.[[#References |[4]]] These detectors are not dynamic, in the sense that they do not actually detect whether a cheat is being used; rather, they check whether a cheating operation they have previously catalogued is running on the user&#039;s system. For example, if a known cheating program named aimbot.exe can be run in the background of a game such as CounterStrike, and the PunkBuster system installed on the user&#039;s machine already has aimbot.exe catalogued as a cheating program, PunkBuster might notify the current game servers or even prevent the user from playing until the aimbot.exe process is no longer running. &lt;br /&gt;
&lt;br /&gt;
Accountability is another important problem that many have already worked on. The main goal of an accountable system is to be able to determine, without a doubt, that a node is faulty, and to prove it with solid evidence. It can also be used to defend a node against false accusations. Numerous systems already provide accountability, but they are mostly tied to specific applications, where a point of reference must be used for comparison. For example, PeerReview[[#References |[7]]], a system closely related to the research team&#039;s work, must be integrated into the application itself, which makes it less portable and harder to deploy than an AVM. PeerReview verifies the inbound and outbound packets to determine whether the software is running as intended. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Another problem related to the paper is remote fault detection in a distributed system: how can we determine whether a remote node is running the code correctly, or whether the machine itself is working as intended? Monitoring network activity is a common solution, since the inbound and outbound traffic of a node reveals how the software is operating, or, in the case of an AVM, how the whole virtual machine is behaving. Gridcop[[#References |[8]]], for example, periodically inspects a small number of packets. Another way of detecting faults remotely is to use a trusted node, which can tell immediately when a fault occurs or a modification is made where it should not have been. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
-and anything else you would like to add or modify; or leave a note in the discussion section if you want me to take another look or change something. --[[User:Mchou2|Mchou2]] 20:10, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
The problem of logging and auditing the execution of a specific node (computer) depends greatly on prior work on deterministic replay. Deterministic replay programs create a log file that can be used to replay the operations of an execution that occurred on a node. Replaying those operations shows what the node was doing, which would seem sufficient for finding out whether a node was causing integrity violations. The issue with deterministic replay is not the concept of snapshotting/recording the operations; it is that the data written into the replay log may be tampered with by the node itself so that it produces favourable results on replay. By faking the results of its operations, the node can make the auditing computer falsely believe that it is running all operations normally. The logging performed by these recording programs is directly related to the work needed to detect integrity violations.&lt;br /&gt;
&lt;br /&gt;
==Contribution==&lt;br /&gt;
&lt;br /&gt;
 What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The most useful contribution of the accountable virtual machine (AVM) proposed in the paper is the implementation of the accountable virtual machine monitor (AVMM). It is what allows for fault checking of virtual machines in a cloud computing environment. The AVMM can be broken down into three parts: the virtual machine monitor (VMM), the tamper-evident log, and the auditing mechanisms. The VMM is based on the VMM found in VMWare Workstation 6.5.1[[#References |[9]]], the tamper-evident log was adapted from code in PeerReview[[#References |[7]]], and the audit tools were built from scratch. &lt;br /&gt;
&lt;br /&gt;
The accountable virtual machine monitor relies on four assumptions:&lt;br /&gt;
&lt;br /&gt;
1. All transmitted messages are received, if retransmitted sufficiently often.&lt;br /&gt;
&lt;br /&gt;
2. Machines and users have access to a hash function that is pre-image resistant, second pre-image resistant, and collision resistant.&lt;br /&gt;
&lt;br /&gt;
3. All parties have a certified keypair that can be used to sign messages.&lt;br /&gt;
&lt;br /&gt;
4. To audit a log, the user has a reference copy of the VM used.&lt;br /&gt;
&lt;br /&gt;
The job of the AVMM is to record all incoming and outgoing messages in a tamper-evident log, along with enough information about the execution to enable deterministic replay. &lt;br /&gt;
&lt;br /&gt;
The AVMM must record nondeterministic inputs (such as hardware interrupts). Because such input is asynchronous, its exact timing must be recorded so that the inputs can be injected at the same points during replay. Wall-clock time is not accurate enough for this, so the AVMM must use a combination of the instruction pointer, a branch counter, and possibly additional registers. Not all inputs have to be recorded this way; software interrupts, for example, are requests sent to the AVM that will simply be issued again during replay.     &lt;br /&gt;
&lt;br /&gt;
Two parallel streams appear in the tamper-evident log: message exchanges and nondeterministic inputs. &lt;br /&gt;
It is important for the AVMM to detect inconsistencies between the user&#039;s log and the machine&#039;s log (in case of foul play), so the AVMM cross-references messages and inputs during replay, easily detecting any discrepancies.&lt;br /&gt;
&lt;br /&gt;
The AVMM periodically takes snapshots of the AVM&#039;s current state. This facilitates fine-grained audits for the user, but it also increases overhead. The overhead is lowered slightly by making the snapshots incremental (only the state that has changed since the last snapshot is saved). The user can authenticate a snapshot using a hash tree of the state (generated by the AVMM); the AVMM updates the hash tree after each snapshot.  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Tamper-Evident Log&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The log is made up of hash-chained entries.&lt;br /&gt;
Each log entry has the form e = (s, t, c, h), where:&lt;br /&gt;
s = monotonically increasing sequence number&lt;br /&gt;
t = entry type&lt;br /&gt;
c = data associated with that type&lt;br /&gt;
h = hash value&lt;br /&gt;
&lt;br /&gt;
The hash value is calculated as h = H(h&lt;sub&gt;i-1&lt;/sub&gt; || s || t || H(c)), where H() is a hash function, || denotes concatenation, and h&lt;sub&gt;i-1&lt;/sub&gt; is the hash value of the previous entry.&lt;br /&gt;
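As a minimal sketch of this hash chain in Python (assuming SHA-256 as H, an 8-byte big-endian encoding for s, and an all-zero genesis hash; the paper does not specify these details):

```python
import hashlib

def H(data: bytes) -> bytes:
    """Stand-in for the hash function H (SHA-256 here, an assumption)."""
    return hashlib.sha256(data).digest()

def append_entry(log, s: int, t: bytes, c: bytes):
    """Append entry e = (s, t, c, h), where h = H(h_prev || s || t || H(c))."""
    h_prev = log[-1][3] if log else b"\x00" * 32   # assumed genesis value
    h = H(h_prev + s.to_bytes(8, "big") + t + H(c))
    log.append((s, t, c, h))

def chain_intact(log) -> bool:
    """Recompute every hash; any edit to s, t, or c breaks the chain."""
    h_prev = b"\x00" * 32
    for s, t, c, h in log:
        if h != H(h_prev + s.to_bytes(8, "big") + t + H(c)):
            return False
        h_prev = h
    return True
```

Any edit to a logged entry changes its recomputed hash and breaks every subsequent link in the chain, which is exactly what makes the log tamper-evident.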
&lt;br /&gt;
Each message sent is signed with a private key; the AVMM logs the message with the signature attached but removes the signature before passing the message to the AVM. To ensure nonrepudiation, an authenticator is attached to each outgoing message.&lt;br /&gt;
&lt;br /&gt;
To detect when a message is dropped, each party sends an acknowledgement for each message it receives. If an acknowledgement is not received, the message is resent a few times; if the user stops receiving messages entirely, the machine is presumed to have failed.&lt;br /&gt;
&lt;br /&gt;
To perform a log check, the user retrieves a pair of authenticators and challenges the machine to produce the log segment between the two. The log is computationally infeasible to edit without breaking the hash chain; if the log has been tampered with, the hash chain will not match and the user will be notified of the tampering.&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Auditing Mechanism&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
From the VMM&#039;s perspective, all execution is deterministic.&lt;br /&gt;
&lt;br /&gt;
To perform an audit, the user:&lt;br /&gt;
&lt;br /&gt;
1. obtains a segment of the machine&#039;s log and the authenticators&lt;br /&gt;
&lt;br /&gt;
2. downloads a snapshot of the AVM at the beginning of the segment&lt;br /&gt;
&lt;br /&gt;
3. replays the entire segment, starting from the snapshot, to verify that the events in the log represent a correct execution of the software.&lt;br /&gt;
&lt;br /&gt;
The user can verify the execution of the software through three checks: verifying the log, the snapshot, and the execution.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify a log segment, the user retrieves from the machine the authenticators whose sequence numbers fall within the range of the segment. The user then downloads the log segment itself, starting with the most recent snapshot before the beginning of the segment and ending with the most recent snapshot before its end, and checks the authenticators for tampering. If this step succeeds, the user can assume the log segment is genuine. If the machine is faulty, the segment will be unavailable for download, or the machine may return a corrupted log segment; either outcome can be used to convince a third party of the fault.&lt;br /&gt;
&lt;br /&gt;
When the user wants to verify a snapshot, the user obtains a snapshot of the AVM&#039;s state at the beginning of the log segment. The user downloads the snapshot from the machine, and the AVMM recomputes the hash tree. The new hash tree is compared to the hash tree contained in the original log segment. If any discrepancies are detected, the user can use them to convince a third party of the machine&#039;s fault.&lt;br /&gt;
&lt;br /&gt;
In order to verify the execution of a log segment, the user needs three inputs: the log segment, the snapshot, and the public keys of the machine and any users of the machine. The auditing tool performs two checks on the log segment: a syntactic check (is the log well-formed?) and a semantic check (does the information in the log correspond to a correct execution of the machine?).&lt;br /&gt;
&lt;br /&gt;
The syntactic check verifies that all log entries are in the proper format, that the signatures on each message and acknowledgement are valid, that each message was acknowledged, and that the sequence of sent and received messages matches the sequence of messages entering and exiting the AVM.&lt;br /&gt;
&lt;br /&gt;
The semantic check creates a local VM that replays the machine&#039;s log segment; the VM is initialized with a snapshot from the machine where possible. The local VM then runs the log segment and its behaviour is recorded. The auditing tool checks the log segments, inputs, outputs, and snapshot hashes of the replayed execution against the original log. Any discrepancy is reported and can be used as evidence of a fault.&lt;br /&gt;
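As a rough illustration only (this is not the actual audit tool; the toy VM and the (kind, payload) log format are invented for the example), the semantic check can be thought of as replaying logged inputs through a trusted reference copy and comparing outputs:

```python
class EchoVM:
    """Toy deterministic 'VM': correct behaviour is to echo each input."""
    def __init__(self):
        self.pending = []
    def inject(self, payload):
        self.pending.append(payload)
    def next_output(self):
        return self.pending.pop(0)

def semantic_check(reference_vm, log_entries):
    """Replay logged inputs through a trusted reference copy and compare
    its outputs against the outputs recorded in the log.

    log_entries: sequence of (kind, payload) pairs, kind in {'input', 'output'}.
    Returns a list of (index, expected, logged) discrepancies; an empty
    list means the replayed execution matches the log.
    """
    discrepancies = []
    for i, (kind, payload) in enumerate(log_entries):
        if kind == "input":
            reference_vm.inject(payload)           # re-issue the logged input
        elif kind == "output":
            expected = reference_vm.next_output()  # what a correct VM emits
            if expected != payload:
                discrepancies.append((i, expected, payload))
    return discrepancies
```

A non-empty result plays the role of the replay mismatch described above: concrete evidence that the machine did not execute the software correctly.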
&lt;br /&gt;
Why is it better?&lt;br /&gt;
[To Do]&lt;br /&gt;
&lt;br /&gt;
==Critique==&lt;br /&gt;
&lt;br /&gt;
 What is good and not-so-good about this paper? You may discuss both the style and content; be sure to ground your discussion with specific references. Simple assertions that something is good or bad is not enough - you must explain why.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
// first part of my writing; this is just part1 [[User:Sschnei1|Sschnei1]] 00:35, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
For the reader&#039;s comprehension, it is important for a paper/article/essay to have a good overview and layout. The introduction clearly describes what the reader should expect in the following pages, especially what problems are addressed and how they are solved. &lt;br /&gt;
&lt;br /&gt;
This paper gives multiple examples of advantages and disadvantages of an AVM. A good example is &amp;quot;Cheat Detection&amp;quot;. Cheaters use programs that go around the original game code to gain a major advantage over other players. Since an AVM detects cheats generically, it covers a wider range of cheats than most other cheat-detection approaches. The logs also give the game a replay function: players using an AVM can see how other players play by replaying the game from the other player&#039;s log.&lt;br /&gt;
&lt;br /&gt;
The negative side is that the player may suffer under the AVM. Everything is logged and stored on the hard drive, which takes a large amount of space; in the paper&#039;s example, 148 MB per hour after compression. The logging also reduces the frame rate, and the connection through the AVM increases the ping time to the server.&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&lt;br /&gt;
 You will almost certainly have to refer to other resources; please cite these resources in the style of citation of the papers assigned (inlined numbered references). Place your bibliographic entries in this section.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
[1] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and&lt;br /&gt;
A. Warfield. Remus: High availability via asynchronous virtual&lt;br /&gt;
machine replication. In Proceedings of the USENIX Symposium&lt;br /&gt;
on Networked Systems Design and Implementation (NSDI), Apr.&lt;br /&gt;
2008.&lt;br /&gt;
&lt;br /&gt;
[2] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[3] G. Hoglund. 4.5 million copies of EULA-compliant spyware.&lt;br /&gt;
http://www.rootkit.com/blog.php?newsid=358.&lt;br /&gt;
&lt;br /&gt;
[4] PunkBuster web site. http://www.evenbalance.com/.&lt;br /&gt;
&lt;br /&gt;
[5] N. E. Baughman, M. Liberatore, and B. N. Levine. Cheat-proof&lt;br /&gt;
playout for centralized and peer-to-peer gaming. IEEE/ACM&lt;br /&gt;
Transactions on Networking (ToN), 15(1):1–13, Feb. 2007.&lt;br /&gt;
&lt;br /&gt;
[6] C. Mönch, G. Grimen, and R. Midtstraum. Protecting online&lt;br /&gt;
games against cheating. In Proceedings of the Workshop on Network&lt;br /&gt;
and Systems Support for Games (NetGames), Oct. 2006.&lt;br /&gt;
&lt;br /&gt;
[7] A. Haeberlen, P. Kuznetsov, and P. Druschel. PeerReview: Practical&lt;br /&gt;
accountability for distributed systems. In Proceedings of&lt;br /&gt;
the ACM Symposium on Operating Systems Principles (SOSP),Oct. 2007.&lt;br /&gt;
&lt;br /&gt;
[8] S. Yang, A. R. Butt, Y. C. Hu, and S. P. Midkiff. Trust but&lt;br /&gt;
verify: Monitoring remotely executing programs for progress&lt;br /&gt;
and correctness. In Proceedings of the ACM SIGPLAN Annual&lt;br /&gt;
Symposium on Principles and Practice of Parallel Programming&lt;br /&gt;
(PPoPP), June 2005.&lt;br /&gt;
&lt;br /&gt;
[9] VMWare Workstation 6.5.1 web site. http://www.vmware.com/products/workstation/&lt;br /&gt;
&lt;br /&gt;
=Discussion=&lt;br /&gt;
 We can use this area to discuss or leave notes on general ideas or whatever you want to write here.&lt;br /&gt;
&lt;br /&gt;
-The current due date posted on the site for this essay is November 25th  --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-I think that since we are given the headings to this article, we can easily choose what parts each member would like to work on, obviously since there are more members than parts, multiple members will have to work on the same parts or can work on all parts, I guess it&#039;s really up to you. I know that most people have a lot of projects coming up so let&#039;s try to get this done asap, or at least bit by bit so it&#039;s not something we have to worry too much about. --[[User:Mchou2|Mchou2]] 05:18, 19 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I would like to do the Contribution or Critique. -- [[User:Sschnei1|Sschnei1]] 02:40, 20 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I can either work on Background Concepts, or Research problem. -[[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
- I&#039;m not sure whether the background concepts should be in point form or a paragraph, and whether it needs to be very long or not, but I shall work on both background concepts and research problem with you Jbaubin. --[[User:Mchou2|Mchou2]] 18:11, 21 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Sounds good, and As i was going to post what I had for research problem, I just saw you posted a big chunk of it. I&#039;ll be out for a while, but tonight I&#039;ll take a serious look at what you write and add what I had written. - [[User:Jbaubin|Jbaubin]]&lt;br /&gt;
&lt;br /&gt;
- Sorry I didn&#039;t write anything yet to Critique. I&#039;m making my notes and will post something tonight or tomorrow. -- [[User:Sschnei1|Sschnei1]] 14:50, 22 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I have started work on the contribution section. I&#039;ll have something up today or tomorrow. --[[User:Hirving|Hirving]] 19:55, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-if anyone has information that they are working on they can just post it up and at least others can look at it and maybe build up stuff on it, and I&#039;m sure everyone is aware of the extension that we got also, but let&#039;s try to finish this in the next few days --[[User:Mchou2|Mchou2]] 20:43, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I agree with finishing it in the next few days. Then we have more time to focus on other courses like 3004. I will post something later that night. -- [[User:Sschnei1|Sschnei1]] 21:29, 23 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- Just added my contribution section, can someone proof read and sign it before I move it over to the essay. I didn&#039;t do the &amp;quot;why is it better&amp;quot; part because I found the implementation took a lot of writing. For anyone that wants to do the other part, I&#039;d suggest comparing AVMs to PunkBuster and/or VAC, and a cloud computing service (focusing on the auditing). Cheers --[[User:Hirving|Hirving]] 19:44, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
- I started that what is better/worse part in the Critique section. I will add the comparison with AVMs to Punkbuster and/or VAC soon. I personally feel like there is not that much to write for the Critique section. -- [[User:Sschnei1|Sschnei1]] 20:39, 24 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
-Hey. I&#039;ve got a bit to add to your Critique section. It&#039;s mostly expanding on your last paragraph, plus a bit on how the tests were performed. I&#039;ll post my stuff later tonight; I just need to find some sources for my argument.--[[User:Pcox|Pcox]] 01:06, 25 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4610</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4610"/>
		<updated>2010-10-15T07:37:20Z</updated>

		<summary type="html">&lt;p&gt;Pcox: /* Conclusion */  added a bit to conclusion&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 80s, flash memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs; at the time, EPROMs needed UV photoemission to be erased, while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they need to be designed in light of flash memory&#039;s limited number of erase cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same solid-state design that gives flash its advantages in read speed and shock resistance. Thus, a typical disk-based file system is not suitable for flash memory, as it erases far too frequently and indiscriminately while being optimized for constraints that do not affect flash. A different solution is necessary: the log-based file system, which is far better suited to flash because it batches writes and defers erasures to idle-time garbage collection.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile storage (it does not require power to retain its contents) that has become more popular recently due to its fast fetch times. There are two basic forms of flash storage, NOR and NAND, each with its advantages and disadvantages. NOR has the fastest read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, lower cost, and a much longer life expectancy.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from USB keys of a few hundred megabytes to internal solid-state drives (SSDs) of a few terabytes. Two main reasons for this movement are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
This extreme read speed makes flash drives a preferred medium for storing games, effectively making loading times virtually non-existent. There is, however, a downside: games constantly save, modify, and change files, wearing out the blocks much more quickly. Flash drives have also been shown to be effective in web servers for serving CSS files or HTML pages.&lt;br /&gt;
&lt;br /&gt;
Although flash drives are much faster to read than HDDs, they have not yet become the main medium for data storage. The reason is that HDDs are simply much cheaper, and flash drives still have significant faults. The most critical is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even a single bit, the entire block must be erased and rewritten. This erase/rewrite cycle slows down the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
&lt;br /&gt;
The transistors that store the data are built with a thin strip of silicon oxide separating them. When the erase operation is applied to the block where the transistors are located, the system drives electrons across the strip, wiping whatever bits the transistors are holding.   &lt;br /&gt;
&lt;br /&gt;
HDDs use a block system in which the kernel specifies which blocks to read and write. When using a flash drive, the blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;translation layer&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel simply asks for a block number, a conventional hard disk drive (HDD) file system is not optimized for flash memory. Conventional hard disks have different constraints from flash: their primary problem is reducing seek time, while the primary problem with flash memory is erasing in a minimal and balanced way. &lt;br /&gt;
&lt;br /&gt;
The most time-consuming operation for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system optimizes data placement by putting related blocks close together on the disk, minimizing mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure used on HDDs to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures it entails are both inefficient and harmful for a flash memory unit. &lt;br /&gt;
&lt;br /&gt;
This comes directly from flash memory&#039;s aforementioned constraints: the slow block-sized erasures and the limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hot spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &#039;&#039;wear leveling&#039;&#039;. To minimize hot spots, a system using flash memory writes new data to empty memory blocks. This method also calls for some form of garbage collection to conduct the necessary erasures while the system is idle, which makes sense given how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system.&lt;br /&gt;
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Wear leveling ensures that the drive does not keep erasing and writing the same block over and over. This is achieved by writing data that doesn&#039;t change often to blocks that have already been erased frequently. Wear leveling tries to make all blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is implemented through a log-based file system, often referred to as the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer has a translation table in which each physical memory address is associated with an emulated block sector; this allows a traditional block-based file system to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is about to be written, the FTL marks the blocks needed as &#039;&#039;allocated&#039;&#039;, preventing other data from being written to them. The FTL then writes the data into the allocated blocks. Once the transaction completes, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is done, the drive marks the old, superseded blocks as &#039;&#039;invalid&#039;&#039; while marking the newly written blocks as &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that newly allocated blocks are never mixed up with invalidated blocks. &lt;br /&gt;
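The flagging sequence above can be sketched schematically (the state names come from the text; the dictionaries standing in for the drive are invented for illustration):

```python
# Block states used by the FTL flagging scheme described above.
FREE, ALLOCATED, PRE_VALID, VALID, INVALID = (
    "free", "allocated", "pre-valid", "valid", "invalid")

def ftl_update(states, data, old_block, new_block, new_data):
    """Write a new copy of some data without ever leaving the drive in a
    state where the new and stale copies could be confused."""
    assert states[new_block] == FREE
    states[new_block] = ALLOCATED     # reserve target so nothing else writes it
    data[new_block] = new_data        # perform the actual write
    states[new_block] = PRE_VALID     # write finished, but not yet live
    states[old_block] = INVALID       # retire the stale copy first...
    states[new_block] = VALID         # ...then promote the new copy
```

The ordering matters: the stale block is invalidated before the new block is promoted, so at no point do two blocks both claim to hold the valid copy.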
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially, a bank is a group of sequential addresses that keeps track of when it was last updated using timestamps. The FTL writes only to the current bank; once there is not enough space to write any more, it swaps the current bank for the one with the most available space. When cleaning a bank, the system moves it from the bank list into what is called the cleaning bank list, thus avoiding any chance of data being written to the bank while it is being erased. [8]&lt;br /&gt;
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of its valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks, and by not erasing each block as soon as it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive when the system is idle.&lt;br /&gt;
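A toy sketch of one such cleaning pass (the structures are illustrative, not taken from any real FTL):

```python
def clean_segment(segment, erase_count):
    """Garbage-collect one segment: copy out the valid pages, then erase
    the whole segment with a single (expensive) erase operation.

    segment: list of (state, data) pairs. Returns the surviving data,
    the freshly erased segment, and the updated erase count for wear
    tracking. Invalid pages are simply dropped.
    """
    survivors = [data for state, data in segment if state == "valid"]
    erased = [("free", None)] * len(segment)   # one block-sized erase
    return survivors, erased, erase_count + 1
```

The survivors would then be rewritten into a fresh segment; deferring the erase until many pages are invalid amortizes its cost over many writes.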
&lt;br /&gt;
===Why a Log File System is Efficient for a Flash Drive===&lt;br /&gt;
The file system writes only one bank at a time. This means the OS can save up small random writes and flush them all at once into a bank, cutting down on use of the expensive write command and improving the drive&#039;s overall performance.[9]&lt;br /&gt;
&lt;br /&gt;
If a collision occurs when writing a new bank, the file system sends the new data to an empty bank rather than erasing the existing bank and replacing it. This cuts down on use of the erase function, improving the life of the drive. Also, since this requires only a write command, rather than an erase command followed by a write command as in a traditional file system, it improves the performance of the drive as well.[16]&lt;br /&gt;
&lt;br /&gt;
===Systems Developed For Flash===&lt;br /&gt;
In 1999 a Swedish company by the name of Axis Communications developed and released a file system designed specifically to run on a flash drive. Instead of mapping each physical address to an emulated block sector, the system creates nodes that store data. The system keeps a log of the nodes and of when each node was last updated. It also keeps track of inodes, each of which lists the nodes holding its data; each node, in turn, records which inode it belongs to. When the drive is mounted, the system scans all the nodes on the drive and rebuilds the directory.&lt;br /&gt;
As the directory gets built, the nodes containing the data also map the physical location for each piece of data.[14] &lt;br /&gt;
&lt;br /&gt;
When writing data to the drive, the nodes carrying that data get attached to the end of the log.&lt;br /&gt;
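The mount-time scan described above might look like this sketch (the node layout is an assumption, not the actual on-flash format):&lt;br /&gt;

```python
def rebuild_directory(log):
    # Scan every node in the on-flash log and keep, for each inode,
    # the node with the highest version number.  The survivors give
    # the directory contents plus the physical location of each piece
    # of data (modelled here as the index of the node in the log).
    latest = {}
    for position, node in enumerate(log):
        inode = node["inode"]
        record = {"version": node["version"],
                  "data": node["data"],
                  "where": position}
        if inode not in latest:
            latest[inode] = record
        elif max(node["version"], latest[inode]["version"]) == node["version"]:
            latest[inode] = record
    return latest
```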
&lt;br /&gt;
==Conclusion==&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its use of banks for organizing data, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on minimizing seeks rather than on minimizing and managing erasures. Dealing smartly with erasures is extremely important for a flash memory file system, since that memory type&#039;s particular weaknesses, the limited number of erase cycles, the necessity of erasing a whole block at a time, and the relative slowness of erasure itself, all concern erasing. A good flash memory file system must therefore be built to work around these weaknesses, and this is precisely why older disk-based file systems are not suitable for flash memory while log-based file systems are.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
# Even though flash drives are dramatically faster than traditional HDDs, why are HDDs still the main method of data storage?&lt;br /&gt;
# Writing and erasing data are costly operations for a flash-based storage drive. Why does modifying data (even a single bit) take so much time?&lt;br /&gt;
# Why is the Flash Translation Layer so important to a flash drive&#039;s functionality? Why can you not use the traditional interface to deal with the block layer?&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1]  Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, LiPin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park, Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions on Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda, Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Advanced Research Laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, Accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4578</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4578"/>
		<updated>2010-10-15T06:49:11Z</updated>

		<summary type="html">&lt;p&gt;Pcox: /* Flash Optimized File Systems */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 80s, flash memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs: at the time, EPROMs needed UV photoemission to be erased, while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they must be designed in light of flash memory&#039;s limited number of erase cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same fully electronic, transistor-based design that gives flash its advantages in read speed and shock resistance, as both stem from the absence of moving mechanical parts. Thus, a typical disk-based file system is not suitable for working with flash memory, since it erases far too frequently and indiscriminately while being optimized for constraints that do not affect flash. A different solution is necessary, and that solution is the log-based file system, which is far better suited to flash because it optimizes erasures by writing new data out of place to empty blocks and deferring erasure to batched garbage collection.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile storage (meaning it does not require power to retain its contents) that has become more popular recently due to its fast fetch times. There are two basic forms of flash storage, NOR and NAND, each with its own advantages and disadvantages. NOR has the faster read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, a much longer life expectancy, and is less expensive.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from USB keys of a few hundred megabytes to internal solid-state drives (SSDs) of a few terabytes. Two main reasons for this shift are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
This extreme read speed makes flash drives a preferred medium for storing games, making loading times virtually non-existent. There is, however, a downside: games constantly save, modify, and change files, wearing out the blocks much more quickly. Flash drives have also been shown to be effective in web servers for serving CSS scripts and HTML pages.&lt;br /&gt;
&lt;br /&gt;
Although flash drives are dramatically faster than HDDs, they still have not become the main medium of data storage. The reason is that HDDs are simply much cheaper, and flash drives still have significant faults. The most critical is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even a single bit of it, the entire block must be erased and rewritten. This erase/rewrite cycle slows the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
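The cost of an in-place modification can be sketched like this (a toy model, with the block as a plain list of bits):&lt;br /&gt;

```python
def modify_bit(block, bit_index, value, erase_counts, block_id):
    # Changing even one bit means reading the whole block, erasing the
    # block, and rewriting every bit, which consumes one of the block's
    # limited (roughly 100,000) erase cycles.
    data = list(block)            # read the entire block into memory
    data[bit_index] = value       # flip just one bit in the copy
    erase_counts[block_id] += 1   # erase the whole block (slow, wears it)
    return data                   # write the whole block back
```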
&lt;br /&gt;
The transistors that store the data are built with a thin strip of silicon oxide separating them. When the erase operation is called on the block where the transistors are located, the system fires electrons across the strip, wiping whatever bits the transistors are holding.&lt;br /&gt;
&lt;br /&gt;
HDDs use a block system in which the kernel specifies which blocks to read and write. On a flash drive, blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;Translation Layer&amp;quot;.&lt;br /&gt;
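A minimal sketch of such a translation layer (the table and counter here are assumptions; real implementations are far more involved):&lt;br /&gt;

```python
class TranslationLayer:
    # The kernel keeps asking for logical block numbers; the translation
    # layer maps each one to its current physical flash address, so an
    # unmodified block-based file system can run on top of flash.
    def __init__(self):
        self.table = {}     # logical block number to physical address
        self.next_free = 0  # next empty physical location

    def write(self, logical_block):
        # Out-of-place update: the logical block is simply remapped to a
        # fresh physical location instead of erasing the old one.
        self.table[logical_block] = self.next_free
        self.next_free += 1
        return self.table[logical_block]

    def read(self, logical_block):
        return self.table[logical_block]
```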
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel asks for a block number, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. Conventional hard disks have different constraints from flash memory: their primary problem is reducing seek time, while the primary problem with flash memory is erasing in a minimal and balanced way.&lt;br /&gt;
&lt;br /&gt;
The most time-consuming process for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system therefore optimizes data placement, putting related blocks close together on the disk to minimize mechanical movement. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure HDDs use to put files into more convenient arrangements and thus minimize seek times, loses its purpose in a flash memory context; indeed, the unnecessary erasures it entails are both inefficient and harmful for a flash memory unit.&lt;br /&gt;
&lt;br /&gt;
This comes directly out of flash memory&#039;s aforementioned constraints: the slow block-sized erasures and the limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread its erasures out so as to avoid the formation of hot-spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &amp;quot;wear leveling&amp;quot;. To minimize hot-spots, a system using flash memory writes new data to empty memory blocks. This approach also calls for some form of garbage collection to conduct the necessary erase operations while the system is idle, which makes sense given how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system.&lt;br /&gt;
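In its simplest form, the wear-leveling choice is just a selection among free blocks (a sketch, assuming per-block erase counters are kept):&lt;br /&gt;

```python
def pick_block_for_write(free_blocks, erase_counts):
    # Among the empty blocks, write to the one erased the fewest times,
    # so no block becomes a hot-spot and all blocks use up their limited
    # erase cycles at roughly the same pace.
    return min(free_blocks, key=lambda b: erase_counts[b])
```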
&lt;br /&gt;
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
The process of wear leveling ensures that the drive does not keep erasing and writing the same block over and over. This is achieved by writing data that does not change often to blocks that have been erased frequently. Wear leveling tries to make all the blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is achieved through a log-based file system, often referred to as the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer has a translation table in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is being written to, the FTL marks the blocks it needs as &#039;&#039;allocated&#039;&#039;, preventing other data from being written to them. The FTL then writes the data into the allocated blocks. Once it completes the transaction, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is done, the drive marks the superseded blocks &#039;&#039;invalid&#039;&#039; while marking the newly written blocks &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that newly allocated blocks are never mixed up with invalidated blocks.&lt;br /&gt;
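The flag sequence for one out-of-place update can be sketched as follows (the state names follow the text; the function itself is illustrative):&lt;br /&gt;

```python
def update_block(flags, old, new):
    # One out-of-place update, following the flag states named above.
    flags[new] = "allocated"   # reserve the target so nothing else uses it
    # ... the data is written into the allocated block here ...
    flags[new] = "pre-valid"   # write finished, but not yet switched over
    flags[old] = "invalid"     # retire the stale copy
    flags[new] = "valid"       # the new block now holds the live data
    return flags
```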
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it consults a bank list to determine which area of the drive should be used. A bank is essentially a group of sequential addresses that records, via timestamps, when it was last updated. The FTL writes only to the current bank; once there is not enough space left, it swaps the current bank for the one with the most available space. When a bank is to be cleaned, the system moves it from the Bank List to the Cleaning Bank List, thus avoiding any chance of data being written to the bank while it is being erased.&lt;br /&gt;
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of its valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks, and by not erasing every block as soon as it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive while the system is idle.&lt;br /&gt;
&lt;br /&gt;
===Why a Log File System is Efficient for Flash Drives===&lt;br /&gt;
The file system writes only one bank at a time. This means the OS can save up small random writes and commit them all at once as a single bank write, cutting down on expensive write commands and improving overall drive performance.[9]&lt;br /&gt;
&lt;br /&gt;
If a collision occurs when writing a new bank, the file system sends the new data to an empty bank rather than erasing the existing bank and replacing it. This cuts down on use of the erase function, improving the life of the drive. And since it performs only a write command, not an erase command followed by a write command as a traditional file system would, performance improves as well.[16]&lt;br /&gt;
&lt;br /&gt;
===Systems Developed For Flash===&lt;br /&gt;
In 1999 a Swedish company by the name of Axis Communications developed and released a file system designed specifically to run on a flash drive. Instead of mapping each physical address to an emulated block sector, the system creates nodes that store data. The system also keeps track of inodes; each inode keeps a list of the nodes that hold its data. &lt;br /&gt;
&lt;br /&gt;
developed by Axis Communications and first released in 1999&lt;br /&gt;
- nodes containing metadata and data are stored sequentially on the device&lt;br /&gt;
- each node is associated with a single inode&lt;br /&gt;
- starts with a header containing the inode number of the inode to which it belongs and all the current file system metadata for that inode&lt;br /&gt;
- node also stores a version number, written higher than that of all previous nodes belonging to the same inode&lt;br /&gt;
- also stores the inode number, which is never reused&lt;br /&gt;
- inode metadata includes uid, gid, mtime, atime, ctime, etc&lt;br /&gt;
- inode points to several nodes&lt;br /&gt;
&lt;br /&gt;
OPERATION&lt;br /&gt;
- the entire drive is scanned when it is mounted&lt;br /&gt;
- the raw nodes have the information needed to rebuild the directory and a map of the physical location of each inode&lt;br /&gt;
- when writing files, a node holding the data is attached to the end of the log&lt;br /&gt;
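The per-node header sketched in these notes might be modelled like so (field names are illustrative, not the actual on-flash layout):&lt;br /&gt;

```python
import collections

# One log node: the header fields from the notes above, plus the data.
Node = collections.namedtuple("Node", [
    "inode",    # number of the inode this node belongs to (never reused)
    "version",  # strictly higher than all earlier nodes of this inode
    "uid", "gid", "mtime", "atime", "ctime",  # current inode metadata
    "data",     # the file data carried by this node
])

def append_node(log, node):
    # Writing a file attaches a node holding the data to the end of the log.
    log.append(node)
    return len(log) - 1   # the node's physical position on the device
```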
&lt;br /&gt;
==Conclusion==&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its use of banks for organizing data, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on minimizing seeks rather than on minimizing and managing erasures. Dealing smartly with erasures is extremely important for a flash memory file system, since that memory type&#039;s particular weaknesses, the limited number of erase cycles, the necessity of erasing a whole block at a time, and the relative slowness of erasure itself, all concern erasing. A good flash memory file system must therefore be built to work around these weaknesses, and this is precisely why older disk-based file systems are not suitable for flash memory while log-based file systems are.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1]  Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, LiPin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park, Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions on Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda, Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Advanced Research Laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, Accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4564</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4564"/>
		<updated>2010-10-15T06:02:00Z</updated>

		<summary type="html">&lt;p&gt;Pcox: Pull out a paragraph in Plash memory. Points already mention above&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 80s, flash memory is a light, energy-independent, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs: at the time, EPROMs needed UV photoemission to be erased, while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they must be designed in light of flash memory&#039;s limited number of erase cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same fully electronic, transistor-based design that gives flash its advantages in read speed and shock resistance, as both stem from the absence of moving mechanical parts. Thus, a typical disk-based file system is not suitable for working with flash memory, since it erases far too frequently and indiscriminately while being optimized for constraints that do not affect flash. A different solution is necessary, and that solution is the log-based file system, which is far better suited to flash because it optimizes erasures by writing new data out of place to empty blocks and deferring erasure to batched garbage collection.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile storage (meaning it does not require power to retain its contents) that has become more popular recently due to its fast fetch times. There are two basic forms of flash storage, NOR and NAND, each with its own advantages and disadvantages. NOR has the faster read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, a much longer life expectancy, and is less expensive.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from USB keys of a few hundred megabytes to internal solid-state drives (SSDs) of a few terabytes. Two main reasons for this shift are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
This extreme read speed makes flash drives a preferred medium for storing games, making loading times virtually non-existent. There is, however, a downside: games constantly save, modify, and change files, wearing out the blocks much more quickly. Flash drives have also been shown to be effective in web servers for serving CSS scripts and HTML pages.&lt;br /&gt;
&lt;br /&gt;
Although flash drives are dramatically faster than HDDs, they still have not become the main medium of data storage. The reason is that HDDs are simply much cheaper, and flash drives still have significant faults. The most critical is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even a single bit of it, the entire block must be erased and rewritten. This erase/rewrite cycle slows the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
&lt;br /&gt;
The transistors that store the data are built with a thin layer of silicon oxide separating them. When the erase operation is called on the block where the transistors are located, the system drives electrons across this oxide layer, wiping whatever bits the transistors are holding.&lt;br /&gt;
&lt;br /&gt;
HDDs use a block system, in which the kernel specifies which blocks to read and write. When a flash drive is used, these blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;translation layer&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel asks for a block number, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. The reason is that conventional hard disks have different constraints from flash memory: their primary problem is reducing seek time, while the primary problem with flash memory is erasing in a minimal and balanced way. &lt;br /&gt;
&lt;br /&gt;
The most time-consuming operation for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system therefore optimizes the way it stores data by placing related blocks close together on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure used on HDDs to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures it entails are both inefficient and harmful for a flash memory unit. &lt;br /&gt;
&lt;br /&gt;
This follows directly from flash memory&#039;s aforementioned constraints: the slow block-sized erasures and the limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hot spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &amp;quot;wear leveling&amp;quot;. To minimize hot spots, a system using flash memory should write new data to empty memory blocks. This approach also calls for some form of garbage collection to conduct the necessary erasures while the system is idle; it makes better sense to do them then because of how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system. &lt;br /&gt;
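A minimal wear-leveling policy can be sketched as follows. This is a hypothetical illustration (the class and method names are invented, and real controllers use far more elaborate policies): each new write is simply directed to the free block with the fewest erasures, so no block becomes a hot spot.

```python
# Hypothetical wear-leveling sketch: always allocate the least-worn block.
import heapq

class WearLeveler:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        # min-heap of (erase_count, block_id) over currently free blocks
        self.free = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.free)

    def allocate(self):
        """Pick the least-erased free block for the next write."""
        count, block = heapq.heappop(self.free)
        return block

    def invalidate(self, block):
        """A block's data went stale: erase it and return it to the pool."""
        self.erase_counts[block] += 1
        heapq.heappush(self.free, (self.erase_counts[block], block))

wl = WearLeveler(4)
used = [wl.allocate() for _ in range(4)]  # writes spread across all 4 blocks
```

Because the heap always surfaces the block with the lowest erase count, repeated write/invalidate cycles consume the erase budget of every block at roughly the same pace.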
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
The process of &amp;quot;wear leveling&amp;quot; ensures that the drive does not keep erasing and writing to the same block over and over. This is achieved by writing data that doesn&#039;t change often to blocks that have been erased frequently. Wear leveling tries to make all the blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is achieved through a log-based file system, often referred to as the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer has a translation table, in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is about to be written to, the FTL marks the blocks needed as &#039;&#039;allocated&#039;&#039;. This prevents other data from being written to a block that has already been allocated. The FTL then writes the data to the allocated blocks. Once it completes the transaction, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is completed, the drive marks the invalidated blocks as &#039;&#039;invalid&#039;&#039;, while marking the newly written blocks as &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that the newly allocated blocks are never mixed up with the invalidated blocks. &lt;br /&gt;
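The flagging sequence above (allocated, then pre-valid, then valid, with the stale copy marked invalid) can be sketched as a tiny state machine. This is a simplified, hypothetical model for illustration only; the class name and the write interface are invented and no real FTL exposes an API like this.

```python
# Sketch of the FTL flagging sequence described in the text (names invented).
ERASED, ALLOCATED, PRE_VALID, VALID, INVALID = range(5)

class FTL:
    def __init__(self, num_blocks):
        self.state = [ERASED] * num_blocks
        self.table = {}  # translation table: logical sector to physical block

    def write(self, sector, physical):
        old = self.table.get(sector)        # where the stale copy lives, if any
        self.state[physical] = ALLOCATED    # reserve the target block
        # ... the actual data transfer to `physical` would happen here ...
        self.state[physical] = PRE_VALID    # transfer completed
        if old is not None:
            self.state[old] = INVALID       # retire the stale copy
        self.state[physical] = VALID        # new copy is now authoritative
        self.table[sector] = physical       # remap the logical sector

ftl = FTL(4)
ftl.write(sector=0, physical=1)  # first write of logical sector 0
ftl.write(sector=0, physical=2)  # the update goes to a fresh block instead
```

Note that the update never touches block 1 in place: the old block is simply flagged invalid and left for the cleaner, which is what saves the erase cycle.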
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially, a bank is a group of sequential addresses that keeps track of when it was last updated using timestamps. The FTL will only write to the current bank; once there is not enough space to write any more, it switches the current bank out for the one with the most available space. When cleaning up a bank, the system moves it from the Bank List into what is called the Cleaning Bank List, thus avoiding any chance of data being written to that bank while something is being erased.&lt;br /&gt;
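A rough model of that bookkeeping is sketched below. The structure is invented for illustration (the text does not specify these data structures): one current bank receives all writes, a full bank is swapped for the emptiest one, and a bank being cleaned is moved to a separate list so no write can land on it.

```python
# Illustrative model of the bank lists described above (structure invented).
import time

class Bank:
    def __init__(self, size):
        self.free = size            # space remaining in this bank
        self.updated = time.time()  # timestamp of the last update

class BankManager:
    def __init__(self, num_banks, bank_size):
        self.banks = [Bank(bank_size) for _ in range(num_banks)]  # Bank List
        self.cleaning = []          # Cleaning Bank List: ineligible for writes
        self.current = self.banks[0]

    def write(self, n):
        if n > self.current.free:
            # current bank is too full: swap in the bank with the most space
            self.current = max(self.banks, key=lambda b: b.free)
        self.current.free -= n
        self.current.updated = time.time()

    def start_cleaning(self, bank):
        self.banks.remove(bank)     # writes can no longer be routed here
        self.cleaning.append(bank)  # erase proceeds without write conflicts

mgr = BankManager(num_banks=2, bank_size=100)
mgr.write(80)
mgr.write(80)  # does not fit in the current bank, so the FTL switches banks
```

Keeping the two lists disjoint is what guarantees that an erase in progress can never race with an incoming write to the same bank.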
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks, and by not erasing every block as soon as it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive when the system is idle.&lt;br /&gt;
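The cleaning routine reduces to a few lines. The sketch below is a simplification (segments are modeled as plain lists of pages, which is an assumption of this example): live pages are relocated, and the whole victim segment is then reclaimed with a single erase, no matter how many invalidated pages it held.

```python
# Minimal sketch of the garbage collection routine described above.
def clean(segment, fresh_segment):
    """segment: list of (data, is_valid) pages. Returns erases performed."""
    for data, is_valid in segment:
        if is_valid:
            fresh_segment.append((data, True))  # relocate the live data
    segment.clear()  # one erase frees every invalidated page at once
    return 1         # the entire segment cost a single erase operation

old_segment = [("a", True), ("b", False), ("c", True), ("d", False)]
new_segment = []
erases = clean(old_segment, new_segment)
```

Here two stale pages are reclaimed for the price of one erase; erasing each block the moment it was invalidated would have cost two.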
&lt;br /&gt;
==Conclusion==&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its sequential, append-only writes, its wear leveling and its deferred garbage collection, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type&#039;s particular weaknesses (the limited number of erase cycles, the necessity of erasing by the block, and the relative slowness of erasures themselves) all have to do with erasing. A good flash memory file system must therefore be built to work around these weaknesses, and this is precisely why older disk-based file systems are not suitable for flash memory while log-based file systems are.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1]  Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, LiPin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park; Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions On Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039; Advanced Research laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-Ieee Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-Ieee Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4526</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4526"/>
		<updated>2010-10-15T05:01:14Z</updated>

		<summary type="html">&lt;p&gt;Pcox: A little history in the intro&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 80s, flash memory is a light, non-volatile, compact, shock-resistant and efficiently readable type of storage. It started out as a replacement for EPROMs, which at the time had to be exposed to UV light to be erased, while flash memory could be erased electronically.[7] Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they need to be designed around flash memory&#039;s limited number of erase cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same cell design that gives flash its advantages in speed, density and durability, as both the benefits and the limitations stem from storing data electrically in transistor cells rather than on mechanical media. Thus, a typical disk-based file system is not suitable for working with flash memory, as it erases far too frequently and indiscriminately while being optimized for constraints that do not affect flash. A different solution is necessary, and that solution is the log-based file system, which is far better suited to flash because it optimizes erasures: it writes updates sequentially to fresh blocks and defers erasures until garbage collection can reclaim whole segments at once.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile (meaning it does not require power to retain its contents) storage that has become more popular recently due to its fast fetch times. There are two basic forms of the flash storage system, NOR and NAND, each with its advantages and disadvantages. NOR has the fastest read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from USB keys of a few hundred megabytes to internal solid-state drives (SSDs) of a few terabytes. Two main reasons for this movement are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
Although flash drives are dramatically faster than HDDs, they still have not become the main medium for data storage. The reason is that HDDs are simply much cheaper, and flash drives still have significant faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even by a single bit, the entire block must be erased and rewritten. This erase/rewrite cycle slows down the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1. What flash storage is, why it&#039;s good but also why it must have the problems that it does (the assumption is that it must have them; why would it otherwise?) [don&#039;t know much about this just now... basics include that there is NOR (reads slightly faster) and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash, with NAND being especially popular for storage (what&#039;s NOR good for?). Here, we&#039;d ideally want to talk about why flash was invented (supposedly as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we&#039;d want to mention why this technical functionality was innovative and useful but also why it came with two serious setbacks: having a limited number of re-write cycles and needing to erase a block at a time.]&#039;&#039;&#039;&lt;br /&gt;
&#039;&#039;&#039;Either way, flash storage affords far faster fetch times than the traditional platter-based HDD, and a kind of stability of information. Since the data is reprogrammed rather than mechanically recorded, it is, in a sense, more secure and less likely to be erased easily. On that note, in order to flip a single bit, the entire block needs to be erased, then reprogrammed. With an &#039;old&#039; HDD, when the drive fails at the end of its life cycle, your data is gone (unless you&#039;re willing to shell out $200/hr to have it recovered; yes, I&#039;ve seen companies in Ottawa that do this). When a flash drive reaches the end of its life, it merely becomes read-only. A bugger for databases, but useful for technical notes and archives, let&#039;s say. With today&#039;s modern gaming computers, flash memory can provide quick load times; however, with limited read-writes, it could afford better use to things that are not updated as frequently. I.e... well, I don&#039;t have a better example than a web server hosting a company&#039;s CSS and scripts. ~Source: Years in the &#039;biz&#039;&#039;&#039;&#039;&lt;br /&gt;
&#039;&#039;&#039;Flash memory started out as a replacement for EPROMs. At the time, EPROMs needed to be exposed to UV light to be erased, while flash memory could be erased electronically. The first flash memory product came out in 1988, but it did not take off until the late 1990s because it could not be reliably produced. NOR and NAND memory are named after the arrangement of the cells in the memory array. NOR-based flash memory benefits from very fast burst read times but slower write times. Due to the structure of NOR memory, programs stored in it can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively fast. NAND is more suited for storage, while NOR memory is better suited for direct program execution, such as in CMOS chips.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
HDDs use a block system, in which the kernel specifies which blocks to read and write. When a flash drive is used, these blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;translation layer&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel asks for a block number, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. The reason is that conventional hard disks have different constraints from flash memory: their primary problem is reducing seek time, while the primary problem with flash memory is erasing in a minimal and balanced way. &lt;br /&gt;
&lt;br /&gt;
The most time-consuming operation for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system therefore optimizes the way it stores data by placing related blocks close together on the disk in order to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure used on HDDs to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures it entails are both inefficient and harmful for a flash memory unit. &lt;br /&gt;
&lt;br /&gt;
This follows directly from flash memory&#039;s aforementioned constraints: the slow block-sized erasures and the limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hot spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &amp;quot;wear leveling&amp;quot;. To minimize hot spots, a system using flash memory should write new data to empty memory blocks. This approach also calls for some form of garbage collection to conduct the necessary erasures while the system is idle; it makes better sense to do them then because of how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system. &lt;br /&gt;
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
The process of &amp;quot;wear leveling&amp;quot; ensures that the drive does not keep erasing and writing to the same block over and over. This is achieved by writing data that doesn&#039;t change often to blocks that have been erased frequently. Wear leveling tries to make all the blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is achieved through a log-based file system, often referred to as the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer has a translation table, in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is about to be written to, the FTL marks the blocks needed as &#039;&#039;allocated&#039;&#039;. This prevents other data from being written to a block that has already been allocated. The FTL then writes the data to the allocated blocks. Once it completes the transaction, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is completed, the drive marks the invalidated blocks as &#039;&#039;invalid&#039;&#039;, while marking the newly written blocks as &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that the newly allocated blocks are never mixed up with the invalidated blocks. &lt;br /&gt;
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. Essentially, a bank is a group of sequential addresses that keeps track of when it was last updated using timestamps. The FTL will only write to the current bank; once there is not enough space to write any more, it switches the current bank out for the one with the most available space. When cleaning up a bank, the system moves it from the Bank List into what is called the Cleaning Bank List, thus avoiding any chance of data being written to that bank while something is being erased.&lt;br /&gt;
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL realizes that there is not enough room to write new data onto the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of the valid data into a new segment, then erases everything in the old segment. This frees up the otherwise useless invalidated blocks, and by not erasing every block as soon as it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive when the system is idle.&lt;br /&gt;
&lt;br /&gt;
=Conclusion=&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its sequential, append-only writes, its wear leveling and its deferred garbage collection, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on the minimization of seeks rather than on the minimization and management of erasures. Dealing smartly with erasures is extremely important for a flash memory file system, as that memory type&#039;s particular weaknesses (the limited number of erase cycles, the necessity of erasing by the block, and the relative slowness of erasures themselves) all have to do with erasing. A good flash memory file system must therefore be built to work around these weaknesses, and this is precisely why older disk-based file systems are not suitable for flash memory while log-based file systems are.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1]  Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, LiPin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park; Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions On Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039; Advanced Research laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-Ieee Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-Ieee Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4508</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4508"/>
		<updated>2010-10-15T04:42:25Z</updated>

		<summary type="html">&lt;p&gt;Pcox: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 1980s, flash memory is a light, non-volatile, compact, shock-resistant and efficiently readable type of storage.  Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they must be designed around flash memory&#039;s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages in speed, size and durability, as both stem from its use of electrically erasable cells with no moving parts. Thus, a typical disk-based file system is not suitable for working with flash memory, as it erases far too frequently and indiscriminately while being simultaneously optimized for constraints, such as seek time, that do not affect flash memory. A different solution is therefore necessary, and that solution is the log-based file system, which is far better suited to flash memory because it optimizes erasures by appending new data to fresh blocks and deferring erasures so that they can be batched.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile storage (digital storage that does not require power to retain its contents) that has become more popular recently due to its fast fetch times. There are two basic forms of flash storage, NOR and NAND, each with its advantages and disadvantages. NOR has the fastest read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from a USB key holding a few hundred megabytes to an internal solid-state drive (SSD) holding terabytes. Two main reasons for this movement are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
Although flash drives are dramatically faster than HDDs, they still have not become the main medium for data storage. This is because HDDs are simply much cheaper, and flash drives still have significant faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even by a single bit, the entire block must be erased and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
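The cost just described can be made concrete. The following is a minimal sketch; the 128 KiB erase-block size is an illustrative assumption, not a figure from the cited sources, while the 100,000-erase limit is the one quoted above.&lt;br /&gt;

```python
# Why a one-bit change is expensive on flash: the containing erase block
# must be erased in full and rewritten. The block size is illustrative.

BLOCK_SIZE = 128 * 1024   # assumed erase-block size, in bytes
ERASE_LIMIT = 100_000     # approximate erases a block survives [14]

def in_place_update_cost(update_bytes: int) -> int:
    """Bytes physically rewritten for an in-place update on flash."""
    # No matter how small the change, the whole block is erased and rewritten.
    return BLOCK_SIZE

amplification = in_place_update_cost(1)   # a 1-byte edit rewrites 131072 bytes

# A block rewritten in place once per second would exhaust its erase
# budget in roughly ERASE_LIMIT seconds, i.e. a little over a day.
```

This is exactly the write amplification that the file systems discussed below are designed to avoid.&lt;br /&gt;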
&lt;br /&gt;
Flash memory started out as a replacement for EPROMs, which at the time required exposure to UV light to be erased, whereas flash memory could be erased electrically. The first flash memory product appeared in 1988, but the technology did not take off until the late 1990s because it could not be reliably produced. The names NOR and NAND refer to the arrangement of the cells in the memory array. NOR-based flash benefits from very fast burst read times but slower writes, and because of its structure, programs stored in NOR memory can be executed in place without first being loaded into RAM; it is therefore better suited to direct program execution, such as firmware in embedded devices. NAND flash has a much larger storage capacity and can read and write large files relatively quickly, making it the preferred choice for storage.&lt;br /&gt;
Flash storage also offers a measure of stability at the end of its life. When a conventional HDD fails, its data is generally gone short of expensive professional recovery, whereas a flash drive that exhausts its erase cycles typically becomes read-only: a problem for a database, but serviceable for archives. For the same reason, flash is best applied to read-heavy workloads, such as a web server hosting rarely-updated stylesheets and scripts, rather than to data that changes constantly.&lt;br /&gt;
&lt;br /&gt;
HDDs use a block system in which the kernel specifies which blocks to read and write. When using a flash drive, those blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;Flash Translation Layer&amp;quot; (FTL).&lt;br /&gt;
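How such a translation layer behaves can be sketched in a few lines. This is an illustration under assumed structures (the class and field names are invented for the sketch, not the interface of any real FTL): the kernel addresses a logical block, and every write is redirected to a fresh physical block instead of erasing in place.&lt;br /&gt;

```python
# Minimal sketch of a logical-to-physical translation table. A write to a
# logical block never rewrites the old physical block; it takes a fresh
# one, updates the mapping, and leaves the old block to be erased later.

class TranslationLayer:
    def __init__(self, num_physical_blocks: int):
        self.mapping = {}                       # logical block -> physical block
        self.free = list(range(num_physical_blocks))
        self.flash = {}                         # physical block -> stored data

    def write(self, logical: int, data: bytes) -> None:
        new_phys = self.free.pop(0)             # always take a fresh block
        old_phys = self.mapping.get(logical)
        if old_phys is not None:
            self.flash.pop(old_phys)            # old copy is now invalid
            self.free.append(old_phys)          # reclaimable after an erase
        self.mapping[logical] = new_phys
        self.flash[new_phys] = data

    def read(self, logical: int) -> bytes:
        return self.flash[self.mapping[logical]]

ftl = TranslationLayer(num_physical_blocks=8)
ftl.write(0, b"v1")
ftl.write(0, b"v2")   # remaps logical block 0 rather than erasing in place
```

The file system above the FTL keeps asking for "block 0"; only the mapping underneath moves.&lt;br /&gt;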
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel simply asks for a block number, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. Conventional hard disks have different constraints from those of flash memory: their primary problem is to reduce seek time, while the primary problem when working with flash memory is to erase in a minimal and balanced way. &lt;br /&gt;
&lt;br /&gt;
The most time-consuming process for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system therefore optimizes the way it stores data by placing related blocks close together on the disk, minimizing mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure HDDs use to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures it entails are both inefficient and harmful to a flash memory unit. &lt;br /&gt;
&lt;br /&gt;
This follows directly from flash memory&#039;s aforementioned constraints: slow block-sized erasures and a limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hot spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &amp;quot;wear leveling&amp;quot;. To minimize hot spots, a system using flash memory must write new data to empty memory blocks. This approach also calls for some form of garbage collection to conduct the necessary erase operations while the system is idle, which makes sense given how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
The process of &amp;quot;wear leveling&amp;quot; ensures that the drive does not keep erasing and writing the same block over and over. This is achieved by writing data that doesn&#039;t change often to blocks that have already been erased frequently.  Wear leveling tries to make all blocks use up their write cycles at an even pace, increasing the overall life of the drive.[3] This is achieved through a log-based scheme in the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer maintains a translation table in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is about to be written, the FTL marks the blocks needed as &#039;&#039;allocated&#039;&#039;, preventing other data from being written to them. The FTL then writes the data into the allocated blocks. Once the transaction completes, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is done, the drive marks the superseded blocks as &#039;&#039;invalid&#039;&#039;, while marking the newly written blocks as &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that the newly allocated blocks are never mixed up with the invalidated blocks. &lt;br /&gt;
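The flagging sequence just described can be sketched as follows. This is an illustration of the state transitions only; the function and constant names are invented for the sketch. New blocks move from free to allocated to pre-valid to valid, and the superseded blocks are marked invalid before the new copy is declared valid, so the two sets can never be confused.&lt;br /&gt;

```python
# Sketch of the FTL block-state flags for one logical write.

FREE, ALLOCATED, PRE_VALID, VALID, INVALID = range(5)

def ftl_write(states, new_blocks, old_blocks):
    """Apply the flag transitions for one write; returns updated states."""
    for b in new_blocks:              # 1. reserve the target blocks
        assert states[b] == FREE
        states[b] = ALLOCATED
    for b in new_blocks:              # 2. data written, but not yet live
        states[b] = PRE_VALID
    for b in old_blocks:              # 3. retire the superseded copy
        states[b] = INVALID
    for b in new_blocks:              # 4. the new copy becomes the live one
        states[b] = VALID
    return states

# Blocks 2 and 3 hold the old copy; blocks 0 and 1 receive the new one.
states = ftl_write([FREE, FREE, VALID, VALID], new_blocks=[0, 1], old_blocks=[2, 3])
```

If power fails mid-write, the pre-valid flag tells the drive which copy to trust.&lt;br /&gt;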
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. A bank is essentially a group of sequential addresses that keeps track, via timestamps, of when it was last updated. The FTL writes only to the current bank; once there is not enough space left, it switches the current bank out for the one with the most available space. When a bank is to be cleaned, the system moves it from the bank list into what is called the cleaning bank list, eliminating any chance of data being written to the bank while it is being erased.&lt;br /&gt;
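A sketch of this bookkeeping, with structure names invented for the illustration: it shows only the selection rule (fill the current bank, then switch to the bank with the most free space) and how a bank under cleaning is kept off the writable list.&lt;br /&gt;

```python
# Minimal model of bank selection in the FTL.

class Bank:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.last_updated = 0          # timestamp of the last write

    def free_space(self):
        return self.capacity - self.used

def pick_bank(bank_list, current, size, now):
    """Return the bank that a write of `size` units goes into."""
    if current.free_space() < size:    # current bank is full:
        current = max(bank_list, key=Bank.free_space)  # most available space
    current.used += size
    current.last_updated = now
    return current

banks = [Bank(4), Bank(4), Bank(4)]
cleaning_list = [banks.pop(2)]   # a bank being cleaned leaves the bank list
b = pick_bank(banks, banks[0], 3, now=1)
b = pick_bank(banks, b, 3, now=2)      # bank 0 is full, so the FTL switches
```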
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL finds that there is not enough room to write new data to the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of its valid data into a new segment, then erases everything in the old segment. This frees the otherwise useless invalidated blocks, and by not erasing every block the moment it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive while the system is idle.&lt;br /&gt;
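The cleaner&#039;s routine can be sketched similarly, again with an assumed segment layout: valid pages survive into a new segment, and the old segment is reclaimed with a single erase, however many invalid pages it held.&lt;br /&gt;

```python
# Sketch of segment cleaning: compact the valid pages, pay for one erase.

def clean_segment(segment, erase_counts, seg_id):
    """Return the compacted segment; charge exactly one erase to seg_id."""
    survivors = [page for page in segment if page["valid"]]
    erase_counts[seg_id] = erase_counts.get(seg_id, 0) + 1  # one erase total
    return survivors

segment = [{"data": "a", "valid": True},
           {"data": "b", "valid": False},   # a superseded copy
           {"data": "c", "valid": True}]
erase_counts = {}
compacted = clean_segment(segment, erase_counts, seg_id=7)
```

Batching the reclamation this way is what lets the log-based design amortize the slow erase operation across many invalidations.&lt;br /&gt;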
&lt;br /&gt;
=Conclusion=&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its sequential, append-only writes and its deferred, batched erasures, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on minimizing seeks rather than on minimizing and managing erasures. Dealing intelligently with erasures is critical for a flash memory file system, since all of this memory type&#039;s particular weaknesses, the limited number of erase cycles, the necessity of erasing a whole block at a time and the relative slowness of erasure itself, have to do with erasing. A good flash file system must be built to work around these weaknesses, and this is precisely why older disk-based file systems are unsuitable for flash memory while log-based file systems thrive on it.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1] Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEE Xplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, Li-Pin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci., Nat. Chiao Tung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park, Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions On Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEE Xplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda, Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Advanced Research Laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnAndTech&#039;&#039;. AnAndTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, Accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4496</id>
		<title>COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=COMP_3000_Essay_1_2010_Question_10&amp;diff=4496"/>
		<updated>2010-10-15T04:36:21Z</updated>

		<summary type="html">&lt;p&gt;Pcox: added wear leveling to Flash Optimized File Systems&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Question=&lt;br /&gt;
&lt;br /&gt;
How do the constraints of flash storage affect the design of flash-optimized file systems? Explain by contrasting with hard disk-based file systems.&lt;br /&gt;
&lt;br /&gt;
=Answer=&lt;br /&gt;
First introduced in the late 1980s, flash memory is a light, non-volatile, compact, shock-resistant and efficiently readable type of storage.  Because of the particular limitations of this kind of memory, flash file systems require a fundamentally different architecture than disk-based file systems: they must be designed around flash memory&#039;s limited number of erase-cycles and its need to conduct erasures one entire block at a time. These constraints are a direct result of the same design that gives flash its advantages in speed, size and durability, as both stem from its use of electrically erasable cells with no moving parts. Thus, a typical disk-based file system is not suitable for working with flash memory, as it erases far too frequently and indiscriminately while being simultaneously optimized for constraints, such as seek time, that do not affect flash memory. A different solution is therefore necessary, and that solution is the log-based file system, which is far better suited to flash memory because it optimizes erasures by appending new data to fresh blocks and deferring erasures so that they can be batched.&lt;br /&gt;
&lt;br /&gt;
==Flash Memory==&lt;br /&gt;
Flash memory is non-volatile storage (digital storage that does not require power to retain its contents) that has become more popular recently due to its fast fetch times. There are two basic forms of flash storage, NOR and NAND, each with its advantages and disadvantages. NOR has the fastest read times but is much slower at writing. NAND, on the other hand, has much more capacity, faster write times, is less expensive, and has a much longer life expectancy.[2]&lt;br /&gt;
&lt;br /&gt;
More and more people use flash memory, in drives of many sizes, ranging from a USB key holding a few hundred megabytes to an internal solid-state drive (SSD) holding terabytes. Two main reasons for this movement are flash&#039;s extremely fast read times and its falling price. A typical flash drive has read speeds up to 14 times faster than a hard disk drive (HDD).[17]&lt;br /&gt;
&lt;br /&gt;
Although flash drives are dramatically faster than HDDs, they still have not become the main medium for data storage. This is because HDDs are simply much cheaper, and flash drives still have significant faults. The most critical fault is that each block in flash memory can only be erased approximately 100,000 times.[14] This poses a problem because when modifying a file, even by a single bit, the entire block must be erased and rewritten. This erase/rewrite slows down the write operation considerably, making it actually slower to write a file to flash than to an HDD.[8]&lt;br /&gt;
&lt;br /&gt;
Flash memory started out as a replacement for EPROMs, which at the time required exposure to UV light to be erased, whereas flash memory could be erased electrically. The first flash memory product appeared in 1988, but the technology did not take off until the late 1990s because it could not be reliably produced. The names NOR and NAND refer to the arrangement of the cells in the memory array. NOR-based flash benefits from very fast burst read times but slower writes, and because of its structure, programs stored in NOR memory can be executed in place without first being loaded into RAM; it is therefore better suited to direct program execution, such as firmware in embedded devices. NAND flash has a much larger storage capacity and can read and write large files relatively quickly, making it the preferred choice for storage.&lt;br /&gt;
Flash storage also offers a measure of stability at the end of its life. When a conventional HDD fails, its data is generally gone short of expensive professional recovery, whereas a flash drive that exhausts its erase cycles typically becomes read-only: a problem for a database, but serviceable for archives. For the same reason, flash is best applied to read-heavy workloads, such as a web server hosting rarely-updated stylesheets and scripts, rather than to data that changes constantly.&lt;br /&gt;
&lt;br /&gt;
HDDs use a block system in which the kernel specifies which blocks to read and write. When using a flash drive, those blocks are emulated and mapped to physical memory addresses. This is done through what is called a &amp;quot;Flash Translation Layer&amp;quot; (FTL).&lt;br /&gt;
&lt;br /&gt;
==Traditionally Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
Since the kernel simply asks for a block number, a conventional hard disk drive (HDD) file system is not optimized to work with flash memory. Conventional hard disks have different constraints from those of flash memory: their primary problem is to reduce seek time, while the primary problem when working with flash memory is to erase in a minimal and balanced way. &lt;br /&gt;
&lt;br /&gt;
The most time-consuming process for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system therefore optimizes the way it stores data by placing related blocks close together on the disk, minimizing mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically. This is also why defragmentation, a procedure HDDs use to put files into more convenient configurations and thus minimize seek times, loses its purpose in a flash memory context. Indeed, the unnecessary erasures it entails are both inefficient and harmful to a flash memory unit. &lt;br /&gt;
&lt;br /&gt;
This follows directly from flash memory&#039;s aforementioned constraints: slow block-sized erasures and a limited number of erase cycles. Because of these, a flash-optimized file system needs to minimize its erase operations and to spread out its erasures so as to avoid the formation of hot spots: sections of memory that have undergone a disproportionately high number of erasures and are thus in danger of burning out. This process of spreading out data is referred to as &amp;quot;wear leveling&amp;quot;. To minimize hot spots, a system using flash memory must write new data to empty memory blocks. This approach also calls for some form of garbage collection to conduct the necessary erase operations while the system is idle, which makes sense given how slow erasures are on flash. There is, of course, no such feature in a traditional HDD file system. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Flash Optimized File Systems==&lt;br /&gt;
&lt;br /&gt;
The process of &amp;quot;wear leveling&amp;quot; ensures that the drive does not keep erasing and writing the same block over and over. This is achieved by writing data that doesn&#039;t change often to blocks that have already been erased frequently.  Wear leveling tries to make all blocks use up their write cycles at an even pace, increasing the overall life of the drive. This is achieved through a log-based scheme in the Flash Translation Layer (FTL). Essentially, the drive stores a log that keeps track of how many times each erase sector has been invalidated (or erased). The translation layer maintains a translation table in which each physical memory address is associated with an emulated block sector. This allows a traditional file system that uses block sectors to be used on the flash drive. Each block has a flag that keeps track of its state. When a block is about to be written, the FTL marks the blocks needed as &#039;&#039;allocated&#039;&#039;, preventing other data from being written to them. The FTL then writes the data into the allocated blocks. Once the transaction completes, the system updates the allocated blocks to &#039;&#039;pre-valid&#039;&#039;. Once that is done, the drive marks the superseded blocks as &#039;&#039;invalid&#039;&#039;, while marking the newly written blocks as &#039;&#039;valid&#039;&#039;. This entire flagging process ensures that the newly allocated blocks are never mixed up with the invalidated blocks. &lt;br /&gt;
&lt;br /&gt;
===Banks===&lt;br /&gt;
The FTL organizes data using structures called banks. When the FTL gets a request to write something to memory, it uses a bank list to determine which area of the drive should be used. A bank is essentially a group of sequential addresses that keeps track, via timestamps, of when it was last updated. The FTL writes only to the current bank; once there is not enough space left, it switches the current bank out for the one with the most available space. When a bank is to be cleaned, the system moves it from the bank list into what is called the cleaning bank list, eliminating any chance of data being written to the bank while it is being erased.&lt;br /&gt;
&lt;br /&gt;
===Cleaner===	&lt;br /&gt;
When the FTL finds that there is not enough room to write new data to the drive, it runs a garbage collection routine. This routine selects a segment to be cleaned, copies all of its valid data into a new segment, then erases everything in the old segment. This frees the otherwise useless invalidated blocks, and by not erasing every block the moment it becomes invalidated, it reduces the number of times the expensive erase operation is called. The kernel can also preemptively clean the drive while the system is idle.&lt;br /&gt;
&lt;br /&gt;
=Conclusion=&lt;br /&gt;
&lt;br /&gt;
In this way, thanks to its sequential, append-only writes and its deferred, batched erasures, the log-based file system is far better suited to working with flash memory than a traditional HDD file system. The latter is unfit for the task because it places primacy on minimizing seeks rather than on minimizing and managing erasures. Dealing intelligently with erasures is critical for a flash memory file system, since all of this memory type&#039;s particular weaknesses, the limited number of erase cycles, the necessity of erasing a whole block at a time and the relative slowness of erasure itself, have to do with erasing. A good flash file system must be built to work around these weaknesses, and this is precisely why older disk-based file systems are unsuitable for flash memory while log-based file systems thrive on it.&lt;br /&gt;
&lt;br /&gt;
=Questions=&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
[1]  Kim, Han-joon; Lee, Sang-goo. &#039;&#039;A New Flash Memory Management for Flash Storage System&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. Dept. of Comput. Sci., Seoul Nat. Univ., 06 Aug 2002. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[2] Smith, Lance. &#039;&#039;NAND Flash Solid State Storage Performance and Capability&#039;&#039;. &#039;&#039;Flash Memory Summit&#039;&#039;. SNIA Education Committee, 18 Aug 2009. &amp;lt;http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[3] Chang, LiPin. &#039;&#039;On Efficient Wear Leveling for Large-Scale Flash-Memory Storage Systems&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. Dept. of Comput. Sci.,Nat. ChiaoTung Univ., 15 Mar 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1244248&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[4] Nath, Suman; Gibbons, Phillip. &#039;&#039;Online maintenance of very large random samples on flash storage&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. The VLDB Journal, 27 Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1731355&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[5] Lim, Seung-Ho; Park, Kyu-Ho. &#039;&#039;An Efficient NAND Flash File System for Flash Memory Storage&#039;&#039;. &#039;&#039;CORE Laboratory&#039;&#039;. IEEE Transactions On Computers, Jul 2006. &amp;lt;http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[6] &#039;&#039;NAND vs. NOR Flash Memory Technology Overview&#039;&#039;. &#039;&#039;RMG and Associates&#039;&#039;. Toshiba America, accessed 14 Oct 2010. &amp;lt;http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[7] Bez, Roberto; Camerlenghi, Emilio; Modelli, Alberto; Visconti, Angelo. &#039;&#039;Introduction to Flash Memory&#039;&#039;. &#039;&#039;IEEExplore&#039;&#039;. STMicroelectronics, 21 May 2003. &amp;lt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[8] Kawaguchi, Atsuo; Nishioka, Shingo; Motoda, Hiroshi. &#039;&#039;A Flash-Memory Based File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Advanced Research Laboratory, Hitachi, Ltd., 1995. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[9] Rosenblum, Mendel; Ousterhout, John. &#039;&#039;The Design and Implementation of a Log-structured File System&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. University of California at Berkeley, Feb 1992. &amp;lt;http://portal.acm.org/citation.cfm?id=146943&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&amp;amp;ret=1#Fulltext&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[10] Shimpi, Anand. &#039;&#039;Intel X25-M SSD: Intel Delivers One of the World&#039;s Fastest Drives&#039;&#039;. &#039;&#039;AnandTech&#039;&#039;. AnandTech, 8 Sep 2008. &amp;lt;http://www.anandtech.com/show/2614&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[11] Shimpi, Anand. &#039;&#039;The SSD Relapse: Understanding and Choosing the Best SSD&#039;&#039;. &#039;&#039;AnandTech&#039;&#039;. AnandTech, 30 Aug 2009. &amp;lt;http://www.anandtech.com/show/2829&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[12] Shimpi, Anand. &#039;&#039;The SSD Anthology: Understanding SSDs and New Drives from OCZ&#039;&#039;. &#039;&#039;AnandTech&#039;&#039;. AnandTech, 18 Mar 2009. &amp;lt;http://www.anandtech.com/show/2738&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[13] Corbet, Jonathan. &#039;&#039;Solid-State Storage Devices and the Block Layer&#039;&#039;. &#039;&#039;Linux Weekly News&#039;&#039;. Linux Weekly News, 4 Oct 2010. &amp;lt;http://lwn.net/Articles/408428/&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[14] Woodhouse, David. &#039;&#039;JFFS : The Journalling Flash File System&#039;&#039;. &#039;&#039;CiteSeerX&#039;&#039;. Red Hat, Inc, Accessed 14 Oct 2010. &amp;lt;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[15] Agrawal, Nitin; Prabhakaran, Vijayan; Wobber, Ted; Davis, John; Manasse, Mark; Panigrahy, Rina. &#039;&#039;Design Tradeoffs for SSD Performance&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;, USENIX 2008 Annual Technical Conference, 2008. &amp;lt;http://portal.acm.org/citation.cfm?id=1404014.1404019&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[16] Lee, Sang-Won, et al. &#039;&#039;A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation&#039;&#039;. &#039;&#039;Association for Computing Machinery (ACM)&#039;&#039;. ACM Transactions on Embedded Computing Systems (TECS), Jul 2007. &amp;lt;http://portal.acm.org/citation.cfm?id=1275990&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[17] &#039;&#039;Reach New Heights in Computing Performance&#039;&#039;. &#039;&#039;Micron Technology Inc&#039;&#039;. Micron Technology Inc, Accessed 14 Oct 2010. &amp;lt;http://www.micron.com/products/solid_state_storage/client_ssd.html&amp;gt; &lt;br /&gt;
&lt;br /&gt;
[18] &#039;&#039;Flash Memories.&#039;&#039; 1 ed. New York: Springer, 1999. Print.&lt;br /&gt;
&lt;br /&gt;
[19] &#039;&#039;Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 2008. Print.&lt;br /&gt;
&lt;br /&gt;
[20] &#039;&#039;Nonvolatile Semiconductor Memory Technology: A Comprehensive Guide to Understanding and Using NVSM Devices&#039;&#039;. &#039;&#039;IEEE Press Series on Microelectronic Systems&#039;&#039;. New York: Wiley-IEEE Press, 1997. Print.&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
&lt;br /&gt;
Relevant Wikipedia articles: [http://en.wikipedia.org/wiki/Flash_Memory Flash Memory], [http://en.wikipedia.org/wiki/LogFS LogFS], [http://en.wikipedia.org/wiki/Hard_disk Hard Disk Drives], [http://en.wikipedia.org/wiki/Wear_leveling Wear Leveling], [http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29 Hot Spots], [http://en.wikipedia.org/wiki/Solid-state_drive Solid State Drive].&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=4148</id>
		<title>Talk:COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=4148"/>
		<updated>2010-10-14T22:49:21Z</updated>

		<summary type="html">&lt;p&gt;Pcox: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hey all,&lt;br /&gt;
&lt;br /&gt;
I think we should write down our emails here so we can further discuss stuff without having to login here.&lt;br /&gt;
(&#039;&#039;&#039;***Note that discussions over email can&#039;t be counted towards your participation grade!***&#039;&#039;&#039;--[[User:Soma|Anil]])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Geoff Smith (gsmith0413@gmail.com) - gsmith6&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Andrew Bujáki (abujaki [at] Connect or Live.ca)&lt;br /&gt;
***I&#039;m usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you&#039;re adding me. :)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&lt;br /&gt;
which briefly touches on the issues of flash memory: specifically, the inability to update in place and the limited number of write/erase cycles.&lt;br /&gt;
&lt;br /&gt;
The inability to update in place refers to the way the flash disk is programmed: instead of bit-by-bit, it is programmed block-by-block. A block has to be erased and completely reprogrammed in order to flip one bit after it&#039;s been set.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure&lt;br /&gt;
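As a toy model of that constraint (names and sizes invented): programming can only clear bits from 1 to 0, so setting a bit back to 1 means erasing and reprogramming the whole block.&lt;br /&gt;

```python
def erase(block):
    # an erase resets every bit in the block to 1
    return [1] * len(block)

def program(block, pattern):
    # programming can only clear bits (1 to 0), never set them
    return [b * p for b, p in zip(block, pattern)]

def rewrite_bit(block, index, value):
    if value == 0:
        # clearing a bit is a plain program operation
        out = list(block)
        out[index] = 0
        return out
    # setting a bit back to 1: erase the whole block, then reprogram it
    desired = list(block)
    desired[index] = 1
    return program(erase(block), desired)
```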
&lt;br /&gt;
Limited write/erase: flash memory typically has a short lifespan if it&#039;s used heavily. Writing and erasing the memory (changing, updating, etc.) will wear it out. Flash memory has a finite number of writes (varying by manufacturer, model, etc.), and once they&#039;ve been used up, you&#039;ll get bad sectors, corrupt data, and generally be SOL.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Filesystems would have to be changed to play nicely with these constraints: they must use blocks efficiently and minimize writing/erasing as much as possible.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I found a paper that talks about the performance, capabilities and limitations of NAND flash storage. &lt;br /&gt;
&lt;br /&gt;
Abstract: &amp;quot;This presentation provides an in-depth examination of the&lt;br /&gt;
fundamental theoretical performance, capabilities, and&lt;br /&gt;
limitations of NAND Flash-based Solid State Storage (SSS). The&lt;br /&gt;
tutorial will explore the raw performance capabilities of NAND&lt;br /&gt;
Flash, and limitations to performance imposed by mitigation of&lt;br /&gt;
reliability issues, interfaces, protocols, and technology types.&lt;br /&gt;
Best practices for system integration of SSS will be discussed.&lt;br /&gt;
Performance achievements will be reviewed for various&lt;br /&gt;
products and applications. &amp;quot;&lt;br /&gt;
&lt;br /&gt;
Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&lt;br /&gt;
&lt;br /&gt;
There&#039;s no starting place like Wikipedia, even if you shouldn&#039;t source it. &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_Memory &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/LogFS &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hard_disk &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Wear_leveling &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29&lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Solid-state_drive&lt;br /&gt;
&lt;br /&gt;
Hey Guys,&lt;br /&gt;
&lt;br /&gt;
We really don&#039;t have much time to get this done. Lets meet tomorrow after class and get our bearings to do this properly.&lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A few of us have Networking immediately after class. I know personally I won&#039;t be able to make anything set on Tuesday.&lt;br /&gt;
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others. &lt;br /&gt;
As well, for bibliographical citing, http://bibme.org is a wonderful resource for the popular formats (I.e. MLA). If it should come down to that.&lt;br /&gt;
~Andrew&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===links===&lt;br /&gt;
&lt;br /&gt;
Start Posting some stuff to source from:&lt;br /&gt;
&lt;br /&gt;
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&lt;br /&gt;
--&amp;quot;Introduction to flash memory&amp;quot;&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1244248&lt;br /&gt;
--&amp;quot;Wear Leveling&amp;quot; (it&#039;s about a proposed way of doing it, but explains a whole bunch of other things to do that)&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1731355&lt;br /&gt;
--&amp;quot;Online maintenance of very large random samples on flash storage&amp;quot; (ie dealing with the constraints of Flash Storage in a system that might actually be written to 100000 times)&lt;br /&gt;
&lt;br /&gt;
http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&lt;br /&gt;
--&amp;quot;An Efficient NAND Flash File System for Flash Memory Storage&amp;quot; discusses shortcomings of using hard-disk-based file systems and current flash-based file systems&lt;br /&gt;
&lt;br /&gt;
http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&lt;br /&gt;
--&amp;quot;NAND vs NOR Flash Memory&amp;quot; (note: i didn&#039;t get this off of Google scholar but it seems to be written by someone from Toshiba. is that ok?)&lt;br /&gt;
&lt;br /&gt;
Hi everybody,&lt;br /&gt;
&lt;br /&gt;
So here are the latest news. Geoff, Andrew and myself had a meeting after class today and came up with a plan for writing this thing. &lt;br /&gt;
&lt;br /&gt;
We decided to have 3 parts:&lt;br /&gt;
&lt;br /&gt;
1. What flash storage is, why its good but also why it must have the problems that it does (the assumption is that it must have them, why would it otherwise?)&lt;br /&gt;
[don&#039;t know much about this just now... basics include that there is NOR (reads slightly faster) and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash, with NAND being especially popular for storage (what&#039;s NOR good for?). Here, we&#039;d ideally want to talk about why flash was invented (proposed as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we&#039;d want to mention why this technical functionality was innovative and useful but also why it came with two serious setbacks: having a limited number of re-write cycles and needing to erase a block at a time.]&lt;br /&gt;
&lt;br /&gt;
Either way, flash storage affords far faster fetch times than the traditional platter-based HDD, and a kind of stability of information: since the data is reprogrammed rather than merely stored, it is more secure and less likely to be erased easily. On that note, in order to flip a single bit, the entire block needs to be erased and then reprogrammed. In an &#039;old&#039; HDD, when the drive fails at the end of its life cycle, your data is gone (unless you&#039;re willing to shell out $200/hr to have it recovered; yes, I&#039;ve seen companies in Ottawa that do this). When a flash drive reaches the end of its life, it merely becomes read-only. A bugger for databases, but useful for technical notes and archives, let&#039;s say.&lt;br /&gt;
With today&#039;s modern gaming computers, flash memory is good for quick load times; however, with limited read/write cycles, it is better suited to things that are not updated as frequently. I.e... well, I don&#039;t have a better example than a webserver hosting a company&#039;s CSS and scripts.&lt;br /&gt;
~Source: Years in the &#039;biz &lt;br /&gt;
&lt;br /&gt;
Flash memory started out as a replacement for EPROMs. At the time, EPROMs needed UV photoemission to be erased, while flash memory could be erased electronically. The first flash memory product came out in 1988, but it did not take off until the late 1990s because it could not be reliably produced. NOR and NAND memory are named after the arrangement of the cells in the memory array. NOR-based flash memory benefits from very fast burst read times but slower write times. Due to the structure of NOR memory, programs stored in NOR-based memory can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively fast. NAND is more suited for storage, while NOR memory is better suited for direct program execution, such as in CMOS chips.&lt;br /&gt;
source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1 , http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&lt;br /&gt;
&lt;br /&gt;
2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match&lt;br /&gt;
[the obvious answer seems to be that traditional file-systems could just write to whatever memory was available but if they did this with a flash file-systems, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seeking times which means that they want to organize their data in such a way as to reduce those (by putting related things together?) - with Flash, this isn&#039;t really a problem and thus one constraint the less to be concerned with.]&lt;br /&gt;
&lt;br /&gt;
3. How a log based file-system works and why this method of operation is so well suited to working with flash memory especially in light of the latter&#039;s inherent limitations&lt;br /&gt;
[...]&lt;br /&gt;
&lt;br /&gt;
At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here. &lt;br /&gt;
&lt;br /&gt;
Then, we will meet again on Thursday after class to decide how to actually write the essay.&lt;br /&gt;
&lt;br /&gt;
PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing. &lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
-- I&#039;m not that great (but absolutely horrid) at essays and I&#039;m alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday so I can probably spend a bunch of time working on it just before it&#039;s due. -- &#039;&#039;Nick L&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
-- Hay sorry I was unable to attend the meeting after class today. I am not too good at writing essays as well but I am pretty good at summarizing and researching. I am not too sure at what you would like me to do. Right now I&#039;ll assume you need me to research/summarizing articles for the 3 topics above. If you need me to do anything else post it here. I&#039;ll be checking the discussion regularly until this due. once again sorry for missing the meeting-- Paul Cox.&lt;br /&gt;
&lt;br /&gt;
-- Hey i&#039;m also supposed to be in on this. Sorry i couldn&#039;t contribute sooner because i was playing catchup in my other classes. Let me know what i can do and i&#039;ll be on it asap. - kirill (k.kashigin@gmail.com)&lt;br /&gt;
update: i&#039;m gonna be helping Fedor with #2&lt;br /&gt;
&lt;br /&gt;
PS, this article http://docs.google.com/viewer?a=v&amp;amp;q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&amp;amp;hl=en&amp;amp;gl=ca&amp;amp;pid=bl&amp;amp;srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&amp;amp;sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.&lt;br /&gt;
&lt;br /&gt;
---same article as above but shorter link: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.5142&lt;br /&gt;
&lt;br /&gt;
PPS, and this article looks really great for understanding how log based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&amp;amp;key2=3656986821&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Hey Luc (TA) here, Anandtech ran a series of articles on solid state drives that you guys might find useful.  It mostly looked at hardware aspects but it gives some interesting insights on how to modify file systems to better support flash memory.&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/storage/showdoc.aspx?i=3531&amp;amp;p=1&lt;br /&gt;
&lt;br /&gt;
http://anandtech.com/storage/showdoc.aspx?i=3631&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
Hey Paul&amp;amp;Kirill,&lt;br /&gt;
&lt;br /&gt;
If one of you guys could help me out with #2, that would be really great. I was going to work on that tomorrow, but I also have another large assignment to deal with and not having to do this research would greatly ease my life. Moreover, I do intend to work on writing&amp;amp;polishing the essay on Thursday, as I have a lot of experience with that and enjoy it far more than research. Let me know if either one of you can help me with this. &lt;br /&gt;
&lt;br /&gt;
The other person could probably read over what Luc posted for us and see if it fits into our framework. Just be sure to state who is going to do what. &lt;br /&gt;
&lt;br /&gt;
Nick, &lt;br /&gt;
&lt;br /&gt;
Honestly, we really hope to have the research done by Thursday. If that is the only day that you are free and you&#039;re not a writer, I&#039;m honestly not sure what you could do. Perhaps someone else can think of something.&lt;br /&gt;
&lt;br /&gt;
- Fedor&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I&#039;m gonna have something for #2 up tonight. -kirill&lt;br /&gt;
&lt;br /&gt;
So I found this article on Reddit, posted from Linux Weekly News on pretty much exactly what we are looking at. It&#039;s entitled &amp;quot;Solid-state storage devices and the block layer&amp;quot;&lt;br /&gt;
&lt;br /&gt;
http://lwn.net/SubscriberLink/408428/68fa8465da45967a/    --[[User:Gsmith6|Gsmith6]] 20:36, 13 October 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
I wasn&#039;t exactly sure how much information i was supposed to present but here&#039;s what i got for #2:&lt;br /&gt;
&lt;br /&gt;
	Most conventional file systems are designed to be implemented on hard disk drives. This fact does not mean they cannot be implemented on a solid state drive (file storage that uses flash memory instead of magnetic discs). It would, however, in many ways defeat the purpose of using flash memory. The most time-consuming operation for an HDD is seeking data by relocating the read head and spinning the magnetic disk. A traditional file system optimizes the way it stores data by placing related blocks close by on the disk to minimize mechanical movement within the HDD. One of the great advantages of flash memory, which accounts for its fast read speed, is that there is no need to seek data physically, so there is no need to waste resources laying out the data in close proximity.&lt;br /&gt;
	A traditional HDD file system will also attempt to defragment itself, moving blocks of data around for closer proximity on the magnetic disk. This process, although beneficial for HDDs, is harmful and inefficient for flash-based storage. A flash-optimal file system needs to reduce the number of erase operations, since flash memory only has a limited number of erase cycles as well as very slow erase speeds.&lt;br /&gt;
	When an HDD rewrites data to a physical location, there is no need for it to erase the previously occupying data first, so a traditional disk-based file system doesn&#039;t worry about erasing data from unused memory blocks. In contrast, flash memory needs to first erase a data block before it can modify any of its contents. Since the erase procedure is extremely slow, it&#039;s not practical to overwrite old data every time. It is also detrimental to the life span of flash memory.&lt;br /&gt;
	To maximize the potential of flash-based memory, the file system has to write new data to empty memory blocks. This method also calls for some sort of garbage collection to erase unused blocks when the system is idle, which is not implemented in conventional file systems since it is not needed.&lt;br /&gt;
&lt;br /&gt;
--kirill&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So Fedor and I were talking in the labs, and we came to the conclusion that we have been focusing on just the translation from a regular file system to a flash drive. We were under the impression that this was in fact the &amp;quot;Flash Optimized System&amp;quot;, but pulling up some more articles, I&#039;m finding that this is not necessarily the case. &lt;br /&gt;
&lt;br /&gt;
This paper here http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6156&amp;amp;rep=rep1&amp;amp;type=pdf.&lt;br /&gt;
shows an example from Axis Communications where they developed a file system specifically designed to be used on flash drives. &lt;br /&gt;
&lt;br /&gt;
Now I haven&#039;t completely read it, so it might just be an optimized translational system, but its at least a start.&lt;br /&gt;
&lt;br /&gt;
At our meeting today, we&#039;ve decided that it would be best if people could post a rough summary of their notes in the appropriate sections, and I will rewrite them into an essay, which Fedor will go through later tonight to edit and add some more information.&lt;br /&gt;
&lt;br /&gt;
Paul: I missed your comment that you weren&#039;t that great at writing, and good at research. If you want some articles behind the pay-walls, I&#039;ve saved a bunch of them and emailed them to myself. Just email me (at the address at the top of the page) and I&#039;ll be more than happy to send some your way.&lt;br /&gt;
&lt;br /&gt;
PS. some more references&lt;br /&gt;
&lt;br /&gt;
Design tradeoffs for SSD performance http://portal.acm.org/citation.cfm?id=1404014.1404019 &lt;br /&gt;
A log buffer-based flash translation layer using fully-associative sector translation http://delivery.acm.org/10.1145/1280000/1275990/a18-lee.pdf?key1=1275990&amp;amp;key2=0709607821&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=105787273&amp;amp;CFTOKEN=74601780&lt;br /&gt;
&lt;br /&gt;
--[[User:Gsmith6|Gsmith6]] 15:03, 14 October 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
I don&#039;t have any notes on this computer. &amp;gt;: I will be adding more to my section later on tonight. Sorry. ~Andrew&lt;br /&gt;
&lt;br /&gt;
Hello dudes,&lt;br /&gt;
&lt;br /&gt;
Just a quick note, try to include citations in your paragraphs - each time that you make a claim which came from evidence, put a little number [X, pp. page-number (if applicable)] into your text. Then, put the same [X] at the bottom of the page with the bibliographical information about the source. The prof hasn&#039;t yet gotten back to me about his preferred citation format, so just stick with this one for now:&lt;br /&gt;
&lt;br /&gt;
Authors. &#039;&#039;Title&#039;&#039;. Web-page. Date of article. Web (the word). Date you accessed it.&lt;br /&gt;
&lt;br /&gt;
Here&#039;s an example:&lt;br /&gt;
&lt;br /&gt;
[1] Kawaguchi, Nishioka, Tamoda. &#039;&#039;A Flash Memory Based File System&#039;&#039;.&lt;br /&gt;
http://docs.google.com/viewer?a=v&amp;amp;q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&amp;amp;hl=en&amp;amp;gl=ca&amp;amp;pid=bl&amp;amp;srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&amp;amp;sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw . 1995. Web. Oct. 14, 2010.&lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
PS, its a good idea to check this fairly frequently between now and tomorrow morning - you never know when something will come up.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Phew... for a while there I was starting to think that I had nothing about the actual &amp;quot;Log-Based System&amp;quot;, but it turns out that the &amp;quot;Translation Layer&amp;quot; is the same thing. It looks like some articles are calling it the log system, while others are calling it the translation layer. Pretty sure I&#039;m going to have an expert&#039;s knowledge about flash drives after reading all these articles :P&lt;br /&gt;
--[[User:Gsmith6|Gsmith6]] 18:13, 14 October 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Hey Geoff, this is what I&#039;ve got so far after reading a couple of the PDFs. The double-indented points are just my annotations on how they relate to the question.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Wear leveling: p1126-chang.pdf&lt;br /&gt;
&lt;br /&gt;
* Uneven wearing of flash memory is caused by storing data close together&lt;br /&gt;
* Garbage collection prefers that no block have pages whose data is constantly becoming invalid&lt;br /&gt;
* Data that remains the same for long periods of time should be moved from blocks that have not been written to much into blocks that have been erased frequently.&lt;br /&gt;
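A hedged sketch of that last point, under a toy block model (the fields here are made up): move cold data onto the most-worn blocks so the least-worn blocks are freed up for hot, frequently updated data.&lt;br /&gt;

```python
def rebalance(blocks):
    # blocks: dicts with "id", "erase_count", and "cold" (data rarely changes)
    cold = [b for b in blocks if b["cold"]]
    hot = [b for b in blocks if not b["cold"]]
    if not cold or not hot:
        return None
    least_worn_cold = min(cold, key=lambda b: b["erase_count"])
    most_worn_hot = max(hot, key=lambda b: b["erase_count"])
    # swap contents: cold data now sits on the worn block, freeing the
    # fresh block to absorb future hot writes
    least_worn_cold["cold"], most_worn_hot["cold"] = False, True
    return (least_worn_cold["id"], most_worn_hot["id"])
```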
&lt;br /&gt;
&lt;br /&gt;
Log file structure: 926-rosenblum.pdf&lt;br /&gt;
&lt;br /&gt;
* LFS is based on the assumption that frequently read files will be stored in cache and that hard disk traffic will be dominated by writes&lt;br /&gt;
* Writes all new info to disk in a sequential structure called a log&lt;br /&gt;
* Data is stored permanently in these logs no other data is stored on the hard drive&lt;br /&gt;
* Converts many small random synchronous writes to a large asynchronous sequential write&lt;br /&gt;
** Good for flash because it cuts down on writing (prolongs drive life)&lt;br /&gt;
** It is also good because it writes to a larger unit than a page. This means it can fill a block at a time, so it doesn’t fill up other blocks with random writes that would later need to be cleaned. Cuts down on cleaning.&lt;br /&gt;
* The inode is stored in the log on the disk, while an inode map maintained in memory points to the inode on the disk.&lt;br /&gt;
** This is good for flash drives because reading does not hurt the drives life and it is fast.&lt;br /&gt;
** This means the map will not have to be updated on the disk as frequently cutting down on the writes.&lt;br /&gt;
* The log system&#039;s weakness is that it is susceptible to becoming fragmented due to the larger writes. &lt;br /&gt;
** Since flash drives do not require defragmentation, this is fine. Also, since flash drives have very fast random access, the system does not become bogged down when the logs are fragmented&lt;br /&gt;
* The log system implements a cleaning system that scans a segment and sees if there is live data in it. If there is a certain percentage of invalid data, according to the cleaning policy, the segment will be cleaned: all the live data will be copied out and the segment will be erased.&lt;br /&gt;
** This is a garbage collector but its built into the file system.&lt;br /&gt;
* Segments contain a number of blocks. Segments can contain logs or parts of logs.&lt;br /&gt;
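The inode-map idea in these notes can be sketched as a toy append-only log (the class and method names are invented, not from any real LFS): every update is appended to the tail of the log, and an in-memory map points each file at its newest record, so the old copy simply becomes invalid.&lt;br /&gt;

```python
class LogFS:
    def __init__(self):
        self.log = []          # sequential log of (file, data) records
        self.inode_map = {}    # file name mapped to index of its live record

    def write(self, name, data):
        # all writes go to the tail of the log, never in place
        self.log.append((name, data))
        self.inode_map[name] = len(self.log) - 1  # old entry now invalid

    def read(self, name):
        # reads go through the in-memory map, so no disk write is needed
        return self.log[self.inode_map[name]][1]

    def live_entries(self):
        # everything the map does not point at is garbage for the cleaner
        return set(self.inode_map.values())
```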
&lt;br /&gt;
&lt;br /&gt;
I&#039;m still sifting through the other PDFs you sent me. Would you like me to post more info or should I format this into a paragraph or two?&lt;br /&gt;
&lt;br /&gt;
--Paul&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3217</id>
		<title>Talk:COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3217"/>
		<updated>2010-10-13T05:38:31Z</updated>

		<summary type="html">&lt;p&gt;Pcox: added some links and notes on NOR and NAND flash memory&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hey all,&lt;br /&gt;
&lt;br /&gt;
I think we should write down our emails here so we can further discuss stuff without having to login here.&lt;br /&gt;
(&#039;&#039;&#039;***Note that discussions over email can&#039;t be counted towards your participation grade!***&#039;&#039;&#039;--[[User:Soma|Anil]])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Geoff Smith (gsmith0413@gmail.com) - gsmith6&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Andrew Bujáki (abujaki [at] Connect or Live.ca)&lt;br /&gt;
***I&#039;m usually on MSN (Live) for collaboration at night; just make sure to include a little message about who you are when you&#039;re adding me. :)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I used Google Scholar and came to this page: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&lt;br /&gt;
It briefly touches on the issues of flash memory, specifically the inability to update in place and the limited number of write/erase cycles.&lt;br /&gt;
&lt;br /&gt;
The inability to update in place refers to the way flash is programmed: instead of bit by bit, it is programmed block by block. A block has to be erased and completely reprogrammed in order to flip even one bit after it&#039;s been set.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure&lt;br /&gt;
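As a concrete illustration of that cost, here is a toy model (hypothetical, not real flash driver code; the tiny block size and function name are invented for the example) of what changing one bit involves. Programming can only clear bits, so setting a bit back requires reading, erasing, and reprogramming the whole block:

```python
# Toy model of one bit-update in a flash erase block.
# Flash erase sets every bit to 1; programming can only change 1 to 0.

def flip_bit_in_place(block, index, value):
    """Update one bit in a flash erase block (block is a list of 0/1)."""
    if value == 0:
        # clearing a bit (1 to 0) can be programmed directly
        block[index] = 0
        return block, "programmed in place"
    # setting a bit (0 to 1) forces a full read / erase / reprogram cycle
    saved = list(block)           # 1. read the whole block out
    erased = [1] * len(block)     # 2. erase: every bit goes back to 1
    for i, bit in enumerate(saved):
        if i != index and bit == 0:
            erased[i] = 0         # 3. reprogram every other cleared bit
    return erased, "erased and reprogrammed whole block"
```

Scale the one-line "erase" up to a real block of many kilobytes and the cost of in-place updates becomes obvious.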
&lt;br /&gt;
Limited write/erase: flash memory typically has a short lifespan if it&#039;s used heavily. Writing and erasing the memory (changing, updating, etc.) will wear it out. Flash memory has a finite number of write/erase cycles (varying by manufacturer, model, etc.), and once they&#039;ve been used up you&#039;ll get bad sectors, corrupt data, and generally be SOL.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Filesystems would have to be changed to work within these constraints: they must use blocks efficiently and minimize writing and erasing as much as possible.&lt;br /&gt;
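One concrete way a flash-aware filesystem spreads writes out is wear leveling. Here is a minimal sketch under stated assumptions (a hypothetical allocator with a per-block erase counter; not any real filesystem's algorithm): always allocate the free block that has been erased the fewest times.

```python
# Minimal wear-leveling sketch: track an erase count per block and
# always hand out the least-worn free block, so no single block
# exhausts its write/erase budget far ahead of the others.

def pick_block(erase_counts, free_blocks):
    """Return the free block that has been erased the fewest times."""
    return min(free_blocks, key=lambda b: erase_counts[b])

def write(erase_counts, free_blocks, used_blocks):
    """Allocate a block for a write and charge it one erase cycle."""
    block = pick_block(erase_counts, free_blocks)
    free_blocks.remove(block)
    used_blocks.add(block)
    erase_counts[block] += 1  # each (re)use costs one erase cycle
    return block
```

Real controllers also migrate cold data off low-wear blocks, but even this greedy choice avoids the hotspot problem where one block is hammered while others sit idle.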
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I found a paper that talks about the performance, capabilities and limitations of NAND flash storage. &lt;br /&gt;
&lt;br /&gt;
Abstract: &amp;quot;This presentation provides an in-depth examination of the&lt;br /&gt;
fundamental theoretical performance, capabilities, and&lt;br /&gt;
limitations of NAND Flash-based Solid State Storage (SSS). The&lt;br /&gt;
tutorial will explore the raw performance capabilities of NAND&lt;br /&gt;
Flash, and limitations to performance imposed by mitigation of&lt;br /&gt;
reliability issues, interfaces, protocols, and technology types.&lt;br /&gt;
Best practices for system integration of SSS will be discussed.&lt;br /&gt;
Performance achievements will be reviewed for various&lt;br /&gt;
products and applications. &amp;quot;&lt;br /&gt;
&lt;br /&gt;
Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&lt;br /&gt;
&lt;br /&gt;
There&#039;s no starting place like Wikipedia, even if you shouldn&#039;t cite it. &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_Memory &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/LogFS &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hard_disk &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Wear_leveling &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29&lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Solid-state_drive&lt;br /&gt;
&lt;br /&gt;
Hey Guys,&lt;br /&gt;
&lt;br /&gt;
We really don&#039;t have much time to get this done. Let&#039;s meet tomorrow after class and get our bearings so we can do this properly.&lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A few of us have Networking immediately after class, so I know I personally won&#039;t be able to make anything set on Tuesday.&lt;br /&gt;
Additionally, he spoke briefly about hotspots on the disk for our question last week: places on the disk that get written to far more often than others. &lt;br /&gt;
Also, for bibliographic citations, http://bibme.org is a wonderful resource for the popular formats (e.g. MLA), if it should come down to that.&lt;br /&gt;
~Andrew&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===links===&lt;br /&gt;
&lt;br /&gt;
Start posting some stuff to source from:&lt;br /&gt;
&lt;br /&gt;
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&lt;br /&gt;
--&amp;quot;Introduction to flash memory&amp;quot;&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1244248&lt;br /&gt;
--&amp;quot;Wear Leveling&amp;quot; (it&#039;s about a proposed way of doing it, but explains a whole bunch of other things to do that)&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1731355&lt;br /&gt;
--&amp;quot;Online maintenance of very large random samples on flash storage&amp;quot; (i.e. dealing with the constraints of flash storage in a system that might actually be written to 100,000 times)&lt;br /&gt;
&lt;br /&gt;
http://vlsi.kaist.ac.kr/paper_list/2006_TC_CFFS.pdf&lt;br /&gt;
--&amp;quot;An Efficient NAND Flash File System for Flash Memory Storage&amp;quot;: discusses the shortcomings of using hard-disk-based file systems and of current flash-based file systems&lt;br /&gt;
&lt;br /&gt;
http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&lt;br /&gt;
--&amp;quot;NAND vs NOR Flash Memory&amp;quot; (note: I didn&#039;t get this off Google Scholar, but it seems to be written by someone from Toshiba. Is that OK?)&lt;br /&gt;
&lt;br /&gt;
Hi everybody,&lt;br /&gt;
&lt;br /&gt;
So here&#039;s the latest news: Geoff, Andrew, and I had a meeting after class today and came up with a plan for writing this thing. &lt;br /&gt;
&lt;br /&gt;
We decided to have 3 parts:&lt;br /&gt;
&lt;br /&gt;
1. What flash storage is, why it&#039;s good, and why it must have the problems that it does (the assumption being that it must have them; why else would it?)&lt;br /&gt;
[I don&#039;t know much about this just yet... the basics are that there is NOR flash (reads slightly faster) and NAND flash (holds more, writes faster, erases much faster, lasts about ten times longer), with NAND being especially popular for storage (what&#039;s NOR good for?). Here, we&#039;d ideally want to talk about why flash was invented (as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we&#039;d want to mention why this technical functionality was innovative and useful, but also why it came with two serious setbacks: a limited number of rewrite cycles and the need to erase a whole block at a time.]&lt;br /&gt;
&lt;br /&gt;
Either way, flash storage affords far faster access times than a traditional platter-based HDD, and, in a sense, more stable information: because the data is reprogrammed rather than magnetically stored, it is more secure and less likely to be erased easily. On that note, in order to flip a single bit, the entire block must be erased and then reprogrammed. With an &#039;old&#039; HDD, when the drive fails at the end of its life cycle, your data is gone (unless you&#039;re willing to shell out $200/hr to have it recovered; yes, I&#039;ve seen companies in Ottawa that do this). When a flash drive reaches the end of its life, it merely becomes read-only. A bugger for databases, but useful for technical notes and archives, say.&lt;br /&gt;
With today&#039;s gaming computers, flash memory is good for quick load times; however, with its limited write cycles, it is better suited to things that are not updated as frequently. I.e... well, I don&#039;t have a better example than a web server hosting a company&#039;s CSS and scripts.&lt;br /&gt;
~Source: years in the &#039;biz &lt;br /&gt;
&lt;br /&gt;
Flash memory started out as a replacement for EPROMs. At the time, EPROMs needed exposure to UV light to be erased, while flash memory could be erased electronically. The first flash memory product came out in 1988, but it did not take off until the late 1990s because it could not be reliably produced. NOR and NAND memory are named after the arrangement of the cells in the memory array. NOR-based flash memory benefits from very fast burst read times but slower write times. Due to the structure of NOR memory, programs stored in it can be executed without being loaded into RAM first. NAND flash memory has a very large storage capacity and can read and write large files relatively quickly. NAND is better suited for storage, while NOR is better suited for direct program execution, such as in firmware chips.&lt;br /&gt;
source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1 , http://maltiel-consulting.com/NAND_vs_NOR_Flash_Memory_Technology_Overview_Read_Write_Erase_speed_for_SLC_MLC_semiconductor_consulting_expert.pdf&lt;br /&gt;
&lt;br /&gt;
2. How a traditional disk-based file system works and why the limitations of flash storage make the two a poor match&lt;br /&gt;
[The obvious answer seems to be that traditional file systems could just write to whatever memory was available, but if they did this on flash, certain chunks of memory would wear out and become unusable before others, making the memory more difficult to work with. Also, disk-based file systems need to deal with seek times, which means they want to organize their data so as to reduce those (by putting related things together?); with flash, this isn&#039;t really a problem, and thus one less constraint to be concerned with.]&lt;br /&gt;
&lt;br /&gt;
3. How a log-based file system works and why this method of operation is so well suited to working with flash memory, especially in light of the latter&#039;s inherent limitations&lt;br /&gt;
[...]&lt;br /&gt;
&lt;br /&gt;
At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow, and I will work on #2 tomorrow. The three of us will make an effort to consult some of the more painfully technical literature in order to gain insight into our respective topics. Whatever insight we find will be posted here. &lt;br /&gt;
&lt;br /&gt;
Then, we will meet again on Thursday after class to decide how to actually write the essay.&lt;br /&gt;
&lt;br /&gt;
PS, if there is anybody in the group besides the three of us, let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research on one or more of the above topics would be a great way to contribute, especially if you could post it here before one of us goes over the same thing. &lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
-- I&#039;m not that great (though not absolutely horrid) at essays and I&#039;m alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday, so I can probably spend a bunch of time working on it just before it&#039;s due. -- &#039;&#039;Nick L&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
-- Hey, sorry I was unable to attend the meeting after class today. I am not too good at writing essays either, but I am pretty good at summarizing and researching. I am not too sure what you would like me to do; for now, I&#039;ll assume you need me to research and summarize articles for the 3 topics above. If you need me to do anything else, post it here. I&#039;ll be checking the discussion regularly until this is due. Once again, sorry for missing the meeting. -- Paul Cox&lt;br /&gt;
&lt;br /&gt;
-- Hey, I&#039;m also supposed to be in on this. Sorry I couldn&#039;t contribute sooner; I was playing catch-up in my other classes. Let me know what I can do and I&#039;ll be on it ASAP. - Kirill (k.kashigin@gmail.com)&lt;br /&gt;
Update: I&#039;m going to be helping Fedor with #2&lt;br /&gt;
&lt;br /&gt;
PS, this article http://docs.google.com/viewer?a=v&amp;amp;q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&amp;amp;hl=en&amp;amp;gl=ca&amp;amp;pid=bl&amp;amp;srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&amp;amp;sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.&lt;br /&gt;
&lt;br /&gt;
PPS, and this article looks really great for understanding how log based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&amp;amp;key2=3656986821&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Hey, Luc (TA) here. AnandTech ran a series of articles on solid-state drives that you guys might find useful. They mostly look at hardware aspects, but they give some interesting insights on how to modify file systems to better support flash memory.&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/storage/showdoc.aspx?i=3531&amp;amp;p=1&lt;br /&gt;
&lt;br /&gt;
http://anandtech.com/storage/showdoc.aspx?i=3631&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
Hey Paul &amp;amp; Kirill,&lt;br /&gt;
&lt;br /&gt;
If one of you guys could help me out with #2, that would be really great. I was going to work on it tomorrow, but I also have another large assignment to deal with, and not having to do this research would greatly ease my life. Moreover, I do intend to work on writing &amp;amp; polishing the essay on Thursday, as I have a lot of experience with that and prefer it far more than research. Let me know if either of you can help me with this. &lt;br /&gt;
&lt;br /&gt;
The other person could probably read over what Luc posted for us and see if it fits into our framework. Just be sure to state who is going to do what. &lt;br /&gt;
&lt;br /&gt;
Nick, &lt;br /&gt;
&lt;br /&gt;
Honestly, we really hope to have the research done by Thursday. If that is the only day that you are free and you&#039;re not a writer, I&#039;m honestly not sure what you could do. Perhaps someone else can think of something.&lt;br /&gt;
&lt;br /&gt;
- Fedor&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3124</id>
		<title>Talk:COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3124"/>
		<updated>2010-10-12T22:04:05Z</updated>

		<summary type="html">&lt;p&gt;Pcox: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hey all,&lt;br /&gt;
&lt;br /&gt;
I think we should write down our emails here so we can further discuss stuff without having to login here.&lt;br /&gt;
(&#039;&#039;&#039;***Note that discussions over email can&#039;t be counted towards your participation grade!***&#039;&#039;&#039;--[[User:Soma|Anil]])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Geoff Smith (gsmith0413@gmail.com) - gsmith6&lt;br /&gt;
Andrew Bujáki (abujaki [at] Connect or Live.ca)&lt;br /&gt;
***I&#039;m usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you&#039;re adding me. :)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&lt;br /&gt;
Which briefly touches on the issues of Flash memory. Specifically, inability to update in place, and limited write/erase cycles.&lt;br /&gt;
&lt;br /&gt;
Inability to update in place could refer to the way the flash disk is programmed, instead of bit-by-bit, it is programmed block-by-block. A block would have to be erased and completely reprogrammed in order to flip one bit after it&#039;s been set.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure&lt;br /&gt;
&lt;br /&gt;
Limited write/erase: Flash memory typically has a short lifespan if it&#039;s being used a lot. Writing and erasing the memory (Changing, updating, etc) Will wear it out. Flash memory has a finite amount of writes, (varying on manufacturer, models, etc), and once they&#039;ve been used up, you&#039;ll get bad sectors, corrupt data, and generally be SOL.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Filesystems would have to be changed to play nicely with these constraints, where it must use blocks efficiently and nicely, and minimize writing/erasing as much as possible.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I found a paper that talks about the performance, capabilities and limitations of NAND flash storage. &lt;br /&gt;
&lt;br /&gt;
Abstract: &amp;quot;This presentation provides an in-depth examination of the&lt;br /&gt;
fundamental theoretical performance, capabilities, and&lt;br /&gt;
limitations of NAND Flash-based Solid State Storage (SSS). The&lt;br /&gt;
tutorial will explore the raw performance capabilities of NAND&lt;br /&gt;
Flash, and limitations to performance imposed by mitigation of&lt;br /&gt;
reliability issues, interfaces, protocols, and technology types.&lt;br /&gt;
Best practices for system integration of SSS will be discussed.&lt;br /&gt;
Performance achievements will be reviewed for various&lt;br /&gt;
products and applications. &amp;quot;&lt;br /&gt;
&lt;br /&gt;
Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&lt;br /&gt;
&lt;br /&gt;
There&#039;s no Starting place like Wikipedia, even if you shouldn&#039;t source it. &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_Memory &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/LogFS &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hard_disk &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Wear_leveling &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29&lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Solid-state_drive&lt;br /&gt;
&lt;br /&gt;
Hey Guys,&lt;br /&gt;
&lt;br /&gt;
We really don&#039;t have much time to get this done. Lets meet tomorrow after class and get our bearings to do this properly.&lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A few of us have Networking immediately after class. I know personally I won&#039;t be able to make anything set on Tuesday.&lt;br /&gt;
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others. &lt;br /&gt;
As well, for bibliographical citing, http://bibme.org is a wonderful resource for the popular formats (I.e. MLA). If it should come down to that.&lt;br /&gt;
~Andrew&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===links===&lt;br /&gt;
&lt;br /&gt;
Start Posting some stuff to source from:&lt;br /&gt;
&lt;br /&gt;
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&lt;br /&gt;
--&amp;quot;Introduction to flash memory&amp;quot;&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1244248&lt;br /&gt;
--&amp;quot;Wear Leveling&amp;quot; (it&#039;s about a proposed way of doing it, but explains a whole bunch of other things to do that)&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1731355&lt;br /&gt;
--&amp;quot;Online maintenance of very large random samples on flash storage&amp;quot; (ie dealing with the constraints of Flash Storage in a system that might actually be written to 100000 times)&lt;br /&gt;
&lt;br /&gt;
Hi everybody,&lt;br /&gt;
&lt;br /&gt;
So here are the latest news. Geoff, Andrew and myself had a meeting after class today and came up with a plan for writing this thing. &lt;br /&gt;
&lt;br /&gt;
We decided to have 3 parts:&lt;br /&gt;
&lt;br /&gt;
1. What flash storage is, why its good but also why it must have the problems that it does (the assumption is that it must have them, why would it otherwise?)&lt;br /&gt;
[don&#039;t know much about this just now... basics include that there is NOR (reads slightly faster)and NAND (holds more, writes faster, erases much faster, lasts about ten times longer) flash with NAND being especially popular for storage (what&#039;s NOR good for?). Here, we&#039;d ideally want to talk about why flash was invented (supposed as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we&#039;d want to mention why this technical functionality was innovative and useful but also why it came with two serious set-backs: having a limited-number of re-write cycles and needing to erase a block at a time.]&lt;br /&gt;
&lt;br /&gt;
2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match&lt;br /&gt;
[the obvious answer seems to be that traditional file-systems could just write to whatever memory was available but if they did this with a flash file-systems, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seeking times which means that they want to organize their data in such a way as to reduce those (by putting related things together?) - with Flash, this isn&#039;t really a problem and thus one constraint the less to be concerned with.]&lt;br /&gt;
&lt;br /&gt;
3. How a log-based file system works and why this method of operation is so well suited to working with flash memory, especially in light of the latter&#039;s inherent limitations&lt;br /&gt;
[...]&lt;br /&gt;
&lt;br /&gt;
At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here. &lt;br /&gt;
&lt;br /&gt;
Then, we will meet again on Thursday after class to decide how to actually write the essay.&lt;br /&gt;
&lt;br /&gt;
PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing. &lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
-- I&#039;m not just not-great at essays (I&#039;m absolutely horrid), and I&#039;m alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday, so I can probably spend a bunch of time working on it just before it&#039;s due. -- &#039;&#039;Nick L&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
-- Hey, sorry I was unable to attend the meeting after class today. I am not too good at writing essays either, but I am pretty good at summarizing and researching. I am not too sure what you would like me to do; right now I&#039;ll assume you need me to research/summarize articles for the 3 topics above. If you need me to do anything else, post it here. I&#039;ll be checking the discussion regularly until this is due. Once again, sorry for missing the meeting. -- Paul Cox&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PS, this article http://docs.google.com/viewer?a=v&amp;amp;q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&amp;amp;hl=en&amp;amp;gl=ca&amp;amp;pid=bl&amp;amp;srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&amp;amp;sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.&lt;br /&gt;
&lt;br /&gt;
PPS, and this article looks really great for understanding how log-based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&amp;amp;key2=3656986821&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Hey, Luc (TA) here. Anandtech ran a series of articles on solid-state drives that you guys might find useful. It mostly looked at hardware aspects, but it gives some interesting insights on how to modify file systems to better support flash memory.&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/storage/showdoc.aspx?i=3531&amp;amp;p=1&lt;br /&gt;
&lt;br /&gt;
http://anandtech.com/storage/showdoc.aspx?i=3631&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
http://ieeexplore.ieee.org.proxy.library.carleton.ca/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1 --NOR flash: how it works, reliability, and a bit of history.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3115</id>
		<title>Talk:COMP 3000 Essay 1 2010 Question 10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Talk:COMP_3000_Essay_1_2010_Question_10&amp;diff=3115"/>
		<updated>2010-10-12T20:22:58Z</updated>

		<summary type="html">&lt;p&gt;Pcox: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hey all,&lt;br /&gt;
&lt;br /&gt;
I think we should write down our emails here so we can discuss stuff further without having to log in here.&lt;br /&gt;
(&#039;&#039;&#039;***Note that discussions over email can&#039;t be counted towards your participation grade!***&#039;&#039;&#039;--[[User:Soma|Anil]])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Geoff Smith (gsmith0413@gmail.com) - gsmith6&lt;br /&gt;
Andrew Bujáki (abujaki [at] Connect or Live.ca)&lt;br /&gt;
***I&#039;m usually on MSN(Live) for collaboration at nights, Just make sure to put in a little message about who you are when you&#039;re adding me. :)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I used Google Scholar and came to this page http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=812717&amp;amp;tag=1#&lt;br /&gt;
which briefly touches on the issues of flash memory - specifically, the inability to update in place and limited write/erase cycles.&lt;br /&gt;
&lt;br /&gt;
The inability to update in place refers to the way flash is erased: instead of bit by bit, it is erased block by block. A block would have to be erased and completely reprogrammed in order to flip one bit back after it&#039;s been set.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Block_erasure&lt;br /&gt;
&lt;br /&gt;
Limited write/erase: flash memory typically has a short lifespan if it&#039;s being used a lot. Writing and erasing the memory (changing, updating, etc.) will wear it out. Flash memory has a finite number of writes (varying by manufacturer, model, etc.), and once they&#039;ve been used up, you&#039;ll get bad blocks, corrupt data, and generally be SOL.&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
File systems would have to be changed to play nicely with these constraints: they must use blocks efficiently and minimize writing/erasing as much as possible.&lt;br /&gt;
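The two constraints above (erase-before-rewrite and a finite erase budget) can be illustrated with a toy model. This is just an illustrative sketch, not how any real flash controller works, and the page count and cycle limit are made-up numbers (real parts are rated for thousands to hundreds of thousands of cycles):

```python
# Toy model of a NAND-style flash block: a page can be written once
# after an erase, and each erase consumes one of a finite budget.
class FlashBlock:
    def __init__(self, pages=4, max_erases=3):  # made-up sizes for illustration
        self.pages = [None] * pages
        self.erases_left = max_erases

    def write(self, page, data):
        if self.pages[page] is not None:
            # Can't update in place: the whole block must be erased first.
            raise RuntimeError("page already written; erase the block first")
        self.pages[page] = data

    def erase(self):
        if self.erases_left == 0:
            raise RuntimeError("block worn out")  # this is a 'bad block'
        self.pages = [None] * len(self.pages)
        self.erases_left -= 1

block = FlashBlock()
block.write(0, "a")
# Changing page 0 means erasing all 4 pages, not just the one we care about,
# and it costs one of the block's limited erase cycles:
block.erase()
block.write(0, "b")
```

This is exactly why naive in-place updates are expensive on flash: every small change to an already-written page drags the whole block through an erase, burning down the erase budget.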
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I found a paper that talks about the performance, capabilities and limitations of NAND flash storage. &lt;br /&gt;
&lt;br /&gt;
Abstract: &amp;quot;This presentation provides an in-depth examination of the fundamental theoretical performance, capabilities, and limitations of NAND Flash-based Solid State Storage (SSS). The tutorial will explore the raw performance capabilities of NAND Flash, and limitations to performance imposed by mitigation of reliability issues, interfaces, protocols, and technology types. Best practices for system integration of SSS will be discussed. Performance achievements will be reviewed for various products and applications.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Link: http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2009/20090812_T1B_Smith.pdf&lt;br /&gt;
&lt;br /&gt;
There&#039;s no starting place like Wikipedia, even if you shouldn&#039;t cite it. &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Flash_Memory &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/LogFS &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hard_disk &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Wear_leveling &lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Hot_spot_%28computer_science%29&lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Solid-state_drive&lt;br /&gt;
&lt;br /&gt;
Hey Guys,&lt;br /&gt;
&lt;br /&gt;
We really don&#039;t have much time to get this done. Let&#039;s meet tomorrow after class and get our bearings so we can do this properly.&lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A few of us have Networking immediately after class, so I know I personally won&#039;t be able to make anything set on Tuesday.&lt;br /&gt;
Additionally, he spoke briefly about hotspots on the disk for our question last week, where places on the disk would be written to far more often than others. &lt;br /&gt;
As well, for bibliographic citations, http://bibme.org is a wonderful resource for the popular formats (e.g., MLA), if it should come down to that.&lt;br /&gt;
~Andrew&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===links===&lt;br /&gt;
&lt;br /&gt;
Start posting some stuff to source from:&lt;br /&gt;
&lt;br /&gt;
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1199079&amp;amp;tag=1&lt;br /&gt;
--&amp;quot;Introduction to flash memory&amp;quot;&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1244248&lt;br /&gt;
--&amp;quot;Wear Leveling&amp;quot; (it&#039;s about a proposed way of doing it, but explains a whole bunch of other approaches along the way)&lt;br /&gt;
&lt;br /&gt;
http://portal.acm.org/citation.cfm?id=1731355&lt;br /&gt;
--&amp;quot;Online maintenance of very large random samples on flash storage&amp;quot; (i.e., dealing with the constraints of flash storage in a system that might actually be written to 100,000 times)&lt;br /&gt;
&lt;br /&gt;
Hi everybody,&lt;br /&gt;
&lt;br /&gt;
So here is the latest news: Geoff, Andrew, and I had a meeting after class today and came up with a plan for writing this thing. &lt;br /&gt;
&lt;br /&gt;
We decided to have 3 parts:&lt;br /&gt;
&lt;br /&gt;
1. What flash storage is, why it&#039;s good, but also why it must have the problems that it does (the assumption being that the problems are inherent - otherwise, why would it have them?)&lt;br /&gt;
[don&#039;t know much about this just now... basics include that there is NOR flash (reads slightly faster) and NAND flash (holds more, writes faster, erases much faster, lasts about ten times longer), with NAND being especially popular for storage (what&#039;s NOR good for?). Here, we&#039;d ideally want to talk about why flash was invented (intended as an alternative to slow ROM), why it was suitable for that, and how it works on a technical level. Then, we&#039;d want to mention why this technical functionality was innovative and useful, but also why it came with two serious setbacks: having a limited number of re-write cycles and needing to erase a block at a time.]&lt;br /&gt;
&lt;br /&gt;
2. How a traditional disk-based file-system works and why the limitations of flash storage make the two a poor match&lt;br /&gt;
[the obvious answer seems to be that traditional file systems could just write to whatever memory was available, but if they did this with a flash file system, certain chunks of memory would become unusable before others and the memory would be more difficult to work with. Also, disk-based file systems need to deal with seek times, which means they want to organize their data in such a way as to reduce them (by putting related things together?) - with flash, this isn&#039;t really a problem, and thus one less constraint to be concerned with.]&lt;br /&gt;
&lt;br /&gt;
3. How a log-based file system works and why this method of operation is so well suited to working with flash memory, especially in light of the latter&#039;s inherent limitations&lt;br /&gt;
[...]&lt;br /&gt;
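The core idea behind point 3 can be sketched in a few lines. This is a minimal append-only log, not a real implementation (real log-structured file systems such as Rosenblum's LFS also handle on-disk indexing, checkpoints, and garbage collection), and the names here are made up for illustration:

```python
# Minimal append-only log: every update is written at the tail,
# so no existing record is ever rewritten in place - a good fit for
# flash, which can't update in place and wears out from erases.
class LogFS:
    def __init__(self):
        self.log = []    # sequence of (key, value) records, oldest first
        self.index = {}  # key -> position of that key's latest record

    def write(self, key, value):
        self.index[key] = len(self.log)
        self.log.append((key, value))  # always append, never overwrite

    def read(self, key):
        return self.log[self.index[key]][1]

fs = LogFS()
fs.write("file.txt", "v1")
fs.write("file.txt", "v2")  # old record stays in the log; new one is appended
print(fs.read("file.txt"))  # prints "v2"
```

Because writes only ever go to the tail, they naturally spread across the medium instead of hammering one hot spot, and stale records can be reclaimed later in whole-block units - which lines up nicely with flash's erase-a-block-at-a-time constraint.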
&lt;br /&gt;
At this time, the plan is that Geoff will work on #3 today, Andrew will work on #1 tomorrow and I will work on #2 tomorrow. The three of us will make an effort to consult some somewhat more painfully technical literature in order to gain insight into our respective queries. Whatever insight we find will be posted here. &lt;br /&gt;
&lt;br /&gt;
Then, we will meet again on Thursday after class to decide how to actually write the essay.&lt;br /&gt;
&lt;br /&gt;
PS, if there is anybody in the group besides the three of us - let us know so you can find a way to contribute to this... as at least two of us are competent essayists, painfully technical research on one or more of the above topics would be a great way to contribute... especially if you could post it here prior to one of us going over the same thing. &lt;br /&gt;
&lt;br /&gt;
Fedor&lt;br /&gt;
&lt;br /&gt;
-- I&#039;m not just not-great at essays (I&#039;m absolutely horrid), and I&#039;m alright at research, but if nothing else I have Thursday off and nothing (else) that needs doing by Friday, so I can probably spend a bunch of time working on it just before it&#039;s due. -- &#039;&#039;Nick L&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
-- Hey, sorry I was unable to attend the meeting after class today. I am not too good at writing essays either, but I am pretty good at summarizing and researching. I am not too sure what you would like me to do; right now I&#039;ll assume you need me to research/summarize articles for the 3 topics above. If you need me to do anything else, post it here. I&#039;ll be checking the discussion regularly until this is due. Once again, sorry for missing the meeting. -- Paul Cox&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PS, this article http://docs.google.com/viewer?a=v&amp;amp;q=cache:E7-H_pv_18wJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.92.2279%26rep%3Drep1%26type%3Dpdf+flash+memory+and+disk-based+file+systems&amp;amp;hl=en&amp;amp;gl=ca&amp;amp;pid=bl&amp;amp;srcid=ADGEESgspy-jqIdLOpaLYlPPoM56kjLPwXcL3_eMbTTBRkI7PG0jQKl9vIieTAYHubPu0EdQ0V4ccaf_p0S_SnqKMirSIM0Qoq5E0NpLd0M7LAGaE51wkD0F55cRSkX8dnTqx_9Yx2E7&amp;amp;sig=AHIEtbS-yfGI9Y48DJ0WyEEhmsXInelRGw looks really useful for part 3.&lt;br /&gt;
&lt;br /&gt;
PPS, and this article looks really great for understanding how log-based file systems work: http://delivery.acm.org/10.1145/150000/146943/p26-rosenblum.pdf?key1=146943&amp;amp;key2=3656986821&amp;amp;coll=GUIDE&amp;amp;dl=GUIDE&amp;amp;CFID=108397378&amp;amp;CFTOKEN=72657973&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Hey, Luc (TA) here. Anandtech ran a series of articles on solid-state drives that you guys might find useful. It mostly looked at hardware aspects, but it gives some interesting insights on how to modify file systems to better support flash memory.&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3403&lt;br /&gt;
&lt;br /&gt;
http://www.anandtech.com/storage/showdoc.aspx?i=3531&amp;amp;p=1&lt;br /&gt;
&lt;br /&gt;
http://anandtech.com/storage/showdoc.aspx?i=3631&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--[[User:3maisons|3maisons]] 19:44, 12 October 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Pcox</name></author>
	</entry>
</feed>