Soma-notes - User contributions [en]

COMP 3000 Essay 2 2010 Question 7

2010-12-03T04:31:19Z

Smcilroy: /* Accuracy */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and multithreaded designs have become increasingly common. Many popular applications today are multithreaded. However, concurrent programming brings with it a host of potential dangers, such as race conditions and deadlocks, that result from poor design and from threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. In real-world code, however, for a variety of reasons discussed below, programmers often implement their own "ad hoc" synchronizations that eschew these standards. Ad hoc synchronizations are rarely well documented and are not recognized by traditional race detection tools, which look only for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, their dangers, their impact on bug detection tools and their prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. These annotations can be used in conjunction with existing data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming in which multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly on multi-core processors and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution; this is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives act as barriers that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms, such as locks, mutexes, semaphores, and condition variables.
Locks control access to data and limit which threads have access at any given time. Examples include read/write locks, which allow any number of readers to hold the lock at once but give a writer exclusive access, and latches, which hold waiting threads back until a specified count of events has been reached.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource it needs. No other thread can access the resource while the lock is held. Once the thread is finished, it releases the lock, and another thread can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to proceed only when it is safe to perform its operation.
Synchronization primitives can be misused, however, and when they are misused or omitted the result is a host of problems such as race conditions and deadlocks.
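
As a rough illustration of how a mutex and a condition variable work together, the following Python sketch builds a one-slot mailbox. The names <code>Mailbox</code>, <code>put</code>, <code>get</code> and <code>demo</code> are hypothetical and chosen only for this example; the programs studied in the paper are written in C and C++, where the equivalent calls would come from a POSIX thread library.

```python
import threading


class Mailbox:
    """One-slot mailbox guarded by a condition variable (illustrative only)."""

    def __init__(self):
        # A Condition bundles a mutex with a queue of waiting threads.
        self._cond = threading.Condition()
        self._item = None
        self._full = False

    def put(self, item):
        with self._cond:              # acquire the mutex
            self._item = item
            self._full = True
            self._cond.notify()       # wake one waiting consumer, if any

    def get(self, timeout=5.0):
        with self._cond:
            # wait_for releases the mutex while blocked and re-checks the
            # predicate each time the thread is woken up.
            if not self._cond.wait_for(lambda: self._full, timeout=timeout):
                raise TimeoutError("no item arrived")
            self._full = False
            return self._item


def demo():
    box = Mailbox()
    producer = threading.Thread(target=box.put, args=("ready",))
    producer.start()
    value = box.get()
    producer.join()
    return value
```

Calling <code>demo()</code> hands a single value from the producer thread to the main thread; the consumer never polls, it simply blocks until it is notified.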

===Race conditions, deadlocks===
Race conditions are unintended side effects of concurrent programming. They occur when two or more threads or processes access a shared resource without proper synchronization and at least one of them writes to it. One thread may modify the shared data while others are reading it, so readers can see stale or inconsistent data. Because race conditions depend on the timing of execution and corrupt data in subtle ways, they are often very difficult to detect.

Deadlock occurs when two or more processes each hold a resource and wait for another process to release the resource it needs. The waiting forms a circular chain, and no process can continue.

Both issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented with mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always liable to be present in production code.
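
One commonly used method for dealing with both problems at once is to protect shared data with locks and to always acquire those locks in a fixed order. The sketch below is a minimal illustration of that idea, not something taken from the paper; the <code>Account</code> class, the <code>transfer</code> function and the choice of ordering by <code>ident</code> are assumptions made for the example.

```python
import threading


class Account:
    """Hypothetical shared resource with its own mutex."""

    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Always lock the account with the lower ident first. Two opposite
    # transfers then request the locks in the same order, so neither can
    # hold one lock while waiting forever for the other (no circular wait).
    first, second = (src, dst) if src.ident < dst.ident else (dst, src)
    with first.lock:
        with second.lock:
            # Both balances are updated while both locks are held, so no
            # other thread can observe or overwrite a half-finished update.
            src.balance -= amount
            dst.balance += amount


def demo(rounds=1000):
    a, b = Account(1, 100), Account(2, 100)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

Because each thread moves the same total amount in opposite directions, both balances end where they started, and the fixed lock order means the two threads cannot deadlock each other.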

===Ad Hoc Synchronization===
Ad hoc synchronizations are loops, called sync loops, that keep iterating until certain conditions are met via shared variables, called sync variables, that are updated by other threads. They are designed to control the flow of thread execution, much like locking and unlocking resources. A sync loop can have multiple sync variables, multiple exit conditions and multiple dependencies. The diversity of sync loops, their dependencies and their execution paths is what makes them difficult to find.
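
To make the idea concrete, here is a small Python sketch of a sync loop. It is only an illustration under assumed names (<code>Job</code>, <code>worker</code>, <code>wait_ad_hoc</code>); the paper studies C and C++ programs, and the spin limit is a safety valve added for this example rather than part of the pattern.

```python
import threading
import time


class Job:
    def __init__(self):
        self.finished = False   # sync variable: written by the worker, read by the waiter
        self.result = None


def worker(job):
    job.result = sum(range(100))
    job.finished = True         # the remote write that releases the sync loop


def wait_ad_hoc(job, max_spins=5000):
    spins = 0
    # Sync loop: it looks like an ordinary while loop, but its only purpose
    # is to hold this thread back until another thread flips job.finished.
    while not job.finished:
        spins += 1
        if spins >= max_spins:  # safety valve for this sketch only
            raise RuntimeError("worker never signalled")
        time.sleep(0.001)       # give the worker a chance to run
    return job.result


def demo():
    job = Job()
    t = threading.Thread(target=worker, args=(job,))
    t.start()
    value = wait_ad_hoc(job)
    t.join()
    return value
```

Nothing in <code>wait_ad_hoc</code> mentions a lock or a condition variable, which is why a tool that searches only for standard primitives would not see that the two threads are being ordered here.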

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other while avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found per program ranged from 6 to 83, with server programs towards the higher end of the range. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in a concurrent system, programmers use ad hoc synchronization to impose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. Programmers use ad hoc synchronizations to avoid these and supposedly protect performance.

2. It is often very hard to identify an ad hoc loop as a synchronization method. Such loops are hard to distinguish from ordinary computational loops, and because their implementations are diverse, it is hard to pinpoint them in the code. This makes the system hard to maintain, as other programmers will not recognize ad hoc loops written by someone else, and debugging tools cannot recognize them as synchronization.

3. Ad hoc synchronizations often introduce bugs into the system, such as deadlocks or hangs. As these differ from the bugs caused by locks and other standard synchronizations, it is hard for detection tools to recognize them unless the synchronizations are first identified, either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to detect issues caused by ad hoc synchronizations. In fact, these tools often either miss the issues or report false positives, because the tool is unaware of the ordering that the ad hoc synchronization "workaround" puts into effect. This severely limits the effectiveness of such tools.
This also affects performance analysis. Synchronization is quite costly, and if a profiling tool cannot recognize a form of synchronization, it generates a misleading report without the programmer being aware of it. This may lead the programmer to poor decisions simply because ad hoc synchronizations are hard to identify.

5. The reason ad hoc synchronizations are hard to identify stems from the fact that there is no single way of implementing them. The ways in which ad hoc synchronizations are done are quite diverse and so it is hard to identify them just on a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, which are variables shared with other tasks.
* In some cases, the synchronization does not wait idly and rather performs other computations while checking the sync variables periodically.

Despite the dangers of ad hoc synchronization, programmers continue to use it. In code comments, programmers have even stated that their implementations may be unsafe, yet they proceed to use ad hoc synchronization anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with synchronization primitives, such as those already present in standard POSIX thread libraries. However, it is often difficult to do so, and the replacement may not address the concerns presented in point 1.
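
The kind of replacement described above can be sketched in Python, assuming a simple case where one thread waits for another to finish a piece of work. Instead of spinning on a shared flag, the waiter blocks on a standard primitive, here <code>threading.Event</code>; in a C program the analogous primitive would be a POSIX condition variable used through <code>pthread_cond_wait</code> and <code>pthread_cond_signal</code>. The names <code>Job</code>, <code>worker</code> and <code>wait_with_primitive</code> are hypothetical.

```python
import threading


class Job:
    def __init__(self):
        # A standard primitive takes the place of a hand-rolled shared flag.
        self.finished = threading.Event()
        self.result = None


def worker(job):
    job.result = sum(range(100))
    job.finished.set()            # explicit, recognizable signal


def wait_with_primitive(job, timeout=5.0):
    # Blocks without spinning; returns False if the timeout expires first.
    if not job.finished.wait(timeout):
        raise TimeoutError("worker never signalled")
    return job.result


def demo():
    job = Job()
    t = threading.Thread(target=worker, args=(job,))
    t.start()
    value = wait_with_primitive(job)
    t.join()
    return value
```

The ordering between the two threads is the same as with a hand-written loop, but it is now expressed through a primitive that both human readers and analysis tools can recognize.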

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goal is to help programmers structure their code better, while also allowing other tools to recognize the annotated loops as synchronizations. It has proven very effective in this area where other similar tools have failed, as it analyzes the code in a way that specifically tracks down the sync loops that implement ad hoc synchronizations.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality among all ad hoc synchronizations is that they are implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to use the LLVM infrastructure to obtain loop information from the source, including a representation of each loop's exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. SyncFinder does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable related to the synchronization of concurrent programs. Therefore, if any of a loop's EDVs is identified as a sync variable, the loop can be concluded to be a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition (an exit condition that depends on a sync variable), it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder will go through all loop candidates and prune out any that call cond_wait inside the loop.
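
The pruning steps above can be pictured with a toy model. The Python sketch below is not SyncFinder's implementation, which works on LLVM's representation of C and C++ code; it assumes that each loop has already been summarised as a set of exit dependent variables plus the names of the functions it calls, and it treats a variable as a sync variable when it appears in a given set of variables written by other threads. The <code>Loop</code> class, the <code>find_sync_loops</code> function and the sample loop names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    name: str
    exit_vars: set                             # EDVs: variables the exit conditions depend on
    calls: set = field(default_factory=set)    # functions called inside the loop body


def find_sync_loops(loops, remotely_written):
    """Return the names of loops that survive both pruning steps.

    remotely_written is the (assumed, precomputed) set of shared variables
    that some other thread may write; these play the role of sync variables.
    """
    sync_loops = []
    for loop in loops:
        if "cond_wait" in loop.calls:
            continue                            # condvar loop: pruned
        if loop.exit_vars & remotely_written:   # at least one EDV is a sync variable
            sync_loops.append(loop.name)
        # otherwise: no sync condition, so the loop is pruned as computational
    return sync_loops


loops = [
    Loop("sum_array", {"i", "n"}),
    Loop("wait_flag", {"done"}),
    Loop("condvar_wait", {"ready"}, {"cond_wait"}),
]
result = find_sync_loops(loops, remotely_written={"done", "ready"})
```

In this example only <code>wait_flag</code> is reported: <code>sum_array</code> depends solely on local variables, and <code>condvar_wait</code> is discarded because it already uses a standard primitive.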

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
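
The pairing step can likewise be sketched as a toy model. This is an illustration under simplifying assumptions, not the analysis SyncFinder performs: each sync loop is reduced to a single sync variable and a predicate describing when the loop exits, each write assigns a known constant, and whether a write can run concurrently with the loop is given as a precomputed flag. All class, field and site names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SyncLoop:
    name: str
    var: str                            # the sync variable the loop waits on
    exits_when: Callable[[int], bool]   # True if this value satisfies the exit condition


@dataclass
class Write:
    site: str                           # where in the program the write occurs
    var: str
    value: int
    concurrent_with_loop: bool = True   # assumed to come from a separate analysis


def pair_writes(loop, writes):
    """Return the write sites that could release the given sync loop."""
    paired = []
    for w in writes:
        if w.var != loop.var:
            continue                    # writes a different variable
        if not loop.exits_when(w.value):
            continue                    # value would not satisfy the exit condition
        if not w.concurrent_with_loop:
            continue                    # cannot execute while the loop is waiting
        paired.append(w.site)
    return paired


# A loop equivalent to "while (done == 0) ;" exits once done is non-zero.
loop = SyncLoop("wait_flag", "done", lambda v: v != 0)
writes = [
    Write("init", "done", 0),                                   # pruned: keeps the loop spinning
    Write("worker_end", "done", 1),                             # kept
    Write("setup", "done", 1, concurrent_with_loop=False),      # pruned: not concurrent
    Write("other", "count", 1),                                 # pruned: different variable
]
pairs = pair_writes(loop, writes)
```

Only the write at <code>worker_end</code> is paired with the loop, which mirrors the three pruning criteria described in the paragraph above.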

4. SyncFinder Annotation

After the initial set of loops has been culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. The source code is marked using LLVM’s static instrumentation framework, which allows other tools to take advantage of SyncFinder’s findings in their own analysis.

====Uses====
SyncFinder is a robust tool that can be used in a variety of applications, such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it can identify sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition, the authors of the paper extended Valgrind's data race checker to take advantage of the annotations produced by SyncFinder. This allowed them to reduce the number of false positives the checker reports while still making use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs that are used across a broad cross-section of applications. In testing, SyncFinder correctly identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues within Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues from concurrent programming entirely, such as transactional memory[[#Foot1|1]], a lock-free synchronization mechanism that does not require mutexes and avoids explicit lock and unlock operations. Other work targets bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect concurrency bugs, such as CHESS[[#Foot3|3]], a systematic testing tool that explores the possible thread interleavings of a program, and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They are not looking for ad hoc synchronizations.

A tool similar to SyncFinder exists that can detect simple spinning, itself a form of ad hoc synchronization[[#Foot5|5]], but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been composed. This paper complements these studies to better understand the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper. The authors list the findings of their study in the contribution section, but in the section that covers the characteristics of ad hoc synchronizations they essentially repeat those findings. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of the leading concurrent open-source software programs[[#Foot10|10]] to base their study on. The programs were chosen to represent server, desktop and scientific applications. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. They were both experienced with the code base, but mistakes could have been made. General conclusions would be hard to draw from the limited data set, but the study gives strong indicators of the characteristics of ad hoc synchronizations and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has a high success rate, finding on average 96% of ad hoc synchronizations, and its output can be used to extend other data race and bad practice detection tools; for example, it reduces the false positives of Valgrind's data race checker by 43%-86%. SyncFinder fills a niche where other tools have previously failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives. The false positives are due to missing source code for library functions and inaccurate pointer alias analysis. However, a programmer can examine the reported ad hoc synchronizations to weed out false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the programmers of those applications, who have access to it.
SyncFinder takes a static approach, finding ad hoc synchronizations by analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but it would carry a heavier computational load and would require test cases that exercise all of the relevant code paths.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic and should be avoided. The basis of their study is well supported with diverse programs, but could always have used more.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters existing tools' efforts at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, n.d. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 T Li, A R Lebeck and D J Sorin, 2006. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems 17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin, 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-12-03T04:22:11Z

Smcilroy: /* Evaluation */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and the use of multithreaded designs has increased to keep up with this growing trend. Multithreaded applications can be found in a variety of popular applications today as they take advantage of the multithreaded approach. However, the concepts behind concurrent programming bring with them a host of potential dangers in the form of race conditions and deadlocks resulting from bad programming design and threads accessing shared memory. Fortunately, there are well known and standard methods for dealing with these problems, i.e synchronization primitives. But in real world situations, due to a variety of reasons, as we shall see, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are not well documented and are not discovered by traditional tools for race conditions that look for standard synchronization primitives.

The paper we are discussing addresses these concerns in two regards, first it details a thorough study of ad hoc synchronizations. It details their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Secondly, it introduces SyncFinder, a program that detects all ad hoc synchronizations and automatically annotates the source code where ad hoc sychnronizations are found. This can see use in conjunction with other data race checkers to improve accuracy and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The thread of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution, this is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization variables act as barriers to memory that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms such as locks, mutexes, semaphores, and condition variables.
Locks maintain access to data and limit who has access when there are multiple threads. Examples of locks include read/write locks that only lock when a reader obtains the lock and latches, which unlock only after a specified number of threads have obtained them.
Mutexes are mutually exclusive locks that threads employ to lock a resource that they need. No other threads can access them at that point. Once they are finished, they release the lock and the other threads can then lock and access the resource.
Condition variables are variables that will block the thread until a certain condition is met. This allows the thread to only execute when it is safe to perform its operation.
Synchronization primitives can be misused and lead to a host of other problems generally referred to collectively as race conditions.

===Race conditions, deadlocks===
Race conditions are unintended side-effects of programming in concurrent systems, they occur when two or more processes have access to a shared resource and at least one of them has a write privilege. This leads to processes modifying the data that all processes share as others may be reading them and results in the reading of stale/incorrect data. They will occur during the execution of the program and often times are very difficult to detect and manipulate data in subtle ways.

Deadlock is when two or more processes share a resource and each process is waiting on the other processes to unlock the resource. It becomes a circular chain and no process can continue.

Both these issues occur in concurrent programming and although there are no general solutions for deadlock[[#Foot9|9]], there are suitable methods for dealing with them, and in the case of race conditions, using mutual exclusion locks and synchronization primitives can prevent race conditions. But no programmer is infallible and so there is always the issue of race conditions and deadlocks present in production code.

===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to the difficulty in finding them.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other while avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. In all programs studied, it was found that each had numerous ad hoc synchronizations implemented. The number of synchronizations found ranged from 6-83, with server type programs inhabiting the higher portion of the interval. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in the case of a concurrent system, programmers will use ad hoc synchronization to superimpose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to fix issues presented by ad hoc synchronizations. In fact, it is often the case that these tools either do not find these issues or report them as false positives as the tool is unaware of the "work arounds" put into affect by using ad hoc synchronization. Since they cannot find these problems, it severely impacts the effectiveness of such tools.
This also impacts analysis of performance. Synchronization is quite costly and if a tool cannot recognize the form of synchronization, a false report is generated and the programmer will not be aware. This may cause poor decisions on the part of the programmer just from the fact that ad hoc synchronizations are hard to identify.

5. The reason ad hoc synchronizations are hard to identify stems from the fact that there is no single way of implementing them. The ways in which ad hoc synchronizations are done are quite diverse and so it is hard to identify them just on a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, variables that are shared with other tasks
* In some cases, the synchronization does not wait idly and rather performs other computations while checking the sync variables periodically.

Despite the dangers of using ad hoc synchronization, programmers continue to use this method. It is found that, in comments, programmers have stated that possibly their implementations are unsafe but proceed to use ad hoc synchronization techniques anyway. The reasoning behind these decisions have already been outlined in point 1. A better practice of synchronization would be to replace ad hoc synchronizations with synchronization primitives, primitives already present in standard POSIX thread libraries. However, it is often difficult to replace ad hoc synchronizations with synchronization primitives and doing this may not fulfill the concerns presented in point 1.

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goal is to help programmers structure their code better, while also allowing other tools to recognize the annotated loops as synchronizations. It has proven effective where other similar tools have failed, because its analysis specifically tracks down the sync loops that implement ad hoc synchronizations.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to extract every loop in the program: the LLVM infrastructure is used to obtain the loop information from the source, including a representation of the exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. SyncFinder does this through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable used to synchronize concurrent threads. Therefore, if any EDV of a loop is identified as a sync variable, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: Condvar loops are not considered sync loops. SyncFinder goes through all loop candidates and prunes out any that call cond_wait inside the loop.
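The three steps above can be sketched as a simple filter. This is a heavily simplified illustration and not SyncFinder's implementation: the <code>Loop</code> record, the <code>find_sync_loops</code> function, and the idea of approximating sync variables by a given set of shared variable names are all assumptions made for the example. The real tool works on LLVM's program representation.

```python
from dataclasses import dataclass, field

@dataclass
class Loop:
    name: str
    exit_dependent_vars: set                  # EDVs of the loop's exit conditions
    calls: set = field(default_factory=set)   # functions called inside the loop

def find_sync_loops(loops, shared_vars,
                    condvar_waits=frozenset({"pthread_cond_wait"})):
    """Return (loop name, sync variables) for loops that survive pruning."""
    sync_loops = []
    for loop in loops:
        # EDV analysis: an EDV that is shared with other threads is
        # treated here as a sync variable.
        sync_vars = loop.exit_dependent_vars & shared_vars
        if not sync_vars:
            continue      # pruned: computational loop, no sync condition
        if loop.calls & condvar_waits:
            continue      # pruned: condvar loop, already a standard primitive
        sync_loops.append((loop.name, sync_vars))
    return sync_loops
```

For example, given a loop over array indices, a loop that polls a shared <code>done</code> flag, and a loop that calls <code>pthread_cond_wait</code>, only the polling loop would be reported.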

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
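A minimal sketch of the pairing step follows, under stated assumptions: writes are given as plain records, the exit condition is a predicate on the written value, and a precomputed <code>concurrent</code> flag stands in for SyncFinder's concurrency analysis. The record fields, file locations and function name are hypothetical.

```python
def pair_writes(sync_var, exits_when, writes):
    """Return locations of writes that could release a loop waiting on sync_var."""
    pairs = []
    for w in writes:
        if w["var"] != sync_var:
            continue      # write does not touch the sync variable
        if not exits_when(w["value"]):
            continue      # pruned: value would not satisfy the exit condition
        if not w["concurrent"]:
            continue      # pruned: cannot execute concurrently with the loop
        pairs.append(w["where"])
    return pairs

# Hypothetical writes found in a program whose loop waits until done != 0.
writes = [
    {"where": "worker.c:42", "var": "done", "value": 1, "concurrent": True},
    {"where": "init.c:10", "var": "done", "value": 0, "concurrent": True},
    {"where": "main.c:7", "var": "done", "value": 1, "concurrent": False},
    {"where": "worker.c:50", "var": "count", "value": 1, "concurrent": True},
]
```

Calling <code>pair_writes("done", lambda v: v != 0, writes)</code> keeps only the write at <code>worker.c:42</code>; the others are pruned for the reasons noted in the comments.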

4. SyncFinder Annotation

After the initial set of loops has been culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. SyncFinder marks the source code using LLVM’s static instrumentation framework, which allows other tools to take advantage of its findings in their own analysis.

====Uses====
SyncFinder can be used in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it can identify sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition, the authors of the paper extended Valgrind's data race checker to take advantage of the annotations produced by SyncFinder. This reduced the number of false positives flagged by the race checker, because it could now treat the annotated sync loops as synchronization.

====Accuracy====
SyncFinder was tested against 25 concurrent programs drawn from a broad cross-section of applications. In testing, SyncFinder correctly identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues within Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other work targets bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect concurrency bugs, such as CHESS[[#Foot3|3]], a systematic testing tool that explores possible thread interleavings, and CTrigger[[#Foot4|4]], a tool that exposes atomicity violations. The problem with these tools is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They do not look for ad hoc synchronizations.

A related mechanism exists that can detect simple spinning[[#Foot5|5]], which is one form of ad hoc synchronization, but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been published. This paper complements these studies by examining the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper. The authors list the findings of their study in the contribution section, and then essentially repeat those findings in the section that covers the characteristics of ad hoc synchronizations. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of leading concurrent open-source programs[[#Foot10|10]] as the basis of their study. The programs were chosen to represent server, desktop and scientific applications. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. Both were experienced with the code base, but mistakes could have been made. General conclusions are hard to draw from the limited data set, but the study gives strong indicators of the characteristics of ad hoc synchronizations and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has a high success rate, finding on average 96% of ad hoc synchronizations. It can also be combined with other data race and bad practice detection tools; for example, it reduced the false positives of Valgrind's data race checker by 43%-86%. SyncFinder fills a niche where other tools have previously failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives. These are due to missing source code for library functions and to imprecise pointer alias analysis, but a programmer can examine the reported ad hoc synchronizations to weed out the false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the programmers of those applications, who have access to it.
SyncFinder takes the static approach to finding ad hoc synchronizations by analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but it would carry a heavier computational load and would require test cases that thoroughly cover the code.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic, and should be avoided. The study is well supported by a diverse set of programs, although a larger sample would have strengthened its conclusions.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters the efforts of existing tools at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, n.d. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin , 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-12-03T04:21:02Z

Smcilroy: /* Contribution */

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution, which is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives act as barriers that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms such as locks, mutexes, semaphores, and condition variables.
Locks control access to data and limit who has access when there are multiple threads. Examples include read/write locks, which allow many readers at once but only one writer, and latches, which release waiting threads only after a specified number of threads have reached them.
Mutexes are mutually exclusive locks that a thread acquires to lock a resource it needs. No other thread can access the resource while the lock is held. Once the thread is finished, it releases the lock and another thread can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to execute only when it is safe to perform its operation.
Synchronization primitives can be misused, which leads to a host of other problems such as race conditions and deadlocks.
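As a small illustration of a mutex in use, the sketch below protects a shared counter with Python's <code>threading.Lock</code>. The function name and parameters are hypothetical, and Python is used here only for brevity.

```python
import threading

def locked_counter(num_threads=4, increments=1000):
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:               # only one thread updates at a time
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

Because every update happens while the lock is held, the final value always equals <code>num_threads * increments</code>.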

===Race conditions, deadlocks===
Race conditions are unintended side-effects of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them can write to it. One process may modify the shared data while others are reading it, which results in the reading of stale or incorrect data. Race conditions occur during the execution of the program, corrupt data in subtle ways, and are often very difficult to detect.
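The classic example is a lost update. The sketch below does not use real threads; it writes out by hand one unlucky interleaving of two hypothetical threads, A and B, that each try to add 10 to a shared balance, so that the outcome is reproducible.

```python
def lost_update_demo():
    balance = 100
    a_read = balance          # thread A reads 100
    b_read = balance          # thread B reads 100 before A writes back
    balance = a_read + 10     # thread A writes 110
    balance = b_read + 10     # thread B writes 110, overwriting A's update
    return balance
```

Both threads added 10, yet the function returns 110 rather than 120, because B worked from a stale copy of the balance.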

Deadlock occurs when two or more processes each hold a resource while waiting for another process to release the resource it holds. The waiting forms a circular chain, and no process can continue.
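The circular chain can be made concrete with a wait-for relation, where each process maps to the process it is currently waiting on. The helper below is a hypothetical sketch that reports a deadlock if following the chain from any process leads back to a process already visited.

```python
def has_deadlock(waits_for):
    """waits_for maps each process to the process it waits on, or None."""
    for start in waits_for:
        seen = set()
        node = start
        # Follow the chain until it ends or revisits a process.
        while node is not None and node not in seen:
            seen.add(node)
            node = waits_for.get(node)
        if node is not None:
            return True       # the chain looped back: circular wait
    return False
```

For instance, <code>{"P1": "P2", "P2": "P1"}</code> describes two processes waiting on each other and is reported as a deadlock, while <code>{"P1": "P2", "P2": None}</code> is not.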

Both of these issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented with mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks remain present in production code.

===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to the difficulty in finding them.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other while avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined for instances of ad hoc synchronization. These programs were server, desktop or scientific applications, including Apache, MySQL and Mozilla. Through manual inspection of the source code, the following characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83, with server programs occupying the higher portion of the interval. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in a concurrent system, programmers use ad hoc synchronization to impose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

COMP 3000 Essay 2 2010 Question 7

2010-12-03T04:03:18Z

Smcilroy: /* Ad Hoc Synchronization */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and multithreaded designs have become increasingly common. Many popular applications today are multithreaded. However, concurrent programming brings with it a host of potential dangers in the form of race conditions and deadlocks, which result from poor program design and from threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. In real-world code, however, programmers often implement their own "ad hoc" synchronizations that eschew these standards, for a variety of reasons discussed below. Ad hoc synchronizations are not well documented and are not discovered by traditional race detection tools, which look only for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. These annotations can be used in conjunction with existing data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution; this is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives act as barriers to memory that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms such as locks, mutexes, semaphores, and condition variables.
Locks control access to data and limit which threads have access at any one time. Examples of locks include read/write locks, which allow many readers to hold the lock at once but give a writer exclusive access, and latches, which release waiting threads only after a specified number of events have occurred.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource that it needs. No other thread can access the resource at that point. Once the thread is finished, it releases the lock and another thread can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to execute only when it is safe to perform its operation.
Synchronization primitives can be misused, which leads to a host of other problems such as race conditions and deadlocks.
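As a minimal sketch of how a mutex and a condition variable work together, the following uses Python's standard threading module rather than the POSIX thread library the essay refers to. The names queue, consumer and producer are hypothetical and chosen only for illustration.

```python
import threading

queue = []                      # shared resource
cond = threading.Condition()    # condition variable bundled with its own mutex

def consumer(results):
    with cond:                  # acquire the mutex
        while not queue:        # re-check the condition after every wake-up
            cond.wait()         # releases the mutex while the thread is blocked
        results.append(queue.pop(0))

def producer(item):
    with cond:                  # only one thread may modify the queue at a time
        queue.append(item)
        cond.notify()           # wake one waiting thread

results = []
t = threading.Thread(target=consumer, args=(results,))
t.start()
producer("job-1")
t.join()
print(results)
```

The consumer never spins: it sleeps inside wait() until the producer signals that the condition may have changed.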

===Race conditions, deadlocks===
Race conditions are unintended side-effects of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them can write to it. A process may then modify shared data while other processes are reading it, which results in reads of stale or incorrect data. Race conditions occur during the execution of the program, are often very difficult to detect, and manipulate data in subtle ways.

Deadlock occurs when two or more processes each hold a resource and wait for another process to release the resource it holds. The waiting forms a circular chain and no process can continue.

Both these issues occur in concurrent programming. Although there are no general solutions for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented by using mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always liable to be present in production code.
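The two problems can be illustrated together with a small sketch in Python. The Account class and transfer function are hypothetical. Each balance is protected by a mutex to prevent races, and the locks are always acquired in a fixed order (by account identifier), which is one common way to avoid the circular wait that causes deadlock.

```python
import threading

class Account:
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()   # mutex protecting this account's balance

def transfer(src, dst, amount):
    # Lock the account with the smaller ident first, so two opposite
    # transfers can never each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.ident)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

a, b = Account(1, 100), Account(2, 100)
t1 = threading.Thread(target=worker, args=(a, b))
t2 = threading.Thread(target=worker, args=(b, a))
t1.start()
t2.start()
t1.join()
t2.join()
print(a.balance, b.balance)
```

If each transfer instead locked src before dst, the two threads could each take one lock and wait forever for the other.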

===Ad Hoc Synchronization===
Ad hoc synchronizations are implemented as loops, called sync loops, that continue until certain conditions are met via shared variables, called sync variables, that are set by other threads. They are designed to control the flow of thread execution much like locking and unlocking resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths makes them difficult to find.
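A minimal sketch of a sync loop, written in Python for illustration (the programs studied in the paper are written in C and C++). The names done, waiter and setter are hypothetical. The waiter thread spins on the sync variable until another thread changes it.

```python
import threading
import time

done = False                 # sync variable, shared between threads

def waiter(log):
    while not done:          # sync loop: exits only after a remote update
        time.sleep(0.001)    # back off briefly between checks
    log.append("proceeded")

def setter():
    global done
    done = True              # remote update that releases the sync loop

log = []
t = threading.Thread(target=waiter, args=(log,))
t.start()
setter()
t.join()
print(log)
```

Nothing in the loop marks it as synchronization; to a tool or a maintainer it looks like any other loop over an ordinary variable.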

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83, with server programs occupying the upper part of that range. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in a concurrent system, programmers will use ad hoc synchronization to impose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to detect issues caused by ad hoc synchronizations. In fact, these tools often either miss such issues or report false positives, because the tool is unaware of the "workarounds" put into effect by the ad hoc synchronization. This severely reduces the effectiveness of such tools.
This also impacts analysis of performance. Synchronization is quite costly, and if a tool cannot recognize the form of synchronization, it generates a false report without the programmer being aware of it. This may lead the programmer to poor decisions simply because ad hoc synchronizations are hard to identify.

5. Ad hoc synchronizations are hard to identify because there is no single way of implementing them. Their implementations are quite diverse, so they cannot be identified by just a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, which are variables shared with other tasks.
* In some cases, the loop does not wait idly but performs other computations while checking the sync variable periodically.
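The characteristics listed above can be sketched in one small Python loop. The Shared class and wait_with_work function are hypothetical. The loop has one exit condition satisfied by a remote thread (the sync variable) and one satisfied locally (a poll limit), and it performs a unit of other work on each pass.

```python
import threading
import time

class Shared:
    """Holds the sync variable that a remote thread may set."""
    def __init__(self):
        self.ready = False

def wait_with_work(shared, max_polls):
    """Sync loop with two exit conditions; returns (exit_reason, units_of_work)."""
    work_done = 0
    for _ in range(max_polls):       # local exit condition: poll limit reached
        if shared.ready:             # remote exit condition: sync variable set
            return ("remote", work_done)
        work_done += 1               # stand-in for useful computation
        time.sleep(0.001)
    return ("local", work_done)

idle = Shared()
print(wait_with_work(idle, 3))       # nobody sets the flag, so the loop exits locally

signalled = Shared()
threading.Timer(0.01, lambda: setattr(signalled, "ready", True)).start()
reason, _ = wait_with_work(signalled, 5000)
print(reason)
```

Because the loop also does real work, it is even harder to tell apart from a purely computational loop.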

Despite the dangers of ad hoc synchronization, programmers continue to use it. In code comments, programmers have acknowledged that their implementations may be unsafe, yet they proceed to use ad hoc synchronization techniques anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with the synchronization primitives already present in standard POSIX thread libraries. However, it is often difficult to make this replacement, and doing so may not address the concerns presented in point 1.
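As a rough illustration of such a replacement, the sketch below waits on a standard primitive instead of spinning on a shared flag. It uses Python's threading.Event as a stand-in for a POSIX condition variable; the names ready, waiter and setter are hypothetical.

```python
import threading

ready = threading.Event()    # standard primitive replacing a hand-written flag

def waiter(log):
    ready.wait()             # blocks without spinning until the event is set
    log.append("proceeded")

def setter():
    ready.set()              # wakes every thread blocked in wait()

log = []
t = threading.Thread(target=waiter, args=(log,))
t.start()
setter()
t.join()
print(log)
```

Because the waiting is expressed through a known primitive, analysis tools and other programmers can recognize it as synchronization.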

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goals are to help programmers structure their code better and to let other tools recognize the annotated loops as synchronizations. It has proven effective where other similar tools have failed, because its analysis specifically tracks down the sync loops that implement ad hoc synchronization.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are all implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to collect every loop in the program: the LLVM infrastructure is used to obtain the loop information from the source, including a representation of the exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. It does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable related to the synchronization of concurrent programs. Therefore, by identifying any EDVs as sync variables, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder will go through all loop candidates and prune out any that make a call to cond_wait inside the loop.
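The three steps above can be sketched as a toy analysis. This is not SyncFinder itself, which works on C and C++ programs through LLVM. It is a simplified analogue over Python source using the standard ast module, and it makes strong assumptions: the caller supplies the set of shared variable names, only names read directly in the loop test count as EDVs, and any call to a method named wait is treated as a condition variable wait. The function name find_sync_loops is hypothetical.

```python
import ast

def find_sync_loops(source, shared_names):
    """Return line numbers of while loops that look like sync loops."""
    sync_loops = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.While):
            continue
        # EDV analysis (simplified): names read directly in the loop test.
        edvs = {n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)}
        # Prune computational loops: no exit-dependent variable is shared.
        if not edvs & shared_names:
            continue
        # Prune condvar loops: the body calls some .wait() method.
        calls_wait = any(
            isinstance(n, ast.Call)
            and isinstance(n.func, ast.Attribute)
            and n.func.attr == "wait"
            for stmt in node.body
            for n in ast.walk(stmt)
        )
        if calls_wait:
            continue
        sync_loops.append(node.lineno)
    return sorted(sync_loops)

SAMPLE = '''
def worker():
    i = 0
    while i < 10:
        i += 1
    while not done:
        pass
    while not ready:
        cond.wait()
'''

print(find_sync_loops(SAMPLE, {"done", "ready"}))
```

Of the three loops in the sample, the first is pruned as computational and the third as a condvar loop, leaving only the loop that spins on done. A real analysis would also have to follow data flow from shared variables into local ones, which this sketch ignores.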

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
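The value check in this pairing step can be sketched in the same toy style. The function releasing_writes below is hypothetical and far simpler than the real tool: it considers only assignments of constants to a named sync variable in Python source and keeps those whose value makes the loop test false. It does not attempt the concurrency check described above.

```python
import ast

def releasing_writes(source, loop_test, sync_var):
    """Return line numbers of constant writes to sync_var that end the loop."""
    test_code = compile(loop_test, "<loop-test>", "eval")
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if sync_var not in targets or not isinstance(node.value, ast.Constant):
            continue
        # The loop exits when its test is false under the written value.
        written = {sync_var: node.value.value}
        if not eval(test_code, {"__builtins__": {}}, written):
            lines.append(node.lineno)
    return sorted(lines)

SAMPLE = '''
def init():
    global done
    done = False

def finish():
    global done
    done = True
'''

print(releasing_writes(SAMPLE, "not done", "done"))
```

The write in init is pruned because it leaves the loop test true, while the write in finish is paired with the loop because it satisfies the exit condition.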

4. SyncFinder Annotation

After the initial set of loops is culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. The source code is marked using LLVM’s static instrumentation framework, which allows other tools to take advantage of SyncFinder’s findings in their own analysis.

====Uses====
SyncFinder is a robust tool that can be used in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it can identify sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition, the authors of the paper extended the existing data race detector “Valgrind” to take advantage of the annotations produced by SyncFinder. This reduced the number of false positives that Valgrind reported, while still making use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs that are used across a broad cross-section of applications. In testing, SyncFinder correctly identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues within Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other work targets bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They are not looking for ad hoc synchronizations.

A mechanism similar to SyncFinder exists that can detect simple spinning, which is one form of ad hoc synchronization[[#Foot5|5]], but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been composed. This paper complements these studies to better understand the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper. The authors list their findings from the study in the contribution section, but in the section that covers the characteristics of ad hoc synchronizations they essentially repeat those findings. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of the leading concurrent open-source software programs[[#Foot10|10]] to base their study on. The programs were chosen to represent server, desktop and scientific applications. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. They were both experienced with the code base, but mistakes could have been made. General conclusions would be hard to draw from the limited data set, but the study gives indicators of the characteristics of ad hoc synchronizations and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has the benefit of a high degree of success, finding 96% of ad hoc synchronizations on average. It can also be used to extend other data race and bad practice detection tools; for example, it reduced the false positives of the Valgrind data race checker by 43%-86%. SyncFinder fills a niche where other tools have failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives. These are due to missing source code for library functions and to inaccurate pointer alias information. However, a programmer can examine the reported ad hoc synchronizations and review them for false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the programmers of those applications, who have access to it.
SyncFinder uses a static approach, finding ad hoc synchronizations by analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but would carry a heavier computational load and would require a thorough run through of all possible test cases.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic and should be avoided. The basis of their study is well supported with diverse programs, but could always have used more.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters the efforts of existing tools at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, n.d. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2010. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin, 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-12-03T04:01:45Z

Smcilroy: /* Race conditions, deadlocks */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and multithreaded designs have become increasingly common. Many popular applications today are multithreaded. However, concurrent programming brings with it a host of potential dangers in the form of race conditions and deadlocks, which result from poor program design and from threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. In real-world code, however, programmers often implement their own "ad hoc" synchronizations that eschew these standards, for a variety of reasons discussed below. Ad hoc synchronizations are not well documented and are not discovered by traditional race detection tools, which look only for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. These annotations can be used in conjunction with existing data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution; this is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives act as barriers to memory that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms such as locks, mutexes, semaphores, and condition variables.
Locks control access to data and limit which threads have access at any one time. Examples of locks include read/write locks, which allow many readers to hold the lock at once but give a writer exclusive access, and latches, which release waiting threads only after a specified number of events have occurred.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource that it needs. No other thread can access the resource at that point. Once the thread is finished, it releases the lock and another thread can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to execute only when it is safe to perform its operation.
Synchronization primitives can be misused, which leads to a host of other problems such as race conditions and deadlocks.

===Race conditions, deadlocks===
Race conditions are unintended side-effects of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them can write to it. A process may then modify shared data while other processes are reading it, which results in reads of stale or incorrect data. Race conditions occur during the execution of the program, are often very difficult to detect, and manipulate data in subtle ways.

Deadlock occurs when two or more processes each hold a resource and wait for another process to release the resource it holds. The waiting forms a circular chain and no process can continue.

Both these issues occur in concurrent programming. Although there are no general solutions for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented by using mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always liable to be present in production code.

===Ad Hoc Synchronization===
Ad hoc synchronizations are implemented as loops, called sync loops, that continue until certain conditions are met via shared variables, called sync variables, that are set by other threads. They are designed to control the flow of thread execution much like locking and unlocking of resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths makes them difficult to find.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83, with server programs occupying the upper part of that range. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in a concurrent system, programmers will use ad hoc synchronization to impose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to detect issues caused by ad hoc synchronizations. In fact, these tools often either miss such issues or report false positives, because the tool is unaware of the "workarounds" put into effect by the ad hoc synchronization. This severely reduces the effectiveness of such tools.
This also impacts analysis of performance. Synchronization is quite costly, and if a tool cannot recognize the form of synchronization, it generates a false report without the programmer being aware of it. This may lead the programmer to poor decisions simply because ad hoc synchronizations are hard to identify.

5. Ad hoc synchronizations are hard to identify because there is no single way of implementing them. Their implementations are quite diverse, so they cannot be identified by just a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, which are variables shared with other tasks.
* In some cases, the loop does not wait idly but performs other computations while checking the sync variable periodically.

Despite the dangers of ad hoc synchronization, programmers continue to use it. In code comments, programmers have acknowledged that their implementations may be unsafe, yet they proceed to use ad hoc synchronization techniques anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with the synchronization primitives already present in standard POSIX thread libraries. However, it is often difficult to make this replacement, and doing so may not address the concerns presented in point 1.

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goal is to help programmers structure their code better, while also allowing other tools to recognize the annotated loops as synchronizations. It has proven very effective in an area where similar tools have failed, because its analysis specifically tracks down the sync loops that implement ad hoc synchronization.

====How it works====
There are two ways to search for ad hoc synchronizations: analyze runtime traces with a dynamic method, or analyze the source code with a static method. Both carry a number of pros and cons. While a dynamic method is generally more accurate than a static one, it tends to incur a very large runtime overhead. In addition, the ad hoc synchronizations a dynamic method can find are limited by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted for a static solution. SyncFinder is built on the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to use the LLVM infrastructure to obtain the loop information from the source, including a representation of the exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. SyncFinder does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable related to the synchronization of concurrent programs. Therefore, by identifying any EDVs as sync variables, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder will go through all loop candidates and prune out any that call cond_wait inside the loop.
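
The three steps above can be sketched as a toy filter. This is not SyncFinder's implementation, which works on LLVM's representation of the program; it is a simplified model in which each loop is described by hand. The Loop record, the identify_sync_loops function, and the treatment of a sync variable as "an exit dependent variable written by another thread" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    """Hand-written summary of a loop (hypothetical, for illustration only)."""
    name: str
    exit_dependent_vars: set                  # variables affecting exit conditions
    calls: set = field(default_factory=set)   # functions called inside the loop


def identify_sync_loops(loops, remotely_written_vars,
                        condvar_calls=frozenset({"pthread_cond_wait"})):
    sync_loops = []
    for loop in loops:
        # EDV analysis: which exit dependent variables are sync variables?
        sync_vars = loop.exit_dependent_vars & remotely_written_vars
        # Prune computational loops: no exit condition depends on a sync variable.
        if not sync_vars:
            continue
        # Prune condvar loops: the loop waits on a condition variable.
        if loop.calls & condvar_calls:
            continue
        sync_loops.append(loop.name)
    return sync_loops
```

For example, given a loop that spins on a shared flag, a loop that sums an array, and a loop that calls pthread_cond_wait, only the first survives the filter.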

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
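
A toy model of this pairing step is sketched below. It is only an illustration under strong assumptions: writes are listed by hand as Write records, the exit condition is given as a Python predicate over the written value, and "does not execute concurrently" is approximated by "is in the same thread as the loop". The names Write and find_remote_updates are hypothetical.

```python
from typing import NamedTuple


class Write(NamedTuple):
    """Hand-written summary of a write instruction (hypothetical)."""
    location: str
    thread: str
    var: str
    value: object


def find_remote_updates(loop_thread, sync_var, exit_condition, writes):
    pairs = []
    for w in writes:
        if w.var != sync_var:
            continue                      # does not modify the sync variable
        if not exit_condition(w.value):
            continue                      # written value would not release the loop
        if w.thread == loop_thread:
            continue                      # crude stand-in for "not concurrent"
        pairs.append(w.location)
    return pairs
```

For a loop in thread "main" that waits while done is 0, a write of 1 to done from a worker thread is kept as the paired remote update, while a write of 0, a write from "main" itself, or a write to an unrelated variable is pruned.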

4. SyncFinder Annotation

After the initial set of loops is culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. The source code is marked using LLVM’s static instrumentation framework, which allows other tools to take advantage of SyncFinder’s findings in their own analysis.

====Uses====
SyncFinder can be used in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it can identify sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition, the authors of the paper extended Valgrind's data race checker to take advantage of the annotations introduced by SyncFinder. This reduced the number of false positives flagged by Valgrind while still making use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs drawn from a broad cross-section of applications. In testing, SyncFinder correctly identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues within Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other attempts have been made to catch bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races, such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths, and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these tools is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(); they do not look for ad hoc synchronizations.

A tool similar to SyncFinder can detect simple spinning, itself a form of ad hoc synchronization[[#Foot5|5]], but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been conducted. This paper complements these studies by improving the understanding of the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper: the authors list their findings from the study in the contribution section, then essentially repeat them in the section that covers the characteristics of ad hoc synchronizations. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of leading concurrent open-source programs[[#Foot10|10]] to base their study on. The programs were chosen to represent server, desktop and scientific applications. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. They were both experienced with the code base, but mistakes could have been made. General conclusions would be hard to draw from the limited data set, but the study gives indicators of ad hoc synchronizations' characteristics and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has a high degree of success, finding 96% of ad hoc synchronizations on average. Its results can also be fed into other data race and bad practice detection tools; for example, it reduced the false positives of Valgrind's data race checker by 43%-86%. SyncFinder fills a niche where other tools have failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives, which are due to missing source code for library functions and incorrect pointer aliasing. A programmer can, however, examine the returned ad hoc synchronizations to review them for false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the programmers of those applications, who have access to it.
SyncFinder takes the static approach to finding ad hoc synchronizations, analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but it would carry a heavier computational load and would require a thorough run through all possible test cases.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic and should be avoided. The basis of their study is well supported with diverse programs, but could always have used more.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters existing tools' efforts at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, 2004. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2010. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin, 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-12-03T03:59:17Z

Smcilroy: /* Synchronization Primitives */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and the use of multithreaded designs have increased to keep up with this growing trend. Multithreaded designs can be found in a variety of popular applications today. However, the concepts behind concurrent programming bring with them a host of potential dangers in the form of race conditions and deadlocks resulting from bad programming design and threads accessing shared memory. Fortunately, there are well known and standard methods for dealing with these problems, i.e., synchronization primitives. But in real world situations, for a variety of reasons, as we shall see, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are not well documented and are not discovered by traditional race detection tools, which look for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations, detailing their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. SyncFinder can be used in conjunction with other data race checkers to improve accuracy and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution; this is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization variables act as barriers to memory that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]] and facilitate coordination between different threads. They come in many forms such as locks, mutexes, semaphores, and condition variables.
Locks control access to data and limit which threads have access when there are multiple threads. Examples include read/write locks, which let multiple readers hold the lock at once but give a writer exclusive access, and latches, which release waiting threads only after a specified number of threads have reached them.
Mutexes are mutually exclusive locks that threads employ to lock a resource that they need. No other thread can access the resource at that point. Once the holder is finished, it releases the lock and the other threads can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to execute only when it is safe to perform its operation.
When misused, synchronization primitives can lead to a host of other problems, such as race conditions and deadlocks.
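
As a small sketch of the mutex behaviour described above, the example below has several threads increment a shared counter while holding a lock. It uses Python's threading.Lock as a stand-in for a mutex; the function name count_with_lock and its parameters are hypothetical. Because each read-modify-write of the counter happens while the lock is held, no increment is lost.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    counter = 0
    lock = threading.Lock()   # plays the role of a mutex

    def add():
        nonlocal counter
        for _ in range(increments):
            with lock:        # only one thread at a time may update the counter
                counter += 1

    threads = [threading.Thread(target=add) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```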

===Race conditions, deadlocks===
Race conditions are unintended side effects of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them has write privilege. One process may then modify the shared data while others are reading it, which results in reads of stale or incorrect data. Race conditions occur during the execution of the program and are often very difficult to detect, because they manipulate data in subtle ways.

Deadlock occurs when two or more processes share resources and each process is waiting on another process to unlock a resource it needs. The waiting forms a circular chain and no process can continue.
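
The circular chain can be made concrete with a sketch that sets up the deadlock pattern without actually hanging. Two threads each take one lock and then try the other; barriers (used here only to make the interleaving deterministic) guarantee that both first locks are held before either second attempt. The attempt is non-blocking so the failure can be observed. All names are hypothetical.

```python
import threading


def circular_wait_demo():
    lock_a, lock_b = threading.Lock(), threading.Lock()
    both_holding = threading.Barrier(2)   # both threads hold their first lock
    both_tried = threading.Barrier(2)     # both threads attempted the second
    got_second = {}

    def task(name, first, second):
        with first:
            both_holding.wait()
            # A blocking acquire here would wait forever: the other thread
            # holds `second` and is itself waiting for `first`.
            got = second.acquire(blocking=False)
            got_second[name] = got
            if got:
                second.release()
            both_tried.wait()             # keep `first` until both have tried

    threads = [
        threading.Thread(target=task, args=("t1", lock_a, lock_b)),
        threading.Thread(target=task, args=("t2", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second
```

Neither thread obtains its second lock. Acquiring the locks in the same order in both threads would break the cycle.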

Both of these issues occur in concurrent programming. Although there are no general solutions for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented by using mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always liable to be present in production code.
===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking of resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to difficulty in finding them.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83, with server programs occupying the higher portion of the interval. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in a concurrent system, programmers will use ad hoc synchronization to impose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

COMP 3000 Essay 2 2010 Question 7

2010-12-03T03:17:13Z

Smcilroy: /* Concurrent Programming */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and multithreaded designs have become increasingly common. Many popular applications today take advantage of the multithreaded approach. However, concurrent programming brings with it a host of potential dangers in the form of race conditions and deadlocks, which result from poor design and from threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. But in real-world code, for a variety of reasons that we discuss below, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are usually not well documented, and they are not discovered by traditional race detection tools, which look for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, their dangers, their impact on bug detection tools and their prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. SyncFinder can be used in conjunction with data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution, which is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives are the basic tools that the underlying system offers a concurrent program to coordinate its threads of execution. They are generally used to synchronize between threads and to protect shared resources[[#Foot8|8]].
Synchronization primitives common to many systems include locks, mutexes, semaphores, and monitors.
Locks are the broadest of these categories, since mutexes, semaphores, and monitors can all be viewed as kinds of locks. There are also read/write locks, which allow any number of readers to hold the lock at once but give a writer exclusive access, and latches, which release waiting threads once a specified number of threads have arrived, which is useful for bringing all threads to a known state. There are many other kinds of locks.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource it needs. While the mutex is held, no other thread can access the resource. Once the thread is finished, it releases the lock and another thread can then lock and access the resource.
Monitors combine a mutex with one or more condition variables. When a thread holding the lock finds that the condition it needs is not yet true, it waits on the condition variable, which releases the lock and blocks the thread until another thread changes the shared state and signals the condition. This allows the original thread to continue executing only when it is safe to perform its operation.
Synchronization primitives can be misused, however, which leads to a host of other problems such as race conditions and deadlocks.
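
As a rough illustration of the monitor pattern described above, the following Python sketch guards a small queue with one lock and two condition variables. The class name <code>BoundedQueue</code> and the helper <code>run_demo</code> are made up for this example and are not taken from the paper, which studies C and C++ programs.

```python
import threading
from collections import deque


class BoundedQueue:
    """A small monitor: one lock plus two condition variables guarding a deque."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        # Both conditions share the same lock, so they protect the same state.
        self._not_empty = threading.Condition(self._lock)
        self._not_full = threading.Condition(self._lock)

    def put(self, item):
        with self._not_full:
            # wait() releases the lock and blocks until another thread notifies.
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(n=100):
    """Send n integers from the main thread to a consumer thread."""
    queue = BoundedQueue(capacity=4)
    received = []

    def consumer():
        for _ in range(n):
            received.append(queue.get())

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n):
        queue.put(i)
    worker.join()
    return received
```

The condition is always re-checked in a <code>while</code> loop after waking up, because another thread may have changed the queue again before the woken thread reacquires the lock.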
===Race conditions, deadlocks===
Race conditions are an unintended side effect of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them can write to it. A process may then modify the shared data while others are reading it, so that readers see stale or incorrect data. Race conditions only show up during execution, depend on timing, and often corrupt data in subtle ways, which makes them very difficult to detect.
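
To make the stale-read problem concrete, here is a small Python sketch. The first function replays one bad interleaving of two unsynchronized increments by hand, so the lost update is reproducible. The second shows the same increments protected by a mutex. Both function names are invented for this illustration.

```python
import threading


def lost_update_interleaving():
    """Replay, step by step, one interleaving in which an update is lost."""
    counter = 0
    read_a = counter      # thread A reads 0
    read_b = counter      # thread B reads 0 before A has written back
    counter = read_a + 1  # A writes 1
    counter = read_b + 1  # B also writes 1, overwriting A's update
    return counter


def locked_increments(threads=4, per_thread=1000):
    """Increment a shared counter from several threads under a mutex."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(per_thread):
            with lock:  # only one thread at a time may read-modify-write
                counter += 1

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter
```

Two increments were performed in the first function but the counter only reaches 1. With the lock, every increment is preserved.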

Deadlock occurs when two or more processes each hold a resource while waiting for another process to unlock a resource that it holds. The waits form a circular chain, and no process can continue.
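
The circular chain can be pictured as a cycle in a "wait-for" graph. The sketch below is a simplified illustration that assumes each process waits on at most one other process. The function name <code>find_deadlock</code> and the process names in the usage note are hypothetical.

```python
def find_deadlock(wait_for):
    """Return the processes on a wait-for cycle, or None if there is none.

    wait_for maps each process to the single process it is waiting on
    (or to None if it is not waiting).
    """
    for start in wait_for:
        path = []
        seen = set()
        node = start
        # Follow the chain of waits until it ends or revisits a process.
        while node is not None and node not in seen:
            seen.add(node)
            path.append(node)
            node = wait_for.get(node)
        if node is not None:
            # node was already on the path: everything from it onward is a cycle.
            return path[path.index(node):]
    return None
```

For example, <code>find_deadlock({"A": "B", "B": "A"})</code> reports the cycle between A and B, while a chain that ends at a process that is not waiting yields <code>None</code>.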

Both of these issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented with mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always a risk in production code.
===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking of resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to difficulty in finding them.
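
A minimal sketch of a sync loop, written in Python for brevity (the programs in the paper are C and C++). The first function waits by polling a shared flag, which plays the role of the sync variable. The second achieves the same ordering with a standard primitive, <code>threading.Event</code>. All names here are invented for the illustration.

```python
import threading
import time


def ad_hoc_wait():
    """Ad hoc synchronization: the waiter polls a shared flag in a loop."""
    state = {"ready": False, "data": None}
    result = []

    def waiter():
        # Sync loop: exits only when another thread changes the sync variable.
        while not state["ready"]:
            time.sleep(0)  # give other threads a chance to run
        result.append(state["data"])

    thread = threading.Thread(target=waiter)
    thread.start()
    state["data"] = 42
    state["ready"] = True  # remote write that releases the sync loop
    thread.join()
    return result[0]


def event_wait():
    """The same ordering expressed with a standard primitive."""
    ready = threading.Event()
    box = {}
    result = []

    def waiter():
        ready.wait()  # blocks without spinning until set() is called
        result.append(box["data"])

    thread = threading.Thread(target=waiter)
    thread.start()
    box["data"] = 42
    ready.set()
    thread.join()
    return result[0]
```

Nothing in the first version tells a reader or a tool that the <code>while</code> loop is a synchronization point, whereas the call to <code>ready.wait()</code> in the second version is recognizable as one.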

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83 per program, with server programs at the higher end of the range. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in the case of a concurrent system, programmers will use ad hoc synchronization to superimpose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to handle ad hoc synchronizations correctly. In fact, these tools often either miss the bugs they cause or report false positives, because the tool is unaware of the "workarounds" put into effect by the ad hoc synchronization. This severely impacts the effectiveness of such tools.
This also impacts performance analysis. Synchronization is quite costly, and if a tool cannot recognize the form of synchronization in use, it generates an inaccurate report without the programmer being aware of it. This may lead the programmer to poor decisions simply because ad hoc synchronizations are hard to identify.

5. The reason ad hoc synchronizations are hard to identify is that there is no single way of implementing them. The implementations are quite diverse, so they cannot be identified from just a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, which are variables shared with other tasks.
* In some cases, the synchronization does not wait idly but performs other computations while checking the sync variable periodically.

Despite the dangers of using ad hoc synchronization, programmers continue to use this method. In some code comments, programmers even state that their implementations may be unsafe, yet proceed to use ad hoc synchronization anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with the synchronization primitives already present in standard POSIX thread libraries. However, it is often difficult to make this replacement, and doing so may not address the concerns presented in point 1.

===SyncFinder===
SyncFinder is a tool built and designed by the authors of the paper for the purpose of identifying and annotating instances of ad hoc synchronization in concurrent programs built in C or C++. The main goal of this was to aid programmers in better structuring their code, while simultaneously allowing for other tools to be utilized, recognizing them as synchronizations. It has demonstrated itself to be very effective in this area where other similar tools have failed, as it analyzes the code in a unique way that specifically tracks down sync loops that implement ad hoc synchronization.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are all implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is therefore to find every loop in the program: the LLVM infrastructure is used to obtain the loop information from the source, including a representation of the exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. It does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a shared variable used to synchronize concurrent threads. Therefore, if any EDV of a loop is a sync variable, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder goes through all loop candidates and prunes out any that call cond_wait inside the loop.
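
The three filtering steps above can be sketched as follows. This is a toy model, not SyncFinder's implementation: the real tool works on LLVM's representation of the program, whereas here each loop is summarized by hand as a hypothetical <code>Loop</code> record, and a variable is treated as a sync variable simply if it appears in an assumed set of shared variables.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    """Hand-written summary of one loop (hypothetical data model)."""
    name: str
    exit_vars: set                            # exit dependent variables (EDVs)
    calls: set = field(default_factory=set)   # functions called in the body


def is_sync_loop(loop, shared_vars):
    """Apply the condvar and computational-loop pruning rules to one loop."""
    # Condvar loops already use a standard primitive, so they are pruned.
    if "cond_wait" in loop.calls:
        return False
    # A loop with no shared EDV has no sync condition: computational loop.
    return bool(loop.exit_vars & shared_vars)


def find_sync_loops(loops, shared_vars):
    """Return the names of the loops that survive pruning."""
    return [loop.name for loop in loops if is_sync_loop(loop, shared_vars)]
```

For instance, a loop whose only exit variable is a shared flag is kept, a counting loop over local variables is pruned as computational, and a loop that calls <code>cond_wait</code> is pruned as a condvar loop.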

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
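
A rough sketch of the value-based part of this pairing step is shown below. It assumes every candidate write assigns a known constant and that the exit condition can be evaluated as a predicate on that value. The function name, the tuple layout and the source locations in the usage note are all hypothetical, and the pruning of pairs that cannot execute concurrently is left out.

```python
def pair_writes(exit_condition, sync_var, writes):
    """Return the locations of writes that could release a sync loop.

    writes is a list of (location, variable, value) constant assignments.
    A write is kept only if it targets the sync variable and the value
    it assigns satisfies the loop's exit condition.
    """
    return [
        location
        for location, variable, value in writes
        if variable == sync_var and exit_condition(value)
    ]
```

For a loop of the form <code>while (flag == 0)</code>, the exit condition is <code>flag != 0</code>. Given writes of 0 and 1 to <code>flag</code> and an unrelated write to another variable, only the write of 1 is paired with the loop.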

4. SyncFinder Annotation

After the initial set of loops is culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. SyncFinder marks the source code using LLVM’s static instrumentation framework, which allows other tools to take advantage of its findings in their own analysis.

====Uses====
SyncFinder is a robust tool that can be utilized in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it is capable of identifying sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition to this, the authors of the paper were able to expand upon the existing data race detector “Valgrind” in order to take advantage of the annotation system introduced by SyncFinder. Through this, they were able to reduce the number of false positives flagged by the former, while being able to make use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs that are used across a broad cross-section of applications. In testing, SyncFinder was able to positively identify 96% of ad hoc synchronizations within the tested programs. The false positive rate was only 6%. In further tests, the authors were able to use SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues within Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other tools target bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They are not looking for ad hoc synchronizations.

A similar tool to SyncFinder exists that can detect simple spinning, also an ad hoc synchronization[[#Foot5|5]], but it only detects simple spinning and not the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been composed. This paper complements these studies to better understand the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper. The authors list their findings from the study in the contribution section, and then essentially repeat those findings in the section that covers the characteristics of ad hoc synchronizations. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of the leading concurrent open-source software programs[[#Foot10|10]] to base their study on. They were chosen to represent different kinds of applications: server, desktop and scientific. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. They were both experienced with the code base, but mistakes could have been made. General conclusions would be hard to draw from the limited data set, but the study gives indicators of the characteristics of ad hoc synchronizations and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has the benefit of a high degree of success, finding on average 96% of ad hoc synchronizations. It can also be used to extend other data race and bad practice detection tools; for example, it reduced the false positives of Valgrind's data race checker by 43%-86%. SyncFinder fills a niche where other tools have failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives, which are due to missing source code for library functions and to incorrect pointer aliasing. A programmer can, however, examine the reported ad hoc synchronizations to review them for false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the programmers of those applications, who have access to it.
SyncFinder uses the static approach to finding ad hoc synchronizations by analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but would carry a heavier computational load and would require a thorough run through of all possible test cases.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic and should be avoided. The basis of their study is well supported with diverse programs, but could always have used more.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters existing tools' efforts at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, 2004. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2010. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin , 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-12-03T03:11:06Z

Smcilroy: /* Critique */

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and multithreaded designs have become increasingly common. Many popular applications today take advantage of the multithreaded approach. However, concurrent programming brings with it a host of potential dangers in the form of race conditions and deadlocks, which result from poor design and from threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. But in real-world code, for a variety of reasons that we discuss below, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are usually not well documented, and they are not discovered by traditional race detection tools, which look for standard synchronization primitives.

The paper we are discussing addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, their dangers, their impact on bug detection tools and their prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. SyncFinder can be used in conjunction with data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly in multi-core processor systems and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution, which is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives are the basic tools that the underlying system offers a concurrent program to coordinate its threads of execution. They are generally used to synchronize between threads and to protect shared resources[[#Foot8|8]].
Synchronization primitives common to many systems include locks, mutexes, semaphores, and monitors.
Locks are the broadest of these categories, since mutexes, semaphores, and monitors can all be viewed as kinds of locks. There are also read/write locks, which allow any number of readers to hold the lock at once but give a writer exclusive access, and latches, which release waiting threads once a specified number of threads have arrived, which is useful for bringing all threads to a known state. There are many other kinds of locks.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource it needs. While the mutex is held, no other thread can access the resource. Once the thread is finished, it releases the lock and another thread can then lock and access the resource.
Monitors combine a mutex with one or more condition variables. When a thread holding the lock finds that the condition it needs is not yet true, it waits on the condition variable, which releases the lock and blocks the thread until another thread changes the shared state and signals the condition. This allows the original thread to continue executing only when it is safe to perform its operation.
Synchronization primitives can be misused, however, which leads to a host of other problems such as race conditions and deadlocks.
===Race conditions, deadlocks===
Race conditions are an unintended side effect of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them can write to it. A process may then modify the shared data while others are reading it, so that readers see stale or incorrect data. Race conditions only show up during execution, depend on timing, and often corrupt data in subtle ways, which makes them very difficult to detect.

Deadlock occurs when two or more processes each hold a resource while waiting for another process to unlock a resource that it holds. The waits form a circular chain, and no process can continue.

Both of these issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented with mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always a risk in production code.
===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking of resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to difficulty in finding them.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83 per program, with server programs at the higher end of the range. It is likely that programmers use this type of synchronization for two reasons.
* In order to ensure a certain order of execution in the case of a concurrent system, programmers will use ad hoc synchronization to superimpose this order. With traditional synchronization techniques, this can vary between systems. As the order can vary, it is difficult to create a common interface.
* Some synchronization techniques introduce heavy-weight synchronization primitives. As such, programmers will use ad hoc synchronizations to avoid this and supposedly protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. They are hard to distinguish from other computational loops and as the implementations are diverse, it is hard to pinpoint them from the code. This makes the system hard to maintain, as other programmers will not be able to identify ad hoc loops implemented by another and debugging programs cannot recognize them as issues.

3. It was found that ad hoc synchronizations often introduce bugs into the system such as deadlocks or hangs. As these are different than those caused by locks and other synchronizations it is hard for detection tools to recognize them if they were not first identified either manually or automatically.

4. As they are not easily recognizable, it is hard for bug detection tools to handle ad hoc synchronizations correctly. In fact, these tools often either miss the bugs they cause or report false positives, because the tool is unaware of the "workarounds" put into effect by the ad hoc synchronization. This severely impacts the effectiveness of such tools.
This also impacts performance analysis. Synchronization is quite costly, and if a tool cannot recognize the form of synchronization in use, it generates an inaccurate report without the programmer being aware of it. This may lead the programmer to poor decisions simply because ad hoc synchronizations are hard to identify.

5. The reason ad hoc synchronizations are hard to identify is that there is no single way of implementing them. The implementations are quite diverse, so they cannot be identified from just a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, variables that are shared with other tasks
* In some cases, the synchronization does not wait idly and rather performs other computations while checking the sync variable periodically
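The characteristics above can be illustrated with a small sketch. It is written in Python rather than the C or C++ studied in the paper, and the names <code>state</code>, <code>producer</code> and <code>consumer</code> are hypothetical; it only shows the general shape of a sync loop, not code from any of the studied programs.

```python
import threading
import time

# Hypothetical shared state: "done" plays the role of the sync variable.
state = {"done": False, "result": None, "polls": 0}

def producer():
    # Remote thread: does some work, then performs the write that
    # satisfies the consumer's exit condition.
    time.sleep(0.01)
    state["result"] = 42
    state["done"] = True

def consumer():
    # Ad hoc sync loop: the exit condition depends on a variable that
    # is written by another thread. The loop does not wait idly; it
    # does a little other work (counting polls) between checks.
    while not state["done"]:
        state["polls"] += 1
        time.sleep(0.001)
    return state["result"]

worker = threading.Thread(target=producer)
worker.start()
value = consumer()
worker.join()
```

Nothing in the loop itself marks it as synchronization, which is why a reader or a tool can easily mistake it for an ordinary computational loop.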

Despite the dangers of ad hoc synchronization, programmers continue to use it. In source comments, programmers have even acknowledged that their implementations may be unsafe, yet they proceed to use ad hoc synchronization anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with the synchronization primitives already present in standard POSIX thread libraries. However, this replacement is often difficult, and it may not address the concerns presented in point 1.
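As a rough illustration of what such a replacement looks like, the sketch below waits on a condition variable instead of polling a flag. It uses Python's <code>threading.Condition</code> as a stand-in for a POSIX condition variable; the names <code>shared</code>, <code>producer</code> and <code>consumer</code> are hypothetical and the example is not taken from the paper.

```python
import threading

cond = threading.Condition()
shared = {"ready": False, "result": None}

def producer():
    with cond:
        shared["result"] = 42
        shared["ready"] = True
        # Wake every thread blocked on the condition.
        cond.notify_all()

def consumer(out):
    with cond:
        # wait_for releases the lock while blocked and rechecks the
        # predicate each time the thread is woken.
        cond.wait_for(lambda: shared["ready"])
        out.append(shared["result"])

out = []
c = threading.Thread(target=consumer, args=(out,))
p = threading.Thread(target=producer)
c.start()
p.start()
c.join()
p.join()
```

Because the wait goes through a standard primitive, a tool that understands that primitive can see the ordering between the two threads.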

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goals are to help programmers structure their code better and to let other tools recognize the annotated loops as synchronizations. It has proven very effective where other similar tools have failed, because its analysis specifically tracks down the sync loops that implement ad hoc synchronization.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are all implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to use the LLVM infrastructure to obtain loop information from the source, including a representation of each loop's exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. It does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable related to the synchronization of concurrent programs. Therefore, by identifying any EDVs as sync variables, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder goes through all loop candidates and prunes out any that call cond_wait inside the loop.
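The three steps above can be sketched on a toy model. This is not SyncFinder's implementation, which works on LLVM's representation of the program. Here each loop is reduced to a hypothetical <code>Loop</code> record, and a sync variable is assumed to be any exit dependent variable that another thread writes.

```python
from dataclasses import dataclass, field

@dataclass
class Loop:
    name: str
    edvs: set                                # exit dependent variables
    calls: set = field(default_factory=set)  # functions called in the loop body

def find_sync_loops(loops, remotely_written):
    """Return (loop name, sync variables) for each loop kept as a sync loop."""
    kept = []
    for loop in loops:
        # EDV analysis: which exit dependent variables are sync variables?
        sync_vars = loop.edvs & remotely_written
        if not sync_vars:
            continue  # no sync condition: pruned as a computational loop
        if "cond_wait" in loop.calls:
            continue  # pruned as a condvar loop
        kept.append((loop.name, sorted(sync_vars)))
    return kept

loops = [
    Loop("sum_array", {"i", "n"}),
    Loop("wait_for_flag", {"done"}),
    Loop("condvar_wait", {"ready"}, {"cond_wait"}),
]
found = find_sync_loops(loops, remotely_written={"done", "ready"})
```

In this toy input only <code>wait_for_flag</code> survives: <code>sum_array</code> has no sync variable among its EDVs, and <code>condvar_wait</code> calls cond_wait.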

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.
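A minimal sketch of the pairing step follows, again on a toy model rather than on LLVM instructions. The function <code>pair_writes</code>, the write-site labels and the <code>never_concurrent</code> set are all hypothetical. To mirror the conservative pruning described above, a pairing is dropped only when it is known that the loop and the write cannot run concurrently.

```python
def pair_writes(exit_conditions, writes, never_concurrent):
    """Pair each sync loop with the remote writes that can release it.

    exit_conditions: {loop name: (sync variable, predicate on a written value)}
    writes: list of (write site, variable, constant value written)
    never_concurrent: set of (loop name, write site) pairs proven unable
                      to execute concurrently
    """
    pairs = []
    for loop, (var, exits_when) in exit_conditions.items():
        for site, written_var, value in writes:
            if written_var != var:
                continue  # write does not modify the sync variable
            if not exits_when(value):
                continue  # written value would not satisfy the exit condition
            if (loop, site) in never_concurrent:
                continue  # conservative: prune only when proven non-concurrent
            pairs.append((loop, site))
    return pairs

# The loop "wait_for_flag" is assumed to spin while `done` is false.
exit_conditions = {"wait_for_flag": ("done", lambda v: v is True)}
writes = [
    ("init:3", "done", False),
    ("worker:17", "done", True),
    ("cleanup:40", "done", True),
    ("other:9", "count", 1),
]
never_concurrent = {("wait_for_flag", "cleanup:40")}
pairs = pair_writes(exit_conditions, writes, never_concurrent)
```

Here only the write at <code>worker:17</code> is paired with the loop: the write at <code>init:3</code> stores a value that keeps the loop spinning, and <code>cleanup:40</code> is assumed never to run alongside it.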

4. SyncFinder Annotation

After the initial set of loops is culled through the above process, the remaining loops are determined to be sync loops and are annotated accordingly. The source code is marked using LLVM’s static instrumentation framework, which allows other tools to take advantage of SyncFinder’s findings in their own analysis.

====Uses====
SyncFinder is a robust tool that can be utilized in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it is capable of identifying sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition to this, the authors of the paper were able to expand upon the existing data race detector “Valgrind” in order to take advantage of the annotation system introduced by SyncFinder. Through this, they were able to reduce the number of false positives flagged by the former, while being able to make use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs used across a broad cross-section of applications. In testing, SyncFinder identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues in Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other work targets bugs in programs that are free of data races but still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They are not looking for ad hoc synchronizations.

A tool similar to SyncFinder exists that can detect simple spinning[[#Foot5|5]], itself a form of ad hoc synchronization, but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been composed. This paper complements these studies to better understand the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
===Style===
There is some unnecessary repetition between two sections of the paper. The authors list their findings from the study in the contribution section, and then essentially repeat those findings in the section that covers the characteristics of ad hoc synchronizations. The two sections could have been combined.

===Evaluation===
The authors of the paper chose a mix of leading concurrent open-source programs[[#Foot10|10]] on which to base their study. The programs were chosen to represent server, desktop and scientific applications. The number of ad hoc synchronizations was determined by two authors who reviewed the source code themselves. Both were experienced with the code bases, but mistakes could have been made. General conclusions are hard to draw from such a limited data set, but the study gives indications of the characteristics of ad hoc synchronizations and their effects, based on evidence from the software tested.

SyncFinder, the tool that the authors created, has a high degree of success, finding 96% of ad hoc synchronizations on average. Its output can also feed other data race and bad practice detectors: for example, it reduced the false positives of Valgrind's data race checker by 43%-86%. SyncFinder fills a niche where other tools have failed to detect ad hoc synchronizations. On the downside, SyncFinder produces 6% false positives, which are due to missing source code for library functions and incorrect pointer alias information. A programmer can, however, examine the reported ad hoc synchronizations and weed out the false positives. SyncFinder also requires the source code of the application being tested, but it was designed for the application's own programmers, who have that access.
SyncFinder takes a static approach, finding ad hoc synchronizations by analyzing source code. A dynamic approach that uses run-time traces would be more accurate, but would carry a heavier computational load and would require a thorough run through of all possible test cases.

===Conclusion===
The paper's extensive examination of the previously unstudied ad hoc synchronizations concludes that they are prevalent in today's concurrent software, problematic and should be avoided. The basis of their study is well supported with diverse programs, but could always have used more.

SyncFinder is an effective tool for discovering ad hoc synchronizations, with a high success rate and minimal requirements, and it bolsters existing tools' efforts at detecting bugs.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, n.d. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2010. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin , 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

Talk:COMP 3000 Essay 2 2010 Question 7

2010-12-02T23:10:07Z

Smcilroy:

'''Attendence. Please mark your name to say you're here'''

Stephany Lay --[[User:Slay|Slay]] 19:45, 15 November 2010 (UTC)

--[[User:Asoknack|Asoknack]] 16:00, 15 November 2010 (UTC)

--[[User:Smcilroy|Smcilroy]] 18:43, 15 November 2010 (UTC)

Lester Mundt
--[[User:Lmundt|Lmundt]] 18:58, 15 November 2010 (UTC)

Thomas McMahon
--[[User:cha0s|cha0s]]

Martin Kugler
--[[User:Mkugler|Mkugler]] 02:42, 18 November 2010 (UTC)

So only 2 days left before it's due! I'd like to hear what others are planning on contributing and when, since only myself and Slay have done any work on the actual essay. It would be nice to hear from you guys, even if its to say that your busy and are going to work on it on Thursday etc. etc.--[[User:Smcilroy|Smcilroy]] 22:05, 23 November 2010 (UTC)

So, what's the plan for this? How do we want to do this? Given the duedate looming in 4 days we should probably get talking about it. What are your thoughts?
--[[User:Mkugler|Mkugler]] 23:00, 21 November 2010 (UTC)
::Hey! So I added an intro and it covers the research problem to. I also added a bare bones skeleton of what we'll probably end up talking about. Personally, I think people should choose a couple of subsections and sections that they will work on and just put your name beside it. It seems to work best as people then have a sense of ownership and responsibility to that particular topic and aren't overwhelmed at editing the entire essay, at least for the beginning. That doesn't mean we can't all edit each others work though! I welcome any changes to the intro. Anyways, that's my 2 cents. --[[User:Smcilroy|Smcilroy]] 06:06, 22 November 2010 (UTC)
::As critique is our own opinion, we should discuss our thoughts. Having only one person writing that would not express everyone's opinions. --[[User:Slay|Slay]] 20:48, 22 November 2010 (UTC)

Hey, gonna be fleshing out the Contribution section, probably the stuff under the SyncFinder heading. Just wanted to give people a head's up so we don't waste time all doing the same thing.
--[[User:Mkugler|Mkugler]] 03:09, 24 November 2010 (UTC)

I will do the Findings section for now then. I'm unsure when I'll get to it though. I'll try for this weekend. --[[User:Slay|Slay]] 00:14, 26 November 2010 (UTC)

Not much activity going on. Please at least claim a section to work on so we know it will get covered. I'll continue to work on my part when I get the time. --[[User:Slay|Slay]] 00:28, 28 November 2010 (UTC)

I agree we should all voice our opinions on the critique section everyone could add there own opinion and someone could synthesize it later. --[[User:Lmundt|Lmundt]] 18:06, 28 November 2010 (UTC)

Yo, I'll be doing some contributions tomorrow, adding to the sync finder and critique section. If you guys want to meet up at any point during the week let me know, I'm good for whenever.
--[[User:cha0s|cha0s]] 1:12, 29 November 2010 (EDT)

I'm already working on the SyncFinder section. I'm gonna try to finish it up tonight.
--[[User:Mkugler|Mkugler]] 02:50, 30 November 2010 (UTC)

Yo, I'll be waking up at 11pm, so i'll be adding stuff at midnight. I'll be able to edit til he locks the page so I'll be able to finalize the paper and make sure everything is editted and flows.
--[[User:cha0s|cha0s]] 11:46, 2 December 2010 (EDT)

I'll be working on the Critique section now and add a few more points.
--[[User:Smcilroy|Smcilroy]] 23:10, 2 December 2010 (UTC)

COMP 3000 Essay 2 2010 Question 7

2010-12-02T01:27:09Z

Smcilroy:

==Paper==
===[http://homeostasis.scs.carleton.ca/wiki/index.php/COMP_3000_Essay_2_2010_Question_7 Ad Hoc Synchronization Considered Harmful]===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel

==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and the use of multithreaded designs has increased to keep up with this growing trend. Multithreaded applications can be found in a variety of popular applications today as they take advantage of the multithreaded approach. However, the concepts behind concurrent programming bring with them a host of potential dangers in the form of race conditions and deadlocks resulting from bad programming design and threads accessing shared memory. Fortunately, there are well known and standard methods for dealing with these problems, i.e synchronization primitives. But in real world situations, due to a variety of reasons, as we shall see, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are not well documented and are not discovered by traditional tools for race conditions that look for standard synchronization primitives.

The paper we are discussing addresses these concerns in two regards, first it details a thorough study of ad hoc synchronizations. It details their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Secondly, it introduces SyncFinder, a program that detects all ad hoc synchronizations and automatically annotates the source code where ad hoc sychnronizations are found. This can see use in conjunction with other data race checkers to improve accuracy and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==

===Concurrent Programming===
Concurrent programming is a style of programming where multiple threads of execution run concurrently to perform a single task. The threads of execution share a number of resources. Particularly on multi-core processors and in distributed environments, this style of programming can result in significant performance gains. One significant challenge of concurrent programming is coordinating the different threads of execution, which is usually done using synchronization primitives.

===Synchronization Primitives===
Synchronization primitives are some of the basic tools offered by the system hosting the concurrent program to facilitate coordination between threads of execution. They are generally used to synchronize threads and to protect shared resources[[#Foot8|8]].
Some types of synchronization primitives common to many systems are locks, mutexes, semaphores, and monitors.
"Lock" is really an umbrella term, since mutexes, semaphores, and monitors are all kinds of locks. Additionally, there are read/write locks, which let multiple readers hold the lock at once but give a writer exclusive access. Latches are a type of lock that releases once a specified number of threads have reached it, which is very useful for bringing all threads to a known state. There are a myriad of other locks.
Mutexes are mutually exclusive locks that a thread uses to lock a resource it needs, preventing other threads from accessing the resource in the meantime. Once the thread is finished, it releases the lock, and another thread can then lock and access the resource.
Monitors are a type of mutex that contains condition variables. When the condition a thread needs is not yet true, waiting on the condition variable releases the lock and blocks the thread until another thread changes the relevant state and signals the condition. This allows the original thread to continue executing only when it is safe to perform its operation.
Synchronization primitives can be misused, which leads to a host of other problems such as race conditions and deadlocks.
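A minimal sketch of a mutex protecting a shared resource is given below. It uses Python's <code>threading.Lock</code> as the mutex; the names <code>shared</code>, <code>counter_lock</code> and <code>add_many</code> are hypothetical and chosen only for illustration.

```python
import threading

shared = {"counter": 0}          # the shared resource
counter_lock = threading.Lock()  # the mutex protecting it

def add_many(n):
    for _ in range(n):
        # Only one thread at a time can hold the lock, so the
        # read-modify-write on the counter is never interleaved.
        with counter_lock:
            shared["counter"] += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around every update, the final count is the sum of all increments regardless of how the threads are scheduled.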
===Race conditions, deadlocks===
Race conditions are an unintended side-effect of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them has write privilege. One process may then modify the shared data while others are reading it, which results in reads of stale or incorrect data. Race conditions arise during the execution of the program and are often very difficult to detect, as they may corrupt data in subtle ways.

Deadlock occurs when two or more processes share resources and each process is waiting on another to unlock a resource. The waiting forms a circular chain and no process can continue.
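The circular chain can be made concrete with a small sketch. It assumes a simplified model in which each process waits on at most one other process, recorded in a hypothetical <code>waits_for</code> mapping; a deadlock then shows up as a cycle in that mapping.

```python
def find_deadlock(waits_for):
    """Return the processes forming a circular wait, or None.

    waits_for maps each process to the process it is waiting on,
    or to None if it is not waiting.
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of waits until it ends or revisits a process.
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:
            # The chain came back to a process already visited: a cycle.
            return seen[seen.index(current):]
    return None

deadlocked = find_deadlock({"A": "B", "B": "C", "C": "A", "D": None})
healthy = find_deadlock({"A": "B", "B": None})
```

In the first mapping A waits on B, B on C and C on A, so none of the three can proceed; in the second, B is free to finish and release A.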

Both of these issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented using mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always liable to be present in production code.
===Ad Hoc Synchronization===
Ad hoc synchronizations are loops called sync loops that continue until certain conditions are met via outside variables called sync variables. They are designed to control the flow of thread execution much like locking and unlocking of resources. There can be multiple sync variables in a sync loop and they can have multiple exit conditions and dependencies. The diversity of the sync loops, their dependencies and execution paths leads to difficulty in finding them.

==Contribution==

With concurrent programming commonly used in modern applications, we face many issues that result from having simultaneous execution. In order to maintain a concurrent system, synchronization is required to ensure that the executing tasks do not interfere with each other avoiding potential race conditions. However, many programmers do not use proper synchronization primitives to deal with these issues. Rather, they implement synchronizations in an ad hoc fashion. The paper we are discussing shows that ad hoc synchronizations, though implemented as a solution to concurrency issues, are indeed undesirable in a system. This paper details the characteristics of ad hoc synchronizations and the issues associated with this programming construct and introduces the program, SyncFinder, which is used to identify such synchronizations in code.

===Findings===

In order to identify the characteristics of ad hoc synchronizations, 12 mainstream programs were examined to find instances of ad hoc synchronizations. These programs were either of server, desktop or scientific type, including Apache, MySQL and Mozilla. Through manual inspection of the source code, these characteristics of ad hoc synchronizations were found.

1. Every program studied contained numerous ad hoc synchronizations. The number found ranged from 6 to 83, with server programs at the higher end of the range. It is likely that programmers use this type of synchronization for two reasons.
* Ad hoc synchronization lets programmers impose a particular order of execution on concurrent threads. The ordering required varies from system to system, so it is difficult to capture every case behind a common interface.
* Some standard synchronization primitives are heavy-weight. Programmers use ad hoc synchronizations to avoid them and, supposedly, to protect performance.

2. Often, it is very hard to identify an ad hoc loop as a synchronization method. Such loops are hard to distinguish from ordinary computational loops and, because their implementations are diverse, hard to pinpoint in the code. This makes the system hard to maintain: other programmers may not recognize an ad hoc loop written by someone else, and debugging tools cannot flag it as an issue.

3. It was found that ad hoc synchronizations often introduce bugs such as deadlocks or hangs. Because these bugs differ from those caused by locks and other standard synchronizations, detection tools struggle to recognize them unless the ad hoc synchronizations have first been identified, either manually or automatically.

4. Because ad hoc synchronizations are not easily recognizable, bug detection tools handle them poorly. Such tools often either miss the issues that ad hoc synchronizations cause or report false positives, because the tool is unaware of the "work arounds" put into effect by the ad hoc synchronization. This severely impacts the effectiveness of such tools.
This also impacts performance analysis. Synchronization is quite costly, and if a tool cannot recognize a form of synchronization, it generates a misleading report without the programmer being aware of it. The programmer may then make poor decisions simply because ad hoc synchronizations are hard to identify.

5. Ad hoc synchronizations are hard to identify because there is no single way of implementing them. Their implementations are quite diverse, so they cannot be identified from just a few criteria. Some typical characteristics of an ad hoc synchronization follow.
* These loops can contain one or multiple exit conditions. Some or all of these exit conditions may be satisfied by remote threads while others may be satisfied locally.
* Often, exit conditions depend on sync variables, which are variables shared with other tasks.
* In some cases, the synchronization does not wait idly but performs other computations while checking the sync variable periodically.

Despite the dangers of ad hoc synchronization, programmers continue to use it. In source comments, programmers have even acknowledged that their implementations may be unsafe, yet they proceed to use ad hoc synchronization anyway. The reasoning behind these decisions has already been outlined in point 1. A better practice would be to replace ad hoc synchronizations with the synchronization primitives already present in standard POSIX thread libraries. However, this replacement is often difficult, and it may not address the concerns presented in point 1.

===SyncFinder===
SyncFinder is a tool designed and built by the authors of the paper to identify and annotate instances of ad hoc synchronization in concurrent programs written in C or C++. Its main goals are to help programmers structure their code better and to let other tools recognize the annotated loops as synchronizations. It has proven very effective where other similar tools have failed, because its analysis specifically tracks down the sync loops that implement ad hoc synchronization.

====How it works====
There are two possibilities to consider when searching for ad hoc synchronizations. You can either analyze runtime traces via a dynamic method, or analyze the source code in a static method. Both methods carry with them a number of pros and cons. While a dynamic process is generally more accurate than a static method, it tends to accrue a very large runtime overhead. In addition to this, the dynamic method is somewhat limited in which ad hoc synchronizations it can find by the code coverage of the test cases. Taking these factors into consideration, the authors of the paper opted to pursue a static solution for achieving the goals set out for SyncFinder. SyncFinder uses the LLVM compiler infrastructure.

1. Find Loops

An important commonality between all ad hoc synchronizations is that they are all implemented with loops, be they “for”, “while” or “goto” loops. These are generally referred to as "sync loops". The first step in identifying sync loops is to use the LLVM infrastructure to obtain loop information from the source, including a representation of each loop's exit conditions.

2. Identify Sync Loops

The next and most important step is to differentiate between sync loops used for ad hoc synchronization and regular computational loops. It does this by going through the following steps:
* Exit Dependent Variable (EDV) Analysis: EDVs are variables that affect the exit conditions of a loop. A sync variable is a variable related to the synchronization of concurrent programs. Therefore, by identifying any EDVs as sync variables, it can be concluded that the loop is a sync loop.
* Pruning Computational Loops: If a loop has at least one sync condition, it is considered a sync loop. Otherwise, it is pruned out as a computational loop.
* Pruning Condvar Loops: condvar loops are not considered sync loops. SyncFinder goes through all loop candidates and prunes out any that call cond_wait inside the loop.

3. Synchronization Pairing

The next step is to find the remote update that would release the sync loop. SyncFinder first finds all write instructions that would modify the sync variables. It then decides if the value that the write assigns to the sync variable would satisfy the exit condition. All those that do not are pruned. SyncFinder also prunes pairings that do not execute concurrently. This is done conservatively due to the limitations of static analysis.

4. SyncFinder Annotation

Finally, SyncFinder annotates these ad hoc synchronizations in a specific way so that other tools are able to find them.

====Uses====
SyncFinder is a robust tool that can be utilized in a variety of applications such as bug detection, performance profiling and concurrency testing. Using its auto-annotation feature, it is capable of identifying sections of code that demonstrate bad programming practices, which could in turn cause issues such as deadlocks. In addition to this, the authors of the paper were able to expand upon the existing data race detector “Valgrind” in order to take advantage of the annotation system introduced by SyncFinder. Through this, they were able to reduce the number of false positives flagged by the former, while being able to make use of the information it provides.

====Accuracy====
SyncFinder was tested against 25 concurrent programs used across a broad cross-section of applications. In testing, SyncFinder identified 96% of the ad hoc synchronizations within the tested programs, with a false positive rate of only 6%. In further tests, the authors used SyncFinder’s auto-annotation to locate and mark 5 deadlocks and 16 potential issues in Apache, MySQL and Mozilla that had previously been missed by other analysis tools.

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free form of synchronization that does not require mutexes and avoids explicit lock and unlock operations. Other work targets bugs in programs that are free of data races but still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They are not looking for ad hoc synchronizations.

A tool similar to SyncFinder exists that can detect simple spinning[[#Foot5|5]], itself a form of ad hoc synchronization, but it does not detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been composed. This paper complements these studies to better understand the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
1. Uses a static approach; a dynamic approach would be better
* As stated in the paper, a dynamic approach introduces run-time overhead and is not guaranteed to find a sync loop that the test cases never execute
* A dynamic approach would potentially make it language agnostic

2. Not entirely accurate; some false positives

3. In terms of style, lots of unnecessary repetition

This paper successfully identifies a new type of programming construct, ad hoc synchronization. The paper then refers to a body of data, created by the authors, that both supports the criteria used to identify the construct and illustrates its frequency and its likelihood of introducing bugs. Prior to SyncFinder, debugging tools for the most part failed to detect ad hoc synchronizations effectively. Because of this, SyncFinder has been shown to be an important tool for the future development of software.

As far as shortcomings go, it has a 96% success rate, with 6% false positives. The 6% false positive rate is more or less unavoidable, so it is hard to fault the tool for that, especially with such a high success rate. Even with a 6% false positive rate, the developer still only needs to look through a select few loops to determine which are actually ad hoc synchronizations as opposed to a whole code base.

Finally, the tool that the authors have created has been used on ubiquitous applications[[#Foot10|10]] like Apache and has exposed previously unreported bugs. This stands as a testament to both its effectiveness and its validity.

==References==
1 M Herlihy and J.E.B. Moss, 1993. Transactional Memory: Architectural Support for Lock-Free Data Structures. [online] Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 C Flanagan and S N Freund, 2004. Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs (Summary). [online] Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 T Ball, M Musuvathi and S Qadeer, 2007. CHESS: A Systematic Testing Tool for Concurrent Software. [online] Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Park, Lu and Zhou, 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. [online] University of Illinois at Urbana Champaign, Urbana, Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Z Li, L Tan, X Wang, S Lu, Y Zhou, 2006. Have things changed now?: an empirical study of bug characteristics in modern open source software. Proc. of 1st Workshop on Architectural and System Support for Improving Software Dependability p.25-33 Available through CiteSeerX: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.6982> [Accessed 23 November 2010].

7 Lu, Park, Seo and Zhou, 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. [online] University of Illinois at Urbana Champaign, Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

8 John H. Baldwin, 2002. Locking in the Multithreaded FreeBSD Kernel. [online] FreeBSD Available at: <http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html> [Accessed 23 November 2010].

9 Soma-notes, 2010. Basic Synchronization Principles. [online] Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

10 BuiltWith, 2010. Apache Usage Statistics. [online] Available at: <http://trends.builtwith.com/Web-Server/Apache> [Accessed 30 November 2010].

Talk:COMP 3000 Essay 2 2010 Question 7

2010-11-23T22:05:20Z

Smcilroy:

COMP 3000 Essay 2 2010 Question 7

2010-11-23T21:53:16Z

Smcilroy: Added Background Concepts Section

==Paper==
===Ad Hoc Synchronization Considered Harmful===
Weiwei Xiong
University of California, San Diego

Soyeon Park, Jiaqi Zhang, Yuanyuan Zhou
University of Illinois at Urbana-Champaign

Zhiqiang Ma
Intel
==Research Problem==
As the computer industry continues to shift towards multicore processors, concurrent programming and the use of multithreaded designs have increased to keep up with this trend. Multithreaded designs can be found in a variety of popular applications today. However, concurrent programming brings with it a host of potential dangers in the form of race conditions and deadlocks, resulting from bad programming design and threads accessing shared memory. Fortunately, there are well-known, standard methods for dealing with these problems, namely synchronization primitives. But in real-world situations, for a variety of reasons, as we shall see, programmers often implement their own "ad hoc" synchronizations that eschew common design standards. Ad hoc synchronizations are not well documented and are not discovered by traditional race detection tools, which look only for standard synchronization primitives.

This paper addresses these concerns in two ways. First, it presents a thorough study of ad hoc synchronizations: their nature, dangers, impact on bug detection tools and prevalence in several major open-source applications. Second, it introduces SyncFinder, a program that detects ad hoc synchronizations and automatically annotates the source code where they are found. SyncFinder can be used in conjunction with existing data race checkers to improve their accuracy, and to build custom tools for finding deadlocks and bad programming practices.

With detailed analysis of ad hoc synchronization and study of their occurrences in several applications, the research ultimately concludes that they are harmful and should be removed. At the same time, SyncFinder detects and documents ad hoc synchronizations in the source code enabling programmers for the first time to easily track and remove them.

==Background Concepts==
===Race conditions, deadlocks===
Race conditions are an unintended side effect of programming in concurrent systems. They occur when two or more processes have access to a shared resource and at least one of them has write privilege. One process can then modify the shared data while others are reading it, which results in the reading of stale or incorrect data. Race conditions appear only during the execution of the program, often corrupt data in subtle ways, and are often very difficult to detect.

Deadlock occurs when two or more processes each hold a resource and each is waiting for another process to release the resource it holds. The waits form a circular chain, so no process can continue.

Both of these issues occur in concurrent programming. Although there is no general solution for deadlock[[#Foot9|9]], there are suitable methods for dealing with it, and race conditions can be prevented with mutual exclusion locks and other synchronization primitives. But no programmer is infallible, so race conditions and deadlocks are always at risk of being present in production code.
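
One of the suitable methods for dealing with deadlock is to make every thread acquire locks in the same order, so the circular chain of waits cannot form. The following is a minimal sketch of that idea, not something taken from the paper. The <code>Account</code> class, the <code>transfer</code> function and the <code>order</code> field are hypothetical names chosen for illustration, and the sketch assumes the two accounts passed to <code>transfer</code> are different objects.

```python
import threading


class Account:
    """Hypothetical shared resource protected by its own mutex."""
    _next_order = 0

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()
        # A fixed, unique rank used only to decide lock acquisition order.
        self.order = Account._next_order
        Account._next_order += 1


def transfer(src, dst, amount):
    """Move money between two distinct accounts without risking deadlock.

    Both locks are always taken in ascending `order`, so two threads that
    transfer in opposite directions can never each hold one lock while
    waiting for the other.
    """
    first, second = sorted((src, dst), key=lambda acct: acct.order)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_transfer_demo(rounds=1000):
    """Run opposite transfers concurrently and return the final balances."""
    a, b = Account(100), Account(100)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

If each thread instead locked <code>src</code> first and <code>dst</code> second, the two threads above could each grab one lock and wait forever for the other, which is the circular chain described earlier.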
===Ad Hoc Synchronization===
Ad hoc synchronizations are loops, called sync loops, that keep a thread waiting until certain conditions are met. The conditions depend on variables, called sync variables, that are set from outside the loop by other threads. Sync loops are designed to control the flow of thread execution much like the locking and unlocking of resources. A sync loop can have multiple sync variables, multiple exit conditions and multiple dependencies. The diversity of sync loops, their dependencies and their execution paths makes them difficult to find.
===Synchronization primitives===
Synchronization primitives act as barriers to memory that prevent threads from accessing the same shared resource concurrently[[#Foot8|8]]. They come in many forms, such as mutexes and condition variables.
Mutexes are mutual exclusion locks that a thread acquires to lock a resource it needs. No other thread can access the resource at that point. Once the thread is finished, it releases the lock, and another thread can then lock and access the resource.
Condition variables block a thread until a certain condition is met. This allows the thread to execute only when it is safe to perform its operation.
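
As a rough illustration of these two primitives working together, the sketch below uses Python's standard <code>threading.Condition</code>, which pairs a condition variable with a mutex. The <code>Mailbox</code> class and <code>run_mailbox_demo</code> function are hypothetical names invented for this example.

```python
import threading


class Mailbox:
    """Hypothetical one-way channel guarded by a mutex and a condition variable."""

    def __init__(self):
        # threading.Condition creates and owns its own lock by default.
        self._cond = threading.Condition()
        self._items = []

    def put(self, item):
        """Add an item under the lock and wake one waiting thread."""
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def take(self):
        """Block until an item is available, then remove and return it."""
        with self._cond:
            # wait() releases the lock while blocked and reacquires it on wakeup;
            # the loop rechecks the condition in case of a spurious wakeup.
            while not self._items:
                self._cond.wait()
            return self._items.pop(0)


def run_mailbox_demo():
    """Have a second thread take one item that the main thread puts."""
    box = Mailbox()
    received = []
    taker = threading.Thread(target=lambda: received.append(box.take()))
    taker.start()
    box.put("hello")
    taker.join()
    return received
```

Unlike a hand-written polling loop on a shared flag, the waiting here goes through a standard primitive, so both readers and analysis tools can recognize it as synchronization.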

==Contribution==
Intro to the study of the major applications and what they found

===Findings===
1. They are prevalent; all applications had them

2. They are hard to find

3. They are error prone

4. They affect other bug detection tools

5. They are diverse: different forms, multiple exits and dependencies

Reasons why people use ad hoc synchronization, and possible improvements over it, i.e. synchronization primitives

===SyncFinder===
Intro to what it is and what it does
====How it works====
1. find loops

2. identify sync loops

3. EDV analysis

4. Pruning

5. Annotation of found sync
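
The steps listed above can be illustrated with a toy static pass. This is only a sketch under stated assumptions and is not SyncFinder's implementation: it scans Python source with the standard <code>ast</code> module purely for demonstration. The function name <code>find_candidate_sync_loops</code>, the <code>SAMPLE</code> program and the pruning rule (discard any loop that assigns to a variable used in its own exit condition) are all hypothetical, and EDV is taken here to mean the variables an exit condition depends on.

```python
import ast

# Hypothetical input program: the first loop updates its own exit condition,
# the second waits on a variable that only some other code could change.
SAMPLE = """
def worker(n):
    i = 0
    while i < n:
        i += 1
    while not ready:
        pass
"""


def find_candidate_sync_loops(source):
    """Return sorted (line, names) pairs for loops that look like sync loops."""
    candidates = []
    tree = ast.parse(source)
    for node in ast.walk(tree):                      # step 1: find loops
        if not isinstance(node, ast.While):
            continue
        # Step 3 (toy version): names the exit condition depends on.
        cond_names = {n.id for n in ast.walk(node.test)
                      if isinstance(n, ast.Name)}
        # Names assigned anywhere inside the loop body.
        written = set()
        for stmt in node.body:
            for n in ast.walk(stmt):
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store):
                    written.add(n.id)
        # Step 4 (toy version): prune loops that can satisfy their own
        # exit condition, and loops such as `while True` with no names.
        if cond_names and not (cond_names & written):
            # Step 5 (toy version): report instead of annotating the source.
            candidates.append((node.lineno, sorted(cond_names)))
    return sorted(candidates)
```

On <code>SAMPLE</code>, the counting loop is pruned because it writes <code>i</code>, while the loop on <code>ready</code> is reported. A real analysis would also need to establish that the variable is shared and written by another thread, which this toy does not attempt.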

====Uses====
1. A tool to detect bad practices

2. Extensions to data race detection

====Accuracy====

====Related Work and similar tools====
There have been attempts to remove synchronization issues entirely from concurrent programming, such as transactional memory[[#Foot1|1]], a lock-free synchronization mechanism that does not require mutexes and so avoids explicit lock and unlock operations. Other attempts have been made to find bugs in code that is free of data races but is still at risk of unintended effects from thread interactions, such as Atomizer[[#Foot2|2]], a dynamic atomicity checker.

There are tools that detect data races, such as CHESS[[#Foot3|3]], a dynamic data race checker that runs through all possible thread execution paths, and CTrigger[[#Foot4|4]], a tool that checks for atomicity violations. The problem with these programs is that they only look for standard synchronization methods and structures, such as lock() and cond_wait(). They do not look for ad hoc synchronizations.

A tool similar to SyncFinder exists that can detect simple spinning, which is one form of ad hoc synchronization[[#Foot5|5]], but it cannot detect the more complicated ad hoc variations.

Several studies on bug characteristics[[#Foot6|6]] and concurrency bugs[[#Foot7|7]] have been conducted. This paper complements these studies by providing a better understanding of the nature of ad hoc synchronizations and their occurrence in concurrent programs.

==Critique==
1. Uses a static approach; a dynamic approach would be better
* As stated in the paper, a dynamic approach introduces run-time overhead and is not guaranteed to find a sync loop that the test cases never execute

2. not entirely accurate, some false positives.

3. In terms of style, lots of unnecessary repetition

==References==
1 Author, 2010. Title. [online] Company(optional) Available at: <http://www.cs.brown.edu/~mph/HerlihyM93/herlihy93transactional.pdf> [Accessed 23 November 2010].

2 Author, 2010. Title. [online] Company(optional) Available at: <http://www.cs.williams.edu/~freund/papers/atomizer-padtad.pdf> [Accessed 23 November 2010].

3 Author, 2010. Title. [online] Company(optional) Available at: <http://research.microsoft.com/pubs/70509/tr-2007-149.pdf> [Accessed 23 November 2010].

4 Author, 2010. Title. [online] Company(optional) Available at: <http://pages.cs.wisc.edu/~shanlu/paper/asplos092-zhou.pdf> [Accessed 23 November 2010].

5 LI, T., LEBECK, A. R., AND SORIN, D. J. Spin detection hardware for improved management of multithreaded systems. IEEE Transactions on Parallel and Distributed Systems PDS-17, 6 (June 2006), 508–521.

6 Author, 2010. Title. [online] Company(optional) Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf> [Accessed 23 November 2010].

7 Author, 2010. Title. [online] Company(optional) Available at: <[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.121.1203&rep=rep1&type=pdf]> [Accessed 23 November 2010].

8 Author, 2010. Title. [online] Company(optional) Available at: <[http://www.usenix.org/events/bsdcon/full_papers/baldwin/baldwin_html/node5.html]> [Accessed 23 November 2010].

9 Author, 2010. Title. [online] Company(optional) Available at: <http://homeostasis.scs.carleton.ca/wiki/index.php/Basic_Synchronization_Principles> [Accessed 23 November 2010].

COMP 3000 Essay 2 2010 Question 7

2010-11-23T19:52:20Z

Smcilroy: Added Related Work and similar tools section with references

Talk:COMP 3000 Essay 2 2010 Question 7

2010-11-22T06:06:44Z

Smcilroy:

COMP 3000 Essay 2 2010 Question 7

2010-11-22T05:58:33Z

Smcilroy: Added Intro/Research Problem and possible essay skeleton

Talk:COMP 3000 Essay 2 2010 Question 7

2010-11-15T18:43:47Z

Smcilroy:

'''Attendance. Please mark your name to say you're here'''

Stephany Lay

--[[User:Asoknack|Asoknack]] 16:00, 15 November 2010 (UTC)

--[[User:Smcilroy|Smcilroy]] 18:43, 15 November 2010 (UTC)

Talk:COMP 3000 Essay 2 2010 Question 9

2010-11-15T17:03:00Z

Smcilroy:

Group members:

* Munther Hussain
* Jonathon Slonosky
* Michael Bingham
* Smcilroy

---------------

Hey there, this is Munther. The prof said that we should be contacting each other to see who's still on board for the course. So please, if you read this, add your name to the list of members above. You can find my contact info on my profile page by clicking my signature. We shall talk about the details and how we will approach this in the next few days --[[User:Hesperus|Hesperus]] 16:41, 12 November 2010 (UTC)

---------------------

Checked in -- JSlonosky

----------------------

Pawel has already contacted us so he's still in for the course, that makes 3 of us. The other three members, please drop in and add your name. We need to confirm the members today by 1:00 pm. --[[User:Hesperus|Hesperus]] 12:18, 15 November 2010 (UTC)

----------------------

Checked in --[[User:Mbingham|Mbingham]] 15:08, 15 November 2010 (UTC)

---------------------

Checked in --[[User:Smcilroy|Smcilroy]] 17:03, 15 November 2010 (UTC)

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-15T02:30:01Z

Smcilroy:

== Last minute changes ==
Ok guys, so its due early tomorrow. We have the essay pretty much completed aside from a few things.

First. Are we getting rid of the headings? Other groups have them in at the moment, I know the prof said the essay should read as if they weren't there but it might not hurt for them to be there.

Second. The essay needs to flow better. Some intro and outro sentences acknowledging the next section and referring to the previous ones would be nice.

Otherwise, what else remains?
--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

I'm trying to cleanup the references, is this format acceptable? --[[User:Dagar|Dagar]] 23:45, 14 October 2010 (UTC)
: Yes, that looks a lot better --[[User:Smcilroy|Smcilroy]] 00:34, 15 October 2010 (UTC)

::I think we can keep some of the main headings, but I don't think we need them all. I think the real meat of the essay is in the comparisons with networked storage like NAS and especially SAN, so those sections should probably have headings of some kind. I also agree on the flow needing some work, some of the sections have a bit of overlap.

::Anil had mentioned to me today an example of a networked file system based on object store devices - [http://ceph.newdream.net/about/ Ceph]. [http://www.usenix.org/events/osdi06/tech/full_papers/weil/weil_html/ here is the full paper] on the system. I was thinking it might be worth it to mention it at least, maybe even have a small section about it, just so we get in a real world example of this technology. What do you guys think?

::--[[User:Mbingham|Mbingham]] 01:56, 15 October 2010 (UTC)

::Heres a quick example section, I know this is pretty last minute but what do you guys think?

::Ceph is an example of a real world networked storage system based around OSDs. The Ceph developers specifically list performance, reliability, and scalability as the benefits their system offers over current solutions. (insert reference to paper) Since Ceph is based on OSDs, it takes advantage of the ability for clients to interact directly with the devices, which avoids the traditional bottlenecks to performance caused by SAN controllers or NAS heads. This direct access allows Ceph to support a very large number of clients concurrently accessing data on the system. Since objects have security controls it can allow this direct access safely, unlike other network storage architectures.

::--[[User:Mbingham|Mbingham]] 02:09, 15 October 2010 (UTC)

::Also (sorry for all the comments), where does the first sentence of the Security section come from? It sounds like something that should be referenced, and seems kind of out of place because I don't think those four "quadrants" are brought up again?

::--[[User:Mbingham|Mbingham]] 02:11, 15 October 2010 (UTC)

::: Ok if Anil mentioned it, it's probably a good idea to include it, maybe after the 3 comparisons. I got an email back from Anil and he said that headings are OK as long as they add to the essay. So I think we can leave them in. --[[User:Smcilroy|Smcilroy]] 02:30, 15 October 2010 (UTC)

== Tightening up the Intro ==
Hey everyone,

I think it might be useful to re-work the intro a bit so that it better represents the direction the essay has taken since then. Heres a quick mockup of a reworked intro. It could be expanded on in some parts and worked on, etc. I would like any comments, if you guys think this better represents the essay, or what you think needs changing in the introduction. Here it is:

:Storage needs have evolved over the past 60 years, and as a result the functionality expected from filesystems and storage solutions has evolved as well. The low level interface that a storage device implements, however, has remained mostly the same. A block based interface is still the most common mechanism for accessing storage devices. Recently, however, especially with the growth of networked storage architectures such as NAS and SAN, this interface needs to be reworked to accommodate changing needs. Object based storage is increasingly becoming an attractive alternative to block based storage. The design of object based storage devices (OSD), which store objects rather than blocks, easily associates data with meta-data. Objects are created, destroyed, read from, and written to, and each carries a unique ID. The device itself manages the physical space and can handle security on a per-object level. A storage network which is based on OSDs can provide better scalability without bottlenecks, better security with per-object access controls, and better integrity with unique hash keys. In this way, the OSD interface is looking increasingly attractive as a building block for filesystems, especially in the context of networked storage.

I think the main thing is that it brings up networked storage earlier and puts a bit more focus on it. I think the main arguments for object based storage is its applicability to large storage networks, and the advantages it has over block based architectures. For this reason I think the intro should put a bit more focus on it. Does that make sense? Any comments or suggestions you guys have are welcome.

--[[User:Mbingham|Mbingham]] 21:18, 14 October 2010 (UTC)

:I know what you mean, putting a focus on network storage is a good idea. Let me see if I can add your suggestions to the intro and maybe combine the two.--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

== Wikipedia Sources ==
I think we may want to replace the references to wikipedia with something more authoritative. [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open this massive pdf] from IBM supports the idea that fiber channels are the dominant infrastructure of SANs, but i'm not sure if it mentions how that is changing.

The wikipedia page for LUN masking has [http://www.sansecurity.com/san-security-faq.shtml this] as its reference for the definitions, there's also [http://technet.microsoft.com/en-us/library/cc758640(WS.10).aspx this] microsoft article and [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf this] paper from Hitachi. I'm not sure which of these is most relevant since I just did a quick google search and haven't really read up on LUN masking or zoning, so someone else would probably be better suited to decide which one if any to use.

How does that sound to everyone?

--[[User:Mbingham|Mbingham]] 02:55, 14 October 2010 (UTC)

:I agree, the Wikipedia references need to go. Whoever included those references should be able to find alternate sources from the one's you gave. --[[User:Smcilroy|Smcilroy]] 17:45, 14 October 2010 (UTC)

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the authors name and the date (I think) in parenthesis at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::check out this reference guide, it explain how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

:Sounds like a plan. I'll clean up/expand what I have written and get started with some initial stuff for the object sections. Anyone else is welcome to expand and edit as well.
:--[[User:Mbingham|Mbingham]] 00:44, 14 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham (I added a useful diagram here -Npradhan)
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage -Npradhan
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage -Npradhan
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion -Smcilroy

:Also, it would probably be useful for people to be reading over each other's work and making suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* SAN filesystem is a local network of multiple devices that operate on disk blocks and provides a file system abstraction
- Block level/local network of multiple device
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSD's have knowledge of their object layout
* Unlike block stores, OSD's can recover data specific to a byte range
- OSD's know what space is being unused in this way
- Can scan and correct errors without losing data
* OSD's maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or few unrecoverable files
- OSD's can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials(capability) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security, those seem to be constant themes in what i've read about object stores.

:For the block storage overview, the two current standards for a block based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if i'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course any one else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope i'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, a fixed length sequence of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read from or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable sized, read/written to, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here's some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on google scholar you can access the pdf without even being inside carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there's 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, thursday or friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

COMP 3000 Essay 1 2010 Question 11

2010-10-15T02:06:10Z

Smcilroy: /* Overview of Block-Based Storage */

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we face growing storage needs as the world's information increases exponentially. Businesses are increasingly choosing to archive and retain all the data they produce, and "store everything, forever"[[#Foot1|1]] is the common mantra of storage administrators. The storage industry has kept up with this demand through matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology.

Innovation in storage technology is especially pertinent to businesses that use network storage. The two dominant network storage technologies, storage area networks (SAN) and network-attached storage (NAS), each have their own benefits and drawbacks, and both would benefit greatly from improvements in storage technology. Ideal improvements would provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object-Based Storage Devices (OSD) address these issues by design. Object storage uses objects that consist of data and metadata describing the object. Objects carry a unique ID and are accessed with defined methods such as read and write. The storage device handles the underlying security, space allocation and basic storage routines.[[#Foot2|2]] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, data integrity ensured by unique hash keys, and management and business intelligence benefits from rich metadata, OSD can be seen as a viable way to improve the standard architectures of SAN and NAS.

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[[#Foot3|3]] Hard disks store data in blocks, which are fixed-length sequences of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[[#Foot4|4]] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left entirely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.
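The translation step described above can be sketched in a few lines of Python. This is only an illustration: the 512-byte block size, the function name <code>blocks_for_write</code> and the idea of a per-file list of block numbers are assumptions made for the example, not details taken from any particular filesystem.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


def blocks_for_write(file_blocks, offset, length):
    """Split a write at a byte offset in a file into per-block pieces.

    file_blocks is a hypothetical list holding the disk block number that
    backs each consecutive BLOCK_SIZE chunk of the file. Each returned
    tuple is (disk_block_number, offset_within_block, byte_count).
    """
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // BLOCK_SIZE   # which block of the file
        within = offset % BLOCK_SIZE   # where the write starts in that block
        count = min(BLOCK_SIZE - within, end - offset)
        pieces.append((file_blocks[index], within, count))
        offset += count
    return pieces


# A 100-byte write at file offset 500 straddles two disk blocks.
example = blocks_for_write([17, 42, 8], 500, 100)
```

With a block interface, every filesystem has to carry out bookkeeping of this kind itself. With an object interface, the equivalent mapping is done inside the storage device.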

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[[#Foot2|2]]. This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these dated commands.

== Overview of Object-Based Storage ==

Unlike block-based storage, object-based storage is relatively recent: research started in the 1990s. See, for example, the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object-based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical storage. These objects also have metadata and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object-based storage with block-based storage that the design of objects is better suited than blocks to the needs of today's filesystems, especially networked filesystems.
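To make the contrast with a block interface concrete, here is a minimal in-memory sketch of the object operations described above (create, read, write, destroy, plus attached metadata). The class <code>ToyObjectStore</code> and its method names are hypothetical and chosen for illustration; this is not the T10 OSD command set, and a real device would map objects onto disk blocks rather than Python byte arrays.

```python
import itertools


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object-based storage device."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def create(self, metadata=None):
        """Create an empty, variable-sized object and return its unique ID."""
        object_id = next(self._ids)
        self._objects[object_id] = {
            "data": bytearray(),
            "metadata": dict(metadata or {}),
        }
        return object_id

    def write(self, object_id, offset, data):
        """Write bytes at an offset; the object grows as needed."""
        buf = self._objects[object_id]["data"]
        if offset > len(buf):
            buf.extend(b"\x00" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data

    def read(self, object_id, offset, length):
        """Read up to length bytes starting at offset."""
        buf = self._objects[object_id]["data"]
        return bytes(buf[offset:offset + length])

    def get_metadata(self, object_id):
        """Return a copy of the metadata stored with the object."""
        return dict(self._objects[object_id]["metadata"])

    def destroy(self, object_id):
        """Remove the object; the device would reclaim its space."""
        del self._objects[object_id]


store = ToyObjectStore()
oid = store.create({"owner": "alice"})
store.write(oid, 0, b"hello")
store.write(oid, 5, b" world")
contents = store.read(oid, 0, 11)
```

Note that the caller never names a block. Space allocation is hidden behind the object ID, which is the point of the extra layer of abstraction.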

== Changing Storage Needs ==

Storage needs have changed significantly since the first hard disks were developed in the 1950s and the interface was standardized in the 1970s. This means that the functionality of storage devices must also change to reflect these needs. Storage has become increasingly networked. Networked storage must deal with several issues. Firstly, the storage architecture must be able to scale to terabytes of data and beyond with many servers and clients while avoiding bottlenecks. The data stored on these networks has also become more sensitive. Personal information, such as financial records, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has increased, it becomes more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object-based storage is better suited to address these issues by design.

== Comparison of object and block based stores ==
=== Scalability ===
Scalability is very important for large businesses that need to manage large data centers. Managing metadata while maintaining data access speed as systems grow is paramount.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems to scale with the metadata becomes a major issue. But at the same time, the current speeds of block-based storage need to be maintained.

NAS provides a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[[#Foot5|5]] All data traffic must flow through this single access point. The benefits of the NAS file system come from its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, passing all the data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation is managed at different layers of the storage system, and applications individually add policy and management metadata that is spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, offer file systems that are distributed but provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. This eliminates the bottleneck of NAS. In the past, SANs were implemented on private Fibre Channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SANs of the past. Modern SANs can serve a much larger set of users, not all of whom can or should be trusted. This, in addition to the possible adoption of IP-based SAN solutions, makes data security a primary concern.[[#Foot6|6]] Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and scalability with metadata. Each object comes with a set of access rules given to it by the management server and metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[[#Foot1|1]] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and scalability. OSDs ensure data integrity with mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The block containing the error may not even hold any data. Such errors usually occur during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. An OSD achieves this by managing metadata and maintaining internal copies of that metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects are on different OSDs. In this way, OSDs know which space is used or unused and can scan for and correct errors without losing data. In the event of a failure in recovering a file or a number of files, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files efficiently.

OSDs have another powerful feature. Each object file has an associated hash key that is generated from, and unique to, the contents of the file. Thus the file can be verified for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data.[[#Foot1|1]]
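The hash-key idea can be illustrated with a short Python sketch. The choice of SHA-256 and the helper names <code>fingerprint</code>, <code>is_intact</code> and <code>find_duplicates</code> are assumptions made for this example; the sources cited here do not say which hash function a given product uses.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Derive a hash key from an object's contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, stored_key: str) -> bool:
    """Check that the contents still match the key recorded when they were stored."""
    return fingerprint(data) == stored_key


def find_duplicates(objects: dict) -> list:
    """Group object names whose contents hash to the same key."""
    by_key = defaultdict(list)
    for name, data in objects.items():
        by_key[fingerprint(data)].append(name)
    return [sorted(names) for names in by_key.values() if len(names) > 1]


key = fingerprint(b"quarterly report")
ok = is_intact(b"quarterly report", key)       # unchanged contents
corrupt = is_intact(b"quarterly rep0rt", key)  # a single altered byte
dupes = find_duplicates({"a.txt": b"same", "b.txt": b"same", "c.txt": b"other"})
```

Because the key depends only on the contents, the same mechanism serves both purposes mentioned above: a mismatch signals corruption, and a match between two objects flags duplicate data.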

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel.[[#Foot7|7]] For the sake of security, running a SAN on Fibre Channel helps isolate its network, as it does not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures for SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated, and devices are not allowed to communicate outside their respective zone. LUN masking is similar to zoning; however, the two differ in the type of device that enforces them. Switches implement zoning, while disk array controllers implement LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[[#Foot8|8]]

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS runs on has access control configurations, much like those of other traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by OSD enables it to cover the four quadrants of security threats outlined above. Clients access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[[#Foot9|9]] This can prevent accidental or even malicious access to an OSD, whether external or internal.
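A rough sketch of how a capability check might work is given below. It is a deliberate simplification and not the OSD security protocol itself: it assumes a security manager and the OSD share a secret key, that the manager signs the tuple with an HMAC, and that the capability also lists permitted operations. The <code>rights</code> field and the function names are hypothetical additions for the example.

```python
import hashlib
import hmac


def _tag(secret, osd_name, partition_id, object_id, rights):
    """Compute a keyed digest over the capability fields."""
    message = f"{osd_name}|{partition_id}|{object_id}|{rights}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_capability(secret, osd_name, partition_id, object_id, rights):
    """Security manager side: hand the client a signed capability."""
    return {
        "osd_name": osd_name,
        "partition_id": partition_id,
        "object_id": object_id,
        "rights": rights,  # hypothetical field, e.g. "read" or "read,write"
        "tag": _tag(secret, osd_name, partition_id, object_id, rights),
    }


def osd_allows(secret, capability, operation):
    """OSD side: recompute the tag and check the requested operation."""
    expected = _tag(
        secret,
        capability["osd_name"],
        capability["partition_id"],
        capability["object_id"],
        capability["rights"],
    )
    if not hmac.compare_digest(expected, capability["tag"]):
        return False  # fields were altered or the tag was forged
    return operation in capability["rights"].split(",")


shared_secret = b"example-key-known-to-manager-and-osd"
cap = issue_capability(shared_secret, "osd0", 1, 42, "read")
read_ok = osd_allows(shared_secret, cap, "read")
write_ok = osd_allows(shared_secret, cap, "write")
forged = dict(cap, object_id=43)  # client tries to reuse the tag for another object
forged_ok = osd_allows(shared_secret, forged, "read")
```

The design point is that the device makes the access decision for each request on its own, using only the credential presented with that request, rather than relying on the surrounding network to keep untrusted clients out.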

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. But challenges to its adoption in the industry remain. One is that it is currently needed only in high-end business solutions, which prevents it from reaching smaller businesses.[[#Foot10|10]] But as newer features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data through the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancy will make object storage an attractive choice for storage administrators in the future.

==References==
1 Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

2 C. Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

3 IBM 350 disk storage unit, IBM Archives. [online] IBM Available at : <http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html> [Accessed 14 October 2010].

4 M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

5 TechRepublic Guest Contributor, Foundations of Network Storage, Lesson Two: NAS. [online] Available at <http://articles.techrepublic.com.com/5100-22_11-5841266.html> [Accessed 14 October 2010].

6 Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

7 J. Tate, F. Lucchese, R. Moore. Introduction to Storage Area Networks. [online] Available at <http://www.redbooks.ibm.com/redbooks/pdfs/sg245470.pdf> [Accessed 14 October 2010].

8 H. Yoshida. LUN Security Considerations for Storage Area Networks. [online] Available at <http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf> [Accessed 14 October 2010].

9 M. Factor, D. Nagle, D. Naor, E. Riedel, J.Satran, 2005. The OSD Security Protocol. [online] Available at <http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf> [Accessed 14 October 2010].

10 M. Factor, K. Meth, D. Naor, O. Rodeh, J. Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

COMP 3000 Essay 1 2010 Question 11

2010-10-15T01:44:54Z

Smcilroy: I took parts of changing storage needs and put it in scalability section.

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we face growing storage needs as the world's information increases exponentially. Businesses are increasingly choosing to archive and retain all the data they produce, and "store everything, forever"[[#Foot1|1]] is the common mantra of storage administrators. The storage industry has kept up with this demand through matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology.

Innovation in storage technology is especially pertinent to businesses that use network storage. The two dominant network storage technologies, storage area networks (SAN) and network-attached storage (NAS), each have their own benefits and drawbacks, and both would benefit greatly from improvements in storage technology. Ideal improvements would provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object-Based Storage Devices (OSD) address these issues by design. Object storage uses objects that consist of data and metadata describing the object. Objects carry a unique ID and are accessed with defined methods such as read and write. The storage device handles the underlying security, space allocation and basic storage routines.[[#Foot2|2]] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, data integrity ensured by unique hash keys, and management and business intelligence benefits from rich metadata, OSD can be seen as a viable way to improve the standard architectures of SAN and NAS.

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[[#Foot3|3]] Hard disks store data in blocks, which are fixed-length sequences of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[[#Foot4|4]] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left entirely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[[#Foot2|2]]. This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these dated commands.

== Overview of Object-Based Storage ==

Unlike block-based storage, object-based storage is relatively recent: research started in the 1990s. See, for example, the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object-based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical storage. These objects also have metadata and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object-based storage with block-based storage that the design of objects is better suited than blocks to the needs of today's filesystems, especially networked filesystems.

== Changing Storage Needs ==

Storage needs have changed significantly since the first hard disks were developed in the 1950s and the interface was standardized in the 1970s. This means that the functionality of storage devices must also change to reflect these needs. Storage has become increasingly networked. Networked storage must deal with several issues. Firstly, the storage architecture must be able to scale to terabytes of data and beyond with many servers and clients while avoiding bottlenecks. The data stored on these networks has also become more sensitive. Personal information, such as financial records, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has increased, it becomes more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object-based storage is better suited to address these issues by design.

== Comparison of object and block based stores ==
=== Scalability ===
Scalability is very important for large businesses that need to manage large data centers. Managing metadata while maintaining data access speed as systems grow is paramount.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems to scale with the metadata becomes a major issue. But at the same time, the current speeds of block-based storage need to be maintained.

NAS provides a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[[#Foot5|5]] All data traffic must flow through this single access point. The benefits of the NAS file system come from its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, passing all the data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation is managed at different layers of the storage system, and applications individually add policy and management metadata that is spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, offer file systems that are distributed but provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. This eliminates the bottleneck of NAS. In the past, SANs were implemented on private Fibre Channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SANs of the past. Modern SANs can serve a much larger set of users, not all of whom can or should be trusted. This, in addition to the possible adoption of IP-based SAN solutions, makes data security a primary concern.[[#Foot6|6]] Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and scalability with metadata. Each object comes with a set of access rules given to it by the management server and metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[[#Foot1|1]] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and scalability. OSDs ensure data integrity with mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The block containing the error may not even hold any data. Such errors usually occur during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. An OSD achieves this by managing metadata and maintaining internal copies of that metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects are on different OSDs. In this way, OSDs know which space is used or unused and can scan for and correct errors without losing data. In the event of a failure in recovering a file or a number of files, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files efficiently.

OSDs have another powerful feature. Each object has an associated hash key that is generated from the contents of the file. The file can thus be verified both for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data.[[#Foot1|1]]

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel.[[#Foot7|7]] For the sake of security, running a SAN on Fibre Channel helps isolate its network, as the devices do not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical SAN security measures. Zoning allocates a certain amount of storage to clients. These zones are isolated, and devices are not allowed to communicate outside their respective zone. LUN masking is similar to zoning but differs in the type of device that enforces it: switches implement zoning, while disk array controllers implement LUN masking. A disk array controller is a device that manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[[#Foot8|8]]

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS runs on has access control configurations, much like those of traditional operating systems, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by OSDs enables them to cover the four quadrants of security threats outlined above. Clients access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[[#Foot9|9]] This can prevent accidental or even malicious access to an OSD, whether external or internal.

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. There remain challenges to its adoption in industry, however. One is that it is currently needed only in high-end business solutions, which keeps it from reaching smaller businesses.[[#Foot10|10]] But as newer features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data through the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancies will make object storage an attractive choice for storage administrators in the future.

==References==
1 Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

2 C. Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

3 IBM 350 disk storage unit, IBM Archives. [online] IBM Available at : <http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html> [Accessed 14 October 2010].

4 M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

5 TechRepublic Guest Contributor, Foundations of Network Storage, Lesson Two: NAS. [online] Available at <http://articles.techrepublic.com.com/5100-22_11-5841266.html> [Accessed 14 October 2010].

6 Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

7 J. Tate, F. Lucchese, R. Moore. Introduction to Storage Area Networks. [online] Available at <http://www.redbooks.ibm.com/redbooks/pdfs/sg245470.pdf> [Accessed 14 October 2010].

8 H. Yoshida. LUN Security Considerations for Storage Area Networks. [online] Available at <http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf> [Accessed 14 October 2010].

9 M. Factor, D. Nagle, D. Naor, E. Riedel, J.Satran, 2005. The OSD Security Protocol. [online] Available at <http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf> [Accessed 14 October 2010].

10 M. Factor, K. Meth, D. Naor, O. Rodeh, J. Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

COMP 3000 Essay 1 2010 Question 11

2010-10-15T00:56:21Z

Smcilroy: /* Scalability */

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially. Businesses are increasingly choosing to archive and retain all the data they produce, and "store everything, forever"[[#Foot1|1]] is the common mantra of storage administrators. The storage industry has been able to keep up with the increasing demand with matching increases in storage capacity. Unfortunately, the interfaces between clients and storage devices have remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology.

Innovation in storage technology is especially pertinent to businesses that use network storage. The two dominant technologies of network storage, storage area networks (SAN) and network-attached storage (NAS), each have their own benefits and drawbacks, and both would benefit greatly from improvements in storage technology. Ideal improvements would provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object-Based Storage Devices (OSD) address these issues by design. Object storage uses objects that consist of data and metadata that describe the object. Objects are accessed with defined methods such as read and write and carry a unique ID. The storage devices themselves handle the underlying security, space allocation and basic storage routines.[[#Foot2|2]] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, ensured integrity of data through unique hash keys, and benefits in management and business intelligence from rich metadata, OSD can be seen as a viable way to improve the standard architectures of SAN and NAS.

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[[#Foot3|3]] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[[#Foot4|4]] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left entirely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must work out which block on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[[#Foot2|2]]. This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these dated commands.

== Overview of Object-Based Storage ==

Object-based storage is much younger than block-based storage: research on it started in the 1990s. See for example the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object-based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical storage. These objects also have metadata and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object-based storage with block-based storage that the design of objects is better suited than blocks to the needs of today's filesystems, especially networked filesystems.
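The difference between the two interfaces can be illustrated with a minimal in-memory sketch. The class and method names below (<code>BlockDevice</code>, <code>ObjectDevice</code>, <code>create</code>, <code>get_metadata</code> and so on) are hypothetical and chosen only for illustration; they are not taken from the SCSI or OSD command sets.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 512  # bytes per block; an illustrative value


class BlockDevice:
    """Block interface: the host addresses fixed-size blocks by number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, lba):
        return self.blocks[lba]

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
        self.blocks[lba] = data


@dataclass
class StoredObject:
    data: bytearray = field(default_factory=bytearray)
    metadata: dict = field(default_factory=dict)


class ObjectDevice:
    """Object interface: the device maps variable-sized objects to storage."""

    def __init__(self):
        self._objects = {}  # object ID -> StoredObject
        self._next_id = 1

    def create(self, **metadata):
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = StoredObject(metadata=dict(metadata))
        return object_id

    def write(self, object_id, offset, data):
        obj = self._objects[object_id]
        end = offset + len(data)
        if len(obj.data) < end:
            # The device, not the filesystem, grows the object as needed.
            obj.data.extend(bytes(end - len(obj.data)))
        obj.data[offset:end] = data

    def read(self, object_id, offset, length):
        return bytes(self._objects[object_id].data[offset:offset + length])

    def get_metadata(self, object_id):
        return dict(self._objects[object_id].metadata)

    def destroy(self, object_id):
        del self._objects[object_id]
```

With the block interface the caller must decide which block numbers hold a file; with the object interface it only names an object and a byte range, and space allocation stays inside the device.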

== Changing Storage Needs ==

Storage needs have changed significantly since the first hard disks were developed in the 1950s and the interface was standardized in the 1970s. This means that the functionality of storage devices must also change to reflect these needs. Storage has become increasingly networked, and networked storage must deal with several issues. Firstly, the storage architecture must be able to scale to terabytes of data and beyond, with many servers and clients, while avoiding bottlenecks. The data stored on these networks has also become more sensitive. Personal information, such as financial records, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has increased, it becomes more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object-based storage is better suited to address these issues by design.

One application where the utility of object stores has become increasingly apparent is in SANs. SAN file systems are distributed; however, they provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. In the past, SANs were implemented on private Fibre Channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SANs of the past. Modern SANs can serve a much larger set of users, not all of whom can or should be trusted. This, in addition to the possible adoption of IP-based SAN solutions, makes data security a primary concern.[[#Foot5|5]] Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

== Comparison of object and block based stores ==
=== Scalability ===
Firstly, scalability is very important for large businesses that need to manage large data centers. Managing metadata while ensuring data access speed as the system grows is paramount.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or with the remapping of blocks for archiving or duplication. Building systems whose metadata scales becomes a major issue, while at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the client's access to files. This is done through a single NAS head, which usually has thousands of gigabytes of storage behind it.[[#Foot6|6]] All data traffic must flow through this single access point. The benefits of NAS lie in its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, passing all data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation management is handled on different storage system layers, and applications add policy and management metadata individually, so this metadata is spread throughout the system and becomes very hard to manage.

SANs, on the other hand, allow data access through fiber cables that connect directly to the storage. The storage management and file system is connected separately to both the client and the storage, which separates the data channel from the management channel, and it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Security must also be addressed, since clients must be trusted to access the data, which opens up a host of security issues.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and more scalable metadata handling. Each object comes with a set of access rules given to it by the management server. Metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[[#Foot1|1]] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which complicates the use of file systems for archiving and makes them harder to scale. OSDs ensure data integrity through mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The block with the error may not even contain any data. Such errors usually surface during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. This is done by managing metadata as well as maintaining internal copies of that metadata. Hence, OSDs have knowledge of their object layout even though one or more groups of objects are on different OSDs. In this way OSDs know which space is used or unused and can scan and correct errors without losing data. If recovering a file or a number of files fails, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files in an efficient manner.

OSDs have another powerful feature. Each object has an associated hash key that is generated from the contents of the file. The file can thus be verified both for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data.[[#Foot1|1]]
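The use of a content hash for verification and duplicate detection can be sketched as follows. This is only an illustration: the source does not say which hash function an OSD uses, so SHA-256 is an assumption here, and the <code>HashedStore</code> class and its methods are hypothetical names.

```python
import hashlib


def content_hash(data):
    """Return a hex digest derived only from the object's contents."""
    # SHA-256 is an assumption; the essay does not name a hash function.
    return hashlib.sha256(data).hexdigest()


class HashedStore:
    """Toy store that records a content hash alongside each object."""

    def __init__(self):
        self._data = {}    # object ID -> bytes
        self._hashes = {}  # object ID -> digest recorded at write time

    def put(self, object_id, data):
        self._data[object_id] = data
        self._hashes[object_id] = content_hash(data)

    def verify(self, object_id):
        """True if the stored bytes still match the recorded hash."""
        return content_hash(self._data[object_id]) == self._hashes[object_id]

    def duplicates(self):
        """Group object IDs whose contents hash to the same digest."""
        groups = {}
        for object_id, digest in self._hashes.items():
            groups.setdefault(digest, []).append(object_id)
        return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

If the stored bytes change without the hash being updated, as would happen with silent corruption, <code>verify</code> returns <code>False</code> for that object, while objects with identical contents share a digest and are reported together by <code>duplicates</code>.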

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel.[[#Foot7|7]] For the sake of security, running a SAN on Fibre Channel helps isolate its network, as the devices do not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical SAN security measures. Zoning allocates a certain amount of storage to clients. These zones are isolated, and devices are not allowed to communicate outside their respective zone. LUN masking is similar to zoning but differs in the type of device that enforces it: switches implement zoning, while disk array controllers implement LUN masking. A disk array controller is a device that manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[[#Foot8|8]]

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS runs on has access control configurations, much like those of traditional operating systems, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by OSDs enables them to cover the four quadrants of security threats outlined above. Clients access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[[#Foot9|9]] This can prevent accidental or even malicious access to an OSD, whether external or internal.
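The idea of a capability can be sketched with a keyed hash. The code below is a simplified illustration and not the actual OSD security protocol: it assumes the management server and the OSD share a secret key, and the function names, the <code>rights</code> string and the message layout are all hypothetical.

```python
import hashlib
import hmac


def _message(osd_name, partition_id, object_id, rights):
    # Hypothetical encoding of the fields the capability covers.
    return f"{osd_name}|{partition_id}|{object_id}|{rights}".encode()


def issue_capability(secret, osd_name, partition_id, object_id, rights):
    """Management-server side: sign the tuple and the granted rights."""
    tag = hmac.new(
        secret, _message(osd_name, partition_id, object_id, rights), hashlib.sha256
    ).hexdigest()
    return {
        "osd_name": osd_name,
        "partition_id": partition_id,
        "object_id": object_id,
        "rights": rights,  # e.g. "r" or "rw"
        "tag": tag,
    }


def check_capability(secret, cap, requested_right):
    """OSD side: recompute the tag and check the requested right."""
    expected = hmac.new(
        secret,
        _message(cap["osd_name"], cap["partition_id"], cap["object_id"], cap["rights"]),
        hashlib.sha256,
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, cap["tag"]) and requested_right in cap["rights"]
```

Because the tag covers the object identifier and the rights, a client that alters either field, by accident or on purpose, presents a capability the device will reject.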

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. There remain challenges to its adoption in industry, however. One is that it is currently needed only in high-end business solutions, which keeps it from reaching smaller businesses.[[#Foot10|10]] But as newer features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data through the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancies will make object storage an attractive choice for storage administrators in the future.

==References==
1 Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

2 C. Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

3 IBM 350 disk storage unit, IBM Archives. [online] IBM Available at : <http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html> [Accessed 14 October 2010].

4 M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

5 Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

6 TechRepublic Guest Contributor, Foundations of Network Storage, Lesson Two: NAS. [online] Available at <http://articles.techrepublic.com.com/5100-22_11-5841266.html> [Accessed 14 October 2010].

7 J. Tate, F. Lucchese, R. Moore. Introduction to Storage Area Networks. [online] Available at <http://www.redbooks.ibm.com/redbooks/pdfs/sg245470.pdf> [Accessed 14 October 2010].

8 H. Yoshida. LUN Security Considerations for Storage Area Networks. [online] Available at <http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf> [Accessed 14 October 2010].

9 M. Factor, D. Nagle, D. Naor, E. Riedel, J.Satran, 2005. The OSD Security Protocol. [online] Available at <http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf> [Accessed 14 October 2010].

10 M. Factor, K. Meth, D. Naor, O. Rodeh, J. Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

COMP 3000 Essay 1 2010 Question 11

2010-10-15T00:51:27Z

Smcilroy: added SAN and NAN introduction earlier in intro

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially. Businesses are increasingly choosing to archive and retain all the data they produce, and "store everything, forever"[[#Foot1|1]] is the common mantra of storage administrators. The storage industry has been able to keep up with the increasing demand with matching increases in storage capacity. Unfortunately, the interfaces between clients and storage devices have remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology.

Innovation in storage technology is especially pertinent to businesses that use network storage. The two dominant technologies of network storage, storage area networks (SAN) and network-attached storage (NAS), each have their own benefits and drawbacks, and both would benefit greatly from improvements in storage technology. Ideal improvements would provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object-Based Storage Devices (OSD) address these issues by design. Object storage uses objects that consist of data and metadata that describe the object. Objects are accessed with defined methods such as read and write and carry a unique ID. The storage devices themselves handle the underlying security, space allocation and basic storage routines.[[#Foot2|2]] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, ensured integrity of data through unique hash keys, and benefits in management and business intelligence from rich metadata, OSD can be seen as a viable way to improve the standard architectures of SAN and NAS.

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[[#Foot3|3]] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[[#Foot4|4]] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left entirely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must work out which block on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[[#Foot2|2]]. This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these dated commands.

== Overview of Object-Based Storage ==

Object-based storage is much younger than block-based storage: research on it started in the 1990s. See for example the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object-based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical storage. These objects also have metadata and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object-based storage with block-based storage that the design of objects is better suited than blocks to the needs of today's filesystems, especially networked filesystems.

== Changing Storage Needs ==

Storage needs have changed significantly since the first hard disks were developed in the 1950s and the interface was standardized in the 1970s. This means that the functionality of storage devices must also change to reflect these needs. Storage has become increasingly networked, and networked storage must deal with several issues. Firstly, the storage architecture must be able to scale to terabytes of data and beyond, with many servers and clients, while avoiding bottlenecks. The data stored on these networks has also become more sensitive. Personal information, such as financial records, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has increased, it becomes more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object-based storage is better suited to address these issues by design.

One application where the utility of object stores has become increasingly apparent is in SANs. SAN file systems are distributed; however, they provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. In the past, SANs were implemented on private Fibre Channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SANs of the past. Modern SANs can serve a much larger set of users, not all of whom can or should be trusted. This, in addition to the possible adoption of IP-based SAN solutions, makes data security a primary concern.[[#Foot5|5]] Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS storage. Both have their benefits and drawbacks. The key issues are managing metadata and ensuring data access speed as the systems grow.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or with the remapping of blocks for archiving or duplication. Building systems whose metadata scales becomes a major issue, while at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the client's access to files. This is done through a single NAS head, which usually has thousands of gigabytes of storage behind it.[[#Foot6|6]] All data traffic must flow through this single access point. The benefits of NAS lie in its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, passing all data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation management is handled on different storage system layers, and applications add policy and management metadata individually, so this metadata is spread throughout the system and becomes very hard to manage.

SANs, on the other hand, allow data access through fiber cables that connect directly to the storage. The storage management and file system is connected separately to both the client and the storage, which separates the data channel from the management channel, and it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Security must also be addressed, since clients must be trusted to access the data, which opens up a host of security issues.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and more scalable metadata handling. Each object comes with a set of access rules given to it by the management server. Metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[[#Foot1|1]] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block based file systems in archive solutions usually have no built in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and hurts scalability. OSDs ensure data integrity through mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The block containing the error may not even hold any data. This usually happens during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. The OSD does this by managing metadata and maintaining internal copies of that metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects are on different OSDs. In this way an OSD knows which space is used or unused and can scan and correct errors without losing data. In the event of a failure to recover a file or a number of files, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files efficiently.
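
The byte-range recovery described above can be sketched as a simple lookup. This is an illustration under stated assumptions, not how any particular OSD stores its layout: the <code>layout</code> mapping, the fixed <code>BLOCK_SIZE</code> and the function name <code>affected_ranges</code> are all hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def affected_ranges(layout, bad_block):
    """Return {object_id: [(start, end), ...]} for byte ranges held on bad_block.

    layout maps each object ID to the list of block numbers that hold its
    data, in object order. An empty result means the bad block is unused.
    """
    hits = {}
    for object_id, blocks in layout.items():
        for index, block in enumerate(blocks):
            if block == bad_block:
                start = index * BLOCK_SIZE
                hits.setdefault(object_id, []).append((start, start + BLOCK_SIZE))
    return hits
```

Because the device knows this mapping, a failed block translates into a specific byte range of a specific object, or into nothing at all if the block was unused, instead of forcing a full file system restore.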

OSDs have another powerful feature. Each object file has an associated hash key that is generated uniquely from the contents of the file. Thus the file can be verified for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The key can also be used in data management to flag duplicate data.[[#Foot1|1]]
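
As a rough sketch of how a content-derived key supports both verification and duplicate detection, the example below uses SHA-256 from Python's standard <code>hashlib</code> module. The choice of SHA-256 is an assumption made for illustration; the source does not say which hash function an OSD uses, and the function names are hypothetical.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Derive a key from the object's contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_key: str) -> bool:
    """True if the contents still match the key recorded when they were stored."""
    return content_key(data) == expected_key


def find_duplicates(objects):
    """Group object names whose contents hash to the same key.

    objects maps a name to its bytes; only groups with more than one
    member are returned.
    """
    by_key = {}
    for name, data in objects.items():
        by_key.setdefault(content_key(data), []).append(name)
    return [names for names in by_key.values() if len(names) > 1]
```

Any change to the stored bytes changes the key, so a mismatch flags corruption, while two objects with the same key can be flagged as duplicates.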

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SAN has traditionally run on fibre channels.[[#Foot7|7]] For the sake of security, running a SAN on fibre channels helps isolate its network, as the devices do not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures for SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated and are not allowed to communicate outside their respective zone. LUN masking is similar to zoning, but the two differ in the type of device that enforces them: switches utilize zoning, while disk array controllers use LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[[#Foot8|8]]
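
Conceptually, LUN masking amounts to a lookup table kept by the disk array controller. The sketch below is only an illustration of that idea: the table layout, the host identifiers and the function names are hypothetical, and real controllers configure masking through their own management interfaces.

```python
def visible_luns(masking_table, initiator):
    """Return the sorted LUNs an initiator (host) is allowed to see."""
    return sorted(masking_table.get(initiator, set()))


def can_access(masking_table, initiator, lun):
    """True only if the LUN has been unmasked for this initiator."""
    return lun in masking_table.get(initiator, set())


# Hypothetical masking table: host identifier -> set of unmasked LUNs.
example_table = {
    "host-a": {0, 1},
    "host-b": {2},
}
```

A host that does not appear in the table sees no LUNs at all, which is the default-deny behaviour that masking is meant to provide.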

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS device runs on has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSD devices handle security requests directly. The set of protocols used by OSD enables it to cover the four quadrants of security threats outlined above. Clients can access an OSD device by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[[#Foot9|9]] This can prevent accidental or even malicious access to an OSD, whether external or internal.
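
One way to picture a capability is as the identifying tuple plus a keyed tag that only the security manager and the OSD can compute. The following is a minimal sketch assuming an HMAC-based scheme built with Python's standard <code>hmac</code> and <code>hashlib</code> modules. It is not the actual wire format of the OSD security protocol; the <code>rights</code> field, the function names and the use of SHA-256 are assumptions added for illustration.

```python
import hashlib
import hmac


def issue_capability(secret_key, osd_name, partition_id, object_id, rights):
    """Security manager side: build the capability fields and sign them.

    The rights field is an illustrative addition to the
    (OSD name, partition ID, object ID) tuple described in the text.
    """
    fields = (osd_name, partition_id, object_id, rights)
    tag = hmac.new(secret_key, repr(fields).encode(), hashlib.sha256).hexdigest()
    return fields, tag


def validate_capability(secret_key, fields, tag):
    """OSD side: recompute the tag and compare it in constant time."""
    expected = hmac.new(secret_key, repr(fields).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

A client that alters any field, for example to point at a different object ID, cannot produce a matching tag without the shared key, so the OSD can reject both malicious requests and requests from misconfigured machines.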

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. But there remain challenges to its adoption in the industry. One is that it is currently needed only in high end business solutions, which prevents it from reaching smaller businesses.[[#Foot10|10]] But as newer features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data using the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancy will make it an attractive choice for storage administrators in the future.

==References==
1 Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

2 C. Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

3 IBM 350 disk storage unit, IBM Archives. [online] IBM Available at: <http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html> [Accessed 14 October 2010].

4 M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

5 Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

6 TechRepublic Guest Contributor, Foundations of Network Storage, Lesson Two: NAS. [online] Available at: <http://articles.techrepublic.com.com/5100-22_11-5841266.html> [Accessed 14 October 2010].

7 J. Tate, F. Lucchese, R. Moore. Introduction to Storage Area Networks. [online] Available at: <http://www.redbooks.ibm.com/redbooks/pdfs/sg245470.pdf> [Accessed 14 October 2010].

8 H. Yoshida. LUN Security Considerations for Storage Area Networks. [online] Available at: <http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf> [Accessed 14 October 2010].

9 M. Factor, D. Nagle, D. Naor, E. Riedel, J. Satran, 2005. The OSD Security Protocol. [online] Available at: <http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf> [Accessed 14 October 2010].

10 M. Factor, K. Meth, D. Naor, O. Rodeh, J. Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-15T00:34:29Z

Smcilroy:

== Last minute changes ==
Ok guys, so it's due early tomorrow. We have the essay pretty much completed aside from a few things.

First. Are we getting rid of the headings? Other groups have them in at the moment, I know the prof said the essay should read as if they weren't there but it might not hurt for them to be there.

Second. The essay needs to flow better. Some intro and outro sentences acknowledging the next section and referring to the previous ones would be nice.

Otherwise, what else remains?
--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

I'm trying to cleanup the references, is this format acceptable? --[[User:Dagar|Dagar]] 23:45, 14 October 2010 (UTC)
: Yes, that looks a lot better --[[User:Smcilroy|Smcilroy]] 00:34, 15 October 2010 (UTC)

== Tightening up the Intro ==
Hey everyone,

I think it might be useful to re-work the intro a bit so that it better represents the direction the essay has taken since then. Here's a quick mockup of a reworked intro. It could be expanded on in some parts and worked on, etc. I would like any comments, if you guys think this better represents the essay, or what you think needs changing in the introduction. Here it is:

:Storage needs have evolved over the past 60 years, and as a result the functionality expected from filesystems and storage solutions has evolved as well. The low level interface that a storage device implements, however, has remained mostly the same. A block based interface is still the most common mechanism for accessing storage devices. Recently, however, especially with the growth of networked storage architectures such as NAS and SAN, this interface needs to be reworked to accommodate changing needs. Object based storage is increasingly becoming an attractive alternative to block based storage. The design of object based storage devices (OSD), which store objects rather than blocks, easily associates data with meta-data. Objects can be created, destroyed, read from and written to, and each carries a unique ID. The device itself manages the physical space and can handle security on a per-object level. A storage network which is based on OSDs can provide better scalability without bottlenecks, better security with per-object access controls, and better integrity with unique hash keys. In this way, the OSD interface is looking increasingly attractive as a building block for filesystems, especially in the context of networked storage.

I think the main thing is that it brings up networked storage earlier and puts a bit more focus on it. I think the main arguments for object based storage is its applicability to large storage networks, and the advantages it has over block based architectures. For this reason I think the intro should put a bit more focus on it. Does that make sense? Any comments or suggestions you guys have are welcome.

--[[User:Mbingham|Mbingham]] 21:18, 14 October 2010 (UTC)

:I know what you mean, putting a focus on network storage is a good idea. Let me see if I can add your suggestions to the intro and maybe combine the two.--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

== Wikipedia Sources ==
I think we may want to replace the references to wikipedia with something more authoritative. [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open this massive pdf] from IBM supports the idea that fiber channels are the dominant infrastructure of SANs, but i'm not sure if it mentions how that is changing.

The wikipedia page for LUN masking has [http://www.sansecurity.com/san-security-faq.shtml this] as its reference for the definitions, there's also [http://technet.microsoft.com/en-us/library/cc758640(WS.10).aspx this] microsoft article and [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf this] paper from Hitachi. I'm not sure which of these is most relevant since I just did a quick google search and haven't really read up on LUN masking or zoning, so someone else would probably be better suited to decide which one if any to use.

How does that sound to everyone?

--[[User:Mbingham|Mbingham]] 02:55, 14 October 2010 (UTC)

:I agree, the Wikipedia references need to go. Whoever included those references should be able to find alternate sources from the ones you gave. --[[User:Smcilroy|Smcilroy]] 17:45, 14 October 2010 (UTC)

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the authors name and the date (I think) in parenthesis at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide, it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

:Sounds like a plan. I'll clean up/expand what I have written and get started with some initial stuff for the object sections. Anyone else is welcome to expand and edit as well.
:--[[User:Mbingham|Mbingham]] 00:44, 14 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham (I added a useful diagram here -Npradhan)
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage -Npradhan
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage -Npradhan
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion -Smcilroy

:Also, it would probably be useful for people to be reading over each other's work and making suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* SAN filesystem is a local network of multiple devices that operate on disk blocks and provide a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know what space is unused in this way
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials(capability) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security, those seem to be constant themes in what i've read about object stores.

:For the block storage overview, the two current standards for a block based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if i'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course any one else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope i'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, a fixed length sequence of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read to or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable sized, read/written to, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here's some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on google scholar you can access the pdf without even being inside carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there's 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, thursday or friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-14T23:12:39Z

Smcilroy:

== Last minute changes ==
Ok guys, so it's due early tomorrow. We have the essay pretty much completed aside from a few things.

First. Are we getting rid of the headings? Other groups have them in at the moment, I know the prof said the essay should read as if they weren't there but it might not hurt for them to be there.

Second. The essay needs to flow better. Some intro and outro sentences acknowledging the next section and referring to the previous ones would be nice.

Otherwise, what else remains?
--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

== Tightening up the Intro ==
Hey everyone,

I think it might be useful to re-work the intro a bit so that it better represents the direction the essay has taken since then. Here's a quick mockup of a reworked intro. It could be expanded on in some parts and worked on, etc. I would like any comments, if you guys think this better represents the essay, or what you think needs changing in the introduction. Here it is:

:Storage needs have evolved over the past 60 years, and as a result the functionality expected from filesystems and storage solutions has evolved as well. The low level interface that a storage device implements, however, has remained mostly the same. A block based interface is still the most common mechanism for accessing storage devices. Recently, however, especially with the growth of networked storage architectures such as NAS and SAN, this interface needs to be reworked to accommodate changing needs. Object based storage is increasingly becoming an attractive alternative to block based storage. The design of object based storage devices (OSD), which store objects rather than blocks, easily associates data with meta-data. Objects can be created, destroyed, read from and written to, and each carries a unique ID. The device itself manages the physical space and can handle security on a per-object level. A storage network which is based on OSDs can provide better scalability without bottlenecks, better security with per-object access controls, and better integrity with unique hash keys. In this way, the OSD interface is looking increasingly attractive as a building block for filesystems, especially in the context of networked storage.

I think the main thing is that it brings up networked storage earlier and puts a bit more focus on it. I think the main arguments for object based storage is its applicability to large storage networks, and the advantages it has over block based architectures. For this reason I think the intro should put a bit more focus on it. Does that make sense? Any comments or suggestions you guys have are welcome.

--[[User:Mbingham|Mbingham]] 21:18, 14 October 2010 (UTC)

:I know what you mean, putting a focus on network storage is a good idea. Let me see if I can add your suggestions to the intro and maybe combine the two.--[[User:Smcilroy|Smcilroy]] 23:12, 14 October 2010 (UTC)

== Wikipedia Sources ==
I think we may want to replace the references to wikipedia with something more authoritative. [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open this massive pdf] from IBM supports the idea that fiber channels are the dominant infrastructure of SANs, but i'm not sure if it mentions how that is changing.

The wikipedia page for LUN masking has [http://www.sansecurity.com/san-security-faq.shtml this] as its reference for the definitions, there's also [http://technet.microsoft.com/en-us/library/cc758640(WS.10).aspx this] microsoft article and [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf this] paper from Hitachi. I'm not sure which of these is most relevant since I just did a quick google search and haven't really read up on LUN masking or zoning, so someone else would probably be better suited to decide which one if any to use.

How does that sound to everyone?

--[[User:Mbingham|Mbingham]] 02:55, 14 October 2010 (UTC)

:I agree, the Wikipedia references need to go. Whoever included those references should be able to find alternate sources from the ones you gave. --[[User:Smcilroy|Smcilroy]] 17:45, 14 October 2010 (UTC)

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the author's name and the date (I think) in parentheses at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide, it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

:Sounds like a plan. I'll clean up/expand what I have written and get started with some initial stuff for the object sections. Anyone else is welcome to expand and edit as well.
:--[[User:Mbingham|Mbingham]] 00:44, 14 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham (I added a useful diagram here -Npradhan)
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage -Npradhan
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage -Npradhan
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion -Smcilroy

:Also, it would probably be useful for people to be reading over each other's work and making suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* A SAN filesystem runs on a local network of multiple devices that operate on disk blocks, and provides a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know what space is unused in this way
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials (capabilities) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security, those seem to be constant themes in what i've read about object stores.

:For the block storage overview, the two current standards for a block based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if i'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course any one else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope i'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, a fixed length sequence of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read to or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable sized, read/written to, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here's some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on google scholar you can access the pdf without even being inside carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there's 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, thursday or friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems has changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

COMP 3000 Essay 1 2010 Question 11

2010-10-14T21:25:56Z

Smcilroy: /* Changing Storage Needs */

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has been able to keep up with demand with matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object Based Storage Devices (OSD) are designed to address these issues. Object storage uses objects that consist of data and meta-data that describes the object. Objects are accessed with defined methods such as read and write and carry a unique ID. The device manages all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object level access, ensured integrity of data with unique hash keys, and benefits in management and business intelligence with rich meta-data, OSD can be seen as a viable alternative to improve the standard architectures of storage area network (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read from or write to blocks on the disk. This means that the goal of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left completely in the space of the operating system's filesystem. For example, when the filesystem wants to write data to a file it must translate that into which block on the disk to write to. In this way, the scope of a filesystem extends from high level constructs like files to low level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.
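The translation from file writes to block writes can be illustrated with a minimal sketch. Everything in it is hypothetical: the names <code>BlockDevice</code> and <code>TinyFilesystem</code>, the 512-byte block size, and the sequential allocator are assumptions made for illustration, not part of any real standard or driver. It only aims to show that the device sees whole blocks while the filesystem carries all of the file-level bookkeeping.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


class BlockDevice:
    """Hypothetical block device: it only reads and writes whole blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, index):
        return self.blocks[index]

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block device only accepts whole blocks")
        self.blocks[index] = bytes(data)


class TinyFilesystem:
    """Hypothetical filesystem layer: maps file names and offsets to blocks."""

    def __init__(self, device):
        self.device = device
        self.block_map = {}  # file name -> list of block indices, in file order
        self.next_free = 0   # naive allocator: hand out blocks sequentially

    def write(self, name, offset, data):
        blocks = self.block_map.setdefault(name, [])
        end = offset + len(data)
        # Allocate blocks until the file is long enough to hold the write.
        while len(blocks) * BLOCK_SIZE < end:
            if self.next_free >= len(self.device.blocks):
                raise OSError("device is full")
            blocks.append(self.next_free)
            self.next_free += 1
        pos = offset
        while pos < end:
            block_no, within = divmod(pos, BLOCK_SIZE)
            chunk = min(BLOCK_SIZE - within, end - pos)
            # Read-modify-write: the device cannot update part of a block.
            buf = bytearray(self.device.read_block(blocks[block_no]))
            buf[within:within + chunk] = data[pos - offset:pos - offset + chunk]
            self.device.write_block(blocks[block_no], bytes(buf))
            pos += chunk

    def read(self, name, offset, length):
        blocks = self.block_map[name]
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            block_no, within = divmod(pos, BLOCK_SIZE)
            chunk = min(BLOCK_SIZE - within, end - pos)
            out += self.device.read_block(blocks[block_no])[within:within + chunk]
            pos += chunk
        return bytes(out)
```

In this sketch a write that starts near the end of one block spills into the next, and it is the filesystem, not the device, that has to notice and split the request.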

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Overview of Object-Based Storage ==
'''Anyone feel free to expand on this section'''

Unlike block-based storage, whose design reaches back to the 1950s, object-based storage research goes back to the 1990s. See for example the work of Gibson et al in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of the interface presenting the filesystem with blocks to read from and write to, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto the physical storage media. These objects also have meta-data and access controls immediately associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object based storage with block based storage that the design of objects is more suited than blocks to the needs of today's filesystems, especially networked filesystems.
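For contrast with the block interface, the object interface described above might be sketched as follows. This is only an illustration under stated assumptions: the class <code>ObjectStorageDevice</code> and its method names are hypothetical and do not reproduce the SCSI OSD command set, and an in-memory dictionary stands in for the device's private mapping of objects onto physical media.

```python
import itertools


class ObjectStorageDevice:
    """Hypothetical OSD: clients name objects by ID; the device owns the layout."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._data = {}   # object ID -> bytearray (stands in for on-device layout)
        self._attrs = {}  # object ID -> metadata attributes stored with the object

    def create(self, **attributes):
        object_id = next(self._ids)
        self._data[object_id] = bytearray()
        self._attrs[object_id] = dict(attributes)
        return object_id

    def write(self, object_id, offset, data):
        buf = self._data[object_id]
        shortfall = offset + len(data) - len(buf)
        if shortfall > 0:
            buf.extend(bytes(shortfall))  # objects are variable sized, so grow
        buf[offset:offset + len(data)] = data

    def read(self, object_id, offset, length):
        return bytes(self._data[object_id][offset:offset + length])

    def get_attributes(self, object_id):
        return dict(self._attrs[object_id])

    def destroy(self, object_id):
        del self._data[object_id]
        del self._attrs[object_id]
```

The caller never sees a block number: it creates an object, gets back an ID, and reads or writes byte ranges within that object, while allocation stays inside the device.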

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. This means that the functionality of storage devices must also change to reflect these needs. Storage has become increasingly networked. Networked storage must deal with several issues. Firstly, the storage architecture must be able to scale to many terabytes of data or more with many servers and clients without getting bottlenecked. The data stored on these networks has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it becomes more important to ensure the data's integrity and security. Block based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object based storage is more suited to address these issues because of how it has been designed.

One application where the utility of object stores has become increasingly apparent is in SAN systems. SAN file systems are distributed; however, they provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. In the past, SANs were implemented on private fiber channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SAN networks of the past. Modern SAN networks can serve a much larger set of users, not all of whom can be trusted. This, in addition to the possible adoption of IP based SAN solutions, makes data security a primary concern[http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf]. Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS storage. Both have benefits and drawbacks. The key issues are managing metadata and ensuring data access speed as the systems grow.

Most block based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems to scale with the metadata becomes a major issue. But at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head, which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefit of the NAS file system lies in its ability to set block access, manage security, prevent unauthorized access to files, and use metadata to map blocks into files for the client. However, having all the data pass through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation management on different storage system layers, and applications that individually add policy and management metadata, are spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, allow data access through fiber cables directly accessing the storage. The storage management and file system is connected separately to both the client and the storage, separating the data channel from the management channel, and acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Security must also be addressed, since the clients must be trusted to access the data.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and scalability with metadata. Each object comes with a set of access rules given to it by the management server and metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block based file systems in archive solutions usually have no built in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and scalability. OSDs ensure data integrity with mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The affected block may not even contain any data. This usually happens during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only needs to manage objects. This is done through managing metadata as well as maintaining internal copies of that metadata. Hence, OSDs have knowledge of their object layout even when one or more groups of objects are on different OSDs. In this way OSDs know what space is being used or unused and can scan and correct errors without losing data. In the event of a failure in recovering a file or a number of files, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files in an efficient manner.

OSDs have another powerful feature. Each object file has an associated hash key that is generated from the contents of the file. Thus the file can be verified for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The key can also be used in data management to flag duplicate data. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf]
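The idea of a content-derived hash key can be sketched in a few lines. The choice of SHA-256 and the function names <code>content_key</code>, <code>verify</code> and <code>find_duplicates</code> are assumptions for illustration; the cited source does not say which hash function a given OSD product uses.

```python
import hashlib


def content_key(data):
    """Return a hex digest derived only from the object's contents.

    SHA-256 is an assumed choice for this sketch.
    """
    return hashlib.sha256(data).hexdigest()


def verify(data, stored_key):
    """True if the contents still match the key recorded when they were stored."""
    return content_key(data) == stored_key


def find_duplicates(objects):
    """Group object names by content key; groups larger than one are duplicates."""
    by_key = {}
    for name, data in objects.items():
        by_key.setdefault(content_key(data), []).append(name)
    return [sorted(names) for names in by_key.values() if len(names) > 1]
```

Because the key depends only on the contents, a single changed byte produces a different key, and two objects with identical contents produce the same key, which is what makes both the integrity check and the duplicate flagging possible.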

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental, and malicious. Block based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SAN has traditionally run on fibre channels. [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open]
For the sake of security, running a SAN on fibre channels helps isolate its network, as they do not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures for SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated and are not allowed to communicate outside their respective zone. LUN masking is similar to zoning; however, the two differ in the type of device that enforces them. Switches utilize zoning while disk array controllers use LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking. [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf]
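As a rough illustration of LUN masking, the controller can be thought of as keeping a table of which hosts may see which LUNs. The class <code>DiskArrayController</code>, its methods, and the host names below are all hypothetical; real controllers identify hosts and configure masks in vendor-specific ways that this sketch does not attempt to reproduce.

```python
class DiskArrayController:
    """Hypothetical controller that hides LUNs from hosts not listed in its mask."""

    def __init__(self):
        self._mask = {}  # host identifier -> set of LUNs that host may see

    def allow(self, host, lun):
        self._mask.setdefault(host, set()).add(lun)

    def visible_luns(self, host):
        # Hosts with no entry see nothing at all.
        return sorted(self._mask.get(host, set()))

    def check_access(self, host, lun):
        return lun in self._mask.get(host, set())
```

Note that the granularity here is an entire logical unit per host. Nothing in the table says anything about individual files or objects, which is the gap that per-object security in OSDs is meant to close.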

NAS has its own vulnerabilities, but as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS device runs has access control configurations, much like other traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSD devices handle security requests directly. The set of protocols used by OSD enables it to cover the four quadrants of security threats outlined above. Clients can access an OSD device by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object. [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD externally or internally.
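One way to picture how a device can validate a capability on its own is with a keyed hash. The sketch below is an assumption-laden illustration, not the OSD security protocol itself: it assumes a secret shared between a security manager and the OSD, uses HMAC-SHA256, and adds a hypothetical <code>rights</code> field alongside the (OSD name, partition ID, object ID) tuple mentioned above. The function names are likewise invented for this example.

```python
import hashlib
import hmac


def _message(osd_name, partition_id, object_id, rights):
    # Assumed encoding for this sketch; a real protocol defines its own format.
    return f"{osd_name}|{partition_id}|{object_id}|{rights}".encode()


def issue_capability(secret, osd_name, partition_id, object_id, rights):
    """Security manager side: sign the object tuple and the rights granted."""
    tag = hmac.new(secret, _message(osd_name, partition_id, object_id, rights),
                   hashlib.sha256).hexdigest()
    return {"osd_name": osd_name, "partition_id": partition_id,
            "object_id": object_id, "rights": rights, "tag": tag}


def validate_capability(secret, capability, requested_right):
    """OSD side: recompute the tag, then check the requested right is granted."""
    expected = hmac.new(
        secret,
        _message(capability["osd_name"], capability["partition_id"],
                 capability["object_id"], capability["rights"]),
        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, capability["tag"]):
        return False  # forged or altered capability
    return requested_right in capability["rights"]
```

A client that edits the object ID or the rights in its capability cannot produce a matching tag without the secret, so the device can reject the request without consulting any other server.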

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. But there remain challenges to its adoption in the industry. One is that it is only needed in high end business solutions at the moment, preventing it from reaching smaller businesses.[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf] But as newer features are added and the standards mature we will see an increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data using the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancy will make it an attractive choice for storage administrators in the future.

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

[7] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[8] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[9] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[10] [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open Storage Area Network]

[11] [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf Fibre Channel zoning]

[12] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

[13] Michael Factor, Kalman Meth, Dalit Naor, Ohad Rodeh, Julian Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] IBM Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-14T17:45:02Z

Smcilroy:

== Wikipedia Sources ==
I think we may want to replace the references to wikipedia with something more authoritative. [http://www.redbooks.ibm.com/abstracts/sg245470.html?Open this massive pdf] from IBM supports the idea that fiber channels are the dominant infrastructure of SANs, but i'm not sure if it mentions how that is changing.

The wikipedia page for LUN masking has [http://www.sansecurity.com/san-security-faq.shtml this] as its reference for the definitions, there's also [http://technet.microsoft.com/en-us/library/cc758640(WS.10).aspx this] microsoft article and [http://www.it.hds.com/pdf/wp91_san_lun_secur.pdf this] paper from Hitachi. I'm not sure which of these is most relevant since I just did a quick google search and haven't really read up on LUN masking or zoning, so someone else would probably be better suited to decide which one if any to use.

How does that sound to everyone?

--[[User:Mbingham|Mbingham]] 02:55, 14 October 2010 (UTC)

:I agree, the Wikipedia references need to go. Whoever included those references should be able to find alternate sources from the ones you gave. --[[User:Smcilroy|Smcilroy]] 17:45, 14 October 2010 (UTC)

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the author's name and the date (I think) in parentheses at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide, it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

:Sounds like a plan. I'll clean up/expand what I have written and get started with some initial stuff for the object sections. Anyone else is welcome to expand and edit as well.
:--[[User:Mbingham|Mbingham]] 00:44, 14 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham (I added a useful diagram here -Npradhan)
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage -Npradhan
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage -Npradhan
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion -Smcilroy

:Also, it would probably be useful for people to be reading over each other's work and making suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something i've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* SAN filesystem is a local network of multiple devices that operate on disk blocks and provides a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know what space is unused in this way
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials (capability) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security, those seem to be constant themes in what i've read about object stores.

:For the block storage overview, the two current standards for a block based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if i'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course any one else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope i'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, a fixed length sequence of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read to or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable sized, read/written to, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here's some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on google scholar you can access the pdf without even being inside carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there's 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, thursday or friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

COMP 3000 Essay 1 2010 Question 11

2010-10-14T17:40:06Z

Smcilroy: /* Introduction */

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has been able to keep up with demand with matching increases in storage capacity. Unfortunately, the interfaces between clients and storage devices have remained unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object Based Storage Devices (OSD) address these issues through their design. Object storage uses objects that consist of data and meta-data that describes the object. Objects are accessed with defined methods such as read and write, and each carries a unique ID. The devices manage all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, ensured integrity of data with unique hash keys, and benefits in management and business intelligence from rich meta-data, OSD can be seen as a viable alternative that improves on the standard architectures of storage area network (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed-length sequences of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read or write blocks on the disk. This means that the goal of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left completely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.
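
The translation step mentioned above can be illustrated with a small sketch. It assumes a 512-byte block size and a per-file list of disk block numbers; both are made up for the example, and real filesystems use more elaborate structures such as inodes or extent trees.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


def blocks_for_write(file_block_list, offset, length):
    """Split a write at a byte offset into (disk_block, start_in_block, byte_count) pieces.

    file_block_list is the hypothetical per-file table of disk block numbers
    that the filesystem, not the disk, has to maintain.
    """
    pieces = []
    position = offset
    remaining = length
    while remaining > 0:
        index = position // BLOCK_SIZE   # which block of the file
        start = position % BLOCK_SIZE    # where inside that block
        count = min(BLOCK_SIZE - start, remaining)
        pieces.append((file_block_list[index], start, count))
        position += count
        remaining -= count
    return pieces
```

A 100-byte write at offset 500 straddles two blocks, so the filesystem has to issue two separate block operations and remember which disk blocks belong to the file.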

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Overview of Object-Based Storage ==
'''Anyone feel free to expand on this section'''

Unlike block-based storage, whose design reaches back to the 1950s, object-based storage research goes back to the 1990s. See for example the work of Gibson et al in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical blocks of storage. These objects also have meta-data and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object based storage with block based storage that the design of objects is better suited to the needs of today's filesystems than that of blocks.
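
To make the contrast concrete, here is a toy sketch of an object interface layered over blocks inside the device. The class, its methods and the 4-byte block size are all hypothetical and are not taken from the OSD specification; the point is only that callers create, read, write and delete objects and never see a block number.

```python
import itertools


class ToyObjectDevice:
    """A toy object-based device: callers see objects, never block numbers."""

    BLOCK_SIZE = 4  # deliberately tiny so the mapping is easy to follow

    def __init__(self, num_blocks):
        self._blocks = [bytes(self.BLOCK_SIZE) for _ in range(num_blocks)]
        self._free = list(range(num_blocks))
        self._objects = {}  # object ID -> block list, size and attributes
        self._ids = itertools.count(1)

    def create(self, **attributes):
        """Create an empty object carrying the given metadata attributes."""
        object_id = next(self._ids)
        self._objects[object_id] = {"blocks": [], "size": 0, "attrs": dict(attributes)}
        return object_id

    def write(self, object_id, data):
        """Replace the object's contents; the device picks the blocks itself."""
        obj = self._objects[object_id]
        needed = -(-len(data) // self.BLOCK_SIZE)  # ceiling division
        if needed > len(self._free) + len(obj["blocks"]):
            raise OSError("device full")
        self._free.extend(obj["blocks"])  # release the old blocks first
        obj["blocks"] = [self._free.pop(0) for _ in range(needed)]
        for i, block_number in enumerate(obj["blocks"]):
            chunk = data[i * self.BLOCK_SIZE:(i + 1) * self.BLOCK_SIZE]
            self._blocks[block_number] = chunk.ljust(self.BLOCK_SIZE, b"\0")
        obj["size"] = len(data)

    def read(self, object_id):
        """Return the object's bytes, trimmed to its recorded size."""
        obj = self._objects[object_id]
        raw = b"".join(self._blocks[b] for b in obj["blocks"])
        return raw[:obj["size"]]

    def delete(self, object_id):
        """Destroy the object and return its blocks to the free list."""
        obj = self._objects.pop(object_id)
        self._free.extend(obj["blocks"])

    def attributes(self, object_id):
        return dict(self._objects[object_id]["attrs"])
```

Space allocation and the attribute table live inside the device, so a filesystem built on top of it no longer has to track free blocks.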

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. This means that the functionality of storage devices must also change to reflect these needs. Firstly, the scale of data being stored, both personally and by organizations, has gone up by orders of magnitude. Today personal hard drives routinely store terabytes of data, and massive networks store even more. In fact, "a survey of over one thousand ASNP members indicates that 20% of them manage over 100 terabytes of data" (Seagate Research, 2005).[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf] Data has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it becomes more important to ensure the data's integrity and security. Block based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object based storage is more suited to address these issues because of how it has been designed.

One application where the utility of object stores has become increasingly apparent is in SAN (Storage Area Network) systems. SAN file systems are distributed; however, they provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. In the past, SANs were implemented on private fiber channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. The lack of security concern is one of the main reasons that block storage was a viable option for SAN networks of the past. Modern SAN networks can serve a much larger set of users, not all of whom can be trusted. This, in addition to the possible adoption of IP based SAN solutions, makes data security a primary concern[http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf]. Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS. Both have their benefits and drawbacks, and the key issues are managing metadata and ensuring data access speed as the systems grow.

Most block based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems to scale with the metadata becomes a major issue. But at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefits of the NAS file system come from its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, having all the data pass through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation management on different storage system layers, and applications that individually add policy and management metadata, are spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, allow data access through fiber cables that connect directly to the storage. The storage management and file system is connected separately to both the client and the storage, which separates the data channel from the management channel, and it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordination of this shared access leads to scalability problems. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Security must also be addressed, because the clients must be trusted to access the data, which opens up a host of security issues.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and scalability with metadata. Each object comes with a set of access rules given to it by the management server and metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and scalability. OSDs ensure data integrity with mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The affected block may not even contain any data. This usually happens during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. The OSD does this by managing metadata and maintaining internal copies of that metadata. Hence, OSDs have knowledge of their object layout even when one or more groups of objects are on different OSDs. In this way OSDs know what space is used or unused and can scan and correct errors without losing data. When a file or a number of files cannot be recovered, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files efficiently.

OSDs have another powerful feature. Each object has an associated hash key that is derived from, and unique to, the contents of the file. The file can therefore be verified both for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf]

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel, although this trend is changing. [http://en.wikipedia.org/wiki/Storage_area_network]
For the sake of security, running a SAN on Fibre Channel helps isolate its network, since it does not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures in SAN systems. Zoning allocates a certain amount of storage to clients. The resulting zones are isolated, and their members are not allowed to communicate outside their respective zone. LUN masking is similar to zoning, but the two differ in the type of device that enforces them: switches implement zoning, while disk array controllers implement LUN masking. A disk array controller is a device that manages the physical disk drives and presents them as logical units, each identified by a logical unit number (LUN); hence the term LUN masking. [http://en.wikipedia.org/wiki/Fibre_Channel_zoning]

NAS has its own vulnerabilities, but as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system that a NAS device runs has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by an OSD enables it to cover the four quadrants of security threats outlined above. Clients access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object. [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD, whether external or internal.

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. There remain challenges to its adoption in industry, however. One is that it is currently needed only in high-end business solutions, which prevents it from reaching smaller businesses.[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf] But as new features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for stored data. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data through the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancy will make object storage an attractive choice for storage administrators in the future.

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] [http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf Seagate]

[7] Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

[8] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[9] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[10] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[11] [http://en.wikipedia.org/wiki/Storage_area_network Storage Area Network]

[12] [http://en.wikipedia.org/wiki/Fibre_Channel_zoning Fibre Channel zoning]

[13] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

[14] Michael Factor, Kalman Meth, Dalit Naor, Ohad Rodeh, Julian Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] IBM Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

COMP 3000 Essay 1 2010 Question 11

2010-10-14T17:36:05Z

Smcilroy: /* References */

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has kept up with demand through matching increases in storage capacity. Unfortunately, the interfaces between clients and storage devices have remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little metadata is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while ensuring the security and data access speed of traditional storage solutions.

Object-Based Storage Devices (OSDs) address these issues by design. Object storage uses objects that consist of data and metadata that describes the object. Objects are accessed with defined methods such as read and write, and each carries a unique ID. The devices themselves manage all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, and ensured data integrity through a unique hash key for each object, along with benefits in management and business intelligence from rich metadata, OSD can be seen as a viable alternative that improves on the standard architectures of storage area networks (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left entirely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Overview of Object-Based Storage ==
'''Anyone feel free to expand on this section'''

Unlike block-based storage, whose design reaches back to the 1950s, object-based storage research goes back to the 1990s. See for example the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object-based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of presenting the filesystem with blocks to read and write, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical blocks of storage. These objects also have metadata and access controls directly associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object-based storage with block-based storage that the design of objects is more suited to the needs of today's filesystems than that of blocks.
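The difference between the two interfaces can be illustrated with a small sketch. This is not the ANSI OSD command set; the class names, method names and the 512-byte block size are assumptions made up for illustration. It only shows that with a block interface the caller addresses fixed-size blocks by number, whereas with an object interface the device allocates blocks and keeps attributes itself.

```python
import itertools

BLOCK_SIZE = 512  # bytes per block; an illustrative value, not from the essay


class BlockDevice:
    """Block interface: the caller addresses fixed-size blocks by number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, n):
        return self.blocks[n]

    def write_block(self, n, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block writes must be exactly BLOCK_SIZE bytes")
        self.blocks[n] = bytes(data)


class ObjectDevice:
    """Object interface (hypothetical): the device maps objects to blocks itself."""

    def __init__(self, num_blocks):
        self._disk = BlockDevice(num_blocks)
        self._free = list(range(num_blocks))  # block numbers not in use
        self._objects = {}  # object ID -> {"blocks", "size", "meta"}
        self._ids = itertools.count(1)

    def create(self, **meta):
        oid = next(self._ids)
        self._objects[oid] = {"blocks": [], "size": 0, "meta": dict(meta)}
        return oid

    def write(self, oid, data):
        obj = self._objects[oid]
        needed = -(-len(data) // BLOCK_SIZE)  # ceiling division
        if needed > len(self._free) + len(obj["blocks"]):
            raise OSError("not enough free blocks")
        # Release the old blocks, then allocate as many as the new data needs.
        self._free.extend(obj["blocks"])
        obj["blocks"] = [self._free.pop() for _ in range(needed)]
        for i, blk in enumerate(obj["blocks"]):
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            self._disk.write_block(blk, chunk.ljust(BLOCK_SIZE, b"\0"))
        obj["size"] = len(data)

    def read(self, oid):
        obj = self._objects[oid]
        raw = b"".join(self._disk.read_block(b) for b in obj["blocks"])
        return raw[:obj["size"]]

    def get_attributes(self, oid):
        return dict(self._objects[oid]["meta"])

    def destroy(self, oid):
        self._free.extend(self._objects.pop(oid)["blocks"])
```

A filesystem built on `ObjectDevice` never sees block numbers: it calls `create`, `write`, `read` and `destroy` on object IDs, which is the higher level of abstraction described above.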

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. This means that the functionality of storage devices must also change to reflect these needs. Firstly, the scale of data being stored, both personally and by organizations, has gone up by orders of magnitude. Today personal hard drives routinely store terabytes of data, and massive networks store even more. In fact, "a survey of over one thousand ASNP members indicates that 20% of them manage over 100 terabytes of data" (Seagate Research, 2005).[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf] Data has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it becomes more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object-based storage is more suited to address these issues because of how it has been designed.

One application where the utility of object stores has become increasingly apparent is in SAN (Storage Area Network) systems. SAN file systems are distributed, but they provide a single system image of the file system. This means that a local user need not be concerned with where the data is physically stored, since a level of abstraction separates the user from the physical location of the data. In the past, SANs were implemented on private Fibre Channel networks, which were designed to emulate local storage media. As long as the network remained exclusive, it could be assumed that all the clients could be trusted, so security was not a primary concern. This lack of security concerns is one of the main reasons that block storage was a viable option for the SANs of the past. Modern SANs can serve a much larger set of users, not all of whom can be trusted. This, in addition to the possible adoption of IP-based SAN solutions, makes data security a primary concern[http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf]. Object stores can make user privilege management a much more manageable task, since each object can 'know' who is allowed to access it.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems are built on two main technologies, SAN and NAS. Both have their benefits and drawbacks. The key issues for each are managing metadata and maintaining data access speed as the systems grow.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or with the remapping of blocks for archiving or duplication. Building systems whose metadata handling scales becomes a major issue, yet at the same time the current speeds of block-based storage need to be maintained.

NAS is a file-level system that coordinates between file blocks and clients' access to files. This is done through a single NAS head, which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefits of the NAS file system lie in its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, passing all data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation is managed on different storage system layers, and applications add their own policy and management metadata, so metadata ends up spread throughout the system and becomes very hard to manage.

SANs, on the other hand, allow clients to access storage directly over fibre cables. The storage management and file system component is connected separately to both the client and the storage, which separates the data channel from the management channel; it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems gain scalability from shared access, coordinating that shared access leads to scalability problems of its own. File systems must coordinate the allocation of blocks, and for clients to share read-write access, they must coordinate their usage of data blocks through metadata. Security must also be addressed, since clients must be trusted to access the data directly, which opens up a host of security issues.

Object storage makes it possible to operate a SAN-style setup with direct access to data while offering better security and more scalable metadata handling. Each object carries a set of access rules assigned by the management server. Metadata is stored directly with each data object and is automatically carried between layers and across devices, while space allocation and management metadata are the responsibility of the storage device. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded together, reducing server overhead and processing, and allows for larger clusters of storage than traditional block-based interfaces support.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds complexity when file systems are used for archiving and hinders scalability. OSDs ensure data integrity through mechanisms that operate differently from those of block stores.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The faulty block may not even contain any data. Such errors are usually encountered during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. The OSD achieves this by managing metadata and by maintaining internal copies of that metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects reside on different OSDs. In this way OSDs know which space is in use and which is unused, and can scan for and correct errors without losing data. When a file or a number of files cannot be recovered, traditional systems may have to do a complete file system restore. An OSD's awareness of its object layout, however, enables it to recover data specific to a byte range and thus restore files efficiently.

OSDs have another powerful feature. Each object has an associated hash key that is derived from, and unique to, the contents of the file. The file can therefore be verified both for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf]
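As a rough sketch of how a content-derived hash key supports both verification and duplicate detection, consider the following. The choice of SHA-256 and the `HashedStore` class are assumptions for illustration; the sources cited here do not say which hash function a given OSD product uses.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Derive a key from the object's contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


class HashedStore:
    """Hypothetical store that keeps one copy of the data per content key."""

    def __init__(self):
        self._by_key = {}  # content key -> stored bytes
        self._names = {}   # object name -> content key

    def put(self, name, data):
        key = content_key(data)
        duplicate = key in self._by_key  # same contents already stored?
        self._by_key.setdefault(key, data)
        self._names[name] = key
        return key, duplicate

    def verify(self, name):
        """Recompute the hash of the stored bytes and compare with the recorded key."""
        key = self._names[name]
        return content_key(self._by_key[key]) == key
```

Storing the same bytes under two names reports the second `put` as a duplicate, and `verify` returns `False` if the stored bytes no longer match the key recorded when the object was written.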

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel, although this trend is changing. [http://en.wikipedia.org/wiki/Storage_area_network]
For the sake of security, running a SAN on Fibre Channel helps isolate its network, since it does not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures in SAN systems. Zoning allocates a certain amount of storage to clients. The resulting zones are isolated, and their members are not allowed to communicate outside their respective zone. LUN masking is similar to zoning, but the two differ in the type of device that enforces them: switches implement zoning, while disk array controllers implement LUN masking. A disk array controller is a device that manages the physical disk drives and presents them as logical units, each identified by a logical unit number (LUN); hence the term LUN masking. [http://en.wikipedia.org/wiki/Fibre_Channel_zoning]
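The two checks can be pictured as simple table lookups. The sketch below is a simplification with made-up host, array and zone names; real switches and array controllers are configured through vendor tools rather than code like this.

```python
# Hypothetical fabric zoning table (enforced by the switch):
# zone name -> set of member ports.
ZONES = {
    "zone_a": {"host1", "array1"},
    "zone_b": {"host2", "array1"},
}

# Hypothetical mask table (enforced by the disk array controller):
# (array, host) -> set of LUNs that host is allowed to see.
LUN_MASKS = {
    ("array1", "host1"): {0, 1},
    ("array1", "host2"): {2},
}


def share_zone(a, b):
    """Two ports may communicate only if some zone contains both of them."""
    return any(a in members and b in members for members in ZONES.values())


def visible_luns(array, host):
    """LUNs a host can reach: it must pass zoning first, then the LUN mask."""
    if not share_zone(host, array):
        return set()
    return set(LUN_MASKS.get((array, host), set()))
```

Note that both checks sit in the network and the array controller, not in the stored data itself, which is the contrast with OSDs drawn in this section.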

NAS has its own vulnerabilities, but as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system that a NAS device runs has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by an OSD enables it to cover the four quadrants of security threats outlined above. Clients access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object. [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD, whether external or internal.
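One way to picture a capability is as the (OSD name, partition ID, object ID) tuple plus a keyed tag that only the issuer and the OSD can compute. The sketch below assumes HMAC-SHA256, a key shared between a security manager and the OSD, and a `rights` field; these are illustrative assumptions and not the wire format of the actual OSD security protocol.

```python
import hashlib
import hmac

# Hypothetical secret shared by the security manager and the OSD.
SHARED_KEY = b"demo-key-shared-by-manager-and-osd"


def _tag(key, osd_name, partition_id, object_id, rights):
    """Keyed tag over the capability fields (HMAC-SHA256 is an assumed choice)."""
    msg = "|".join([osd_name, str(partition_id), str(object_id), rights]).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def issue_capability(osd_name, partition_id, object_id, rights, key=SHARED_KEY):
    """Security manager side: describe the object and the allowed operations."""
    return {
        "osd_name": osd_name,
        "partition_id": partition_id,
        "object_id": object_id,
        "rights": rights,  # comma-separated, e.g. "read" or "read,write"
        "tag": _tag(key, osd_name, partition_id, object_id, rights),
    }


def osd_allows(cap, requested_op, key=SHARED_KEY):
    """OSD side: recompute the tag and check the requested operation."""
    expected = _tag(key, cap["osd_name"], cap["partition_id"],
                    cap["object_id"], cap["rights"])
    if not hmac.compare_digest(expected, cap["tag"]):
        return False  # forged or tampered capability
    return requested_op in cap["rights"].split(",")
```

Because the device itself validates the tag, a client that alters the object ID or the rights, whether by accident or on purpose, presents a capability that no longer verifies and is refused.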

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. There remain challenges to its adoption in industry, however. One is that it is currently needed only in high-end business solutions, which prevents it from reaching smaller businesses.[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf] But as new features are added and the standards mature, we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for stored data. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data through the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancy will make object storage an attractive choice for storage administrators in the future.

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] [http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf Seagate]

[7] Satran and Teperman, Object Store Based SAN File Systems. [online] IBM Labs Available at: <http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf> [Accessed 14 October 2010].

[8] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[9] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[10] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[11] [http://en.wikipedia.org/wiki/Storage_area_network Storage Area Network]

[12] [http://en.wikipedia.org/wiki/Fibre_Channel_zoning Fibre Channel zoning]

[13] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

[14] Michael Factor, Kalman Meth, Dalit Naor, Ohad Rodeh, Julian Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] IBM Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-14T02:36:34Z

Smcilroy:

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the author's name and the date (I think) in parentheses at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide, it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

:Sounds like a plan. I'll clean up/expand what I have written and get started with some initial stuff for the object sections. Anyone else is welcome to expand and edit as well.
:--[[User:Mbingham|Mbingham]] 00:44, 14 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion -Smcilroy

:Also, it would probably be useful for people to read over each other's work and make suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High-latency protocols (over TCP/IP) that clog LANs
- Not suitable for data transfer intensive apps

* A SAN filesystem is a local network of multiple devices that operate on disk blocks and provide a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued on ratifying the OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know which space is unused in this way
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* A computer system can access an OSD device by providing cryptographically secure credentials (capabilities) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security, those seem to be constant themes in what i've read about object stores.

:For the block storage overview, the two current standards for a block based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if i'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course any one else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope i'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, a fixed length sequence of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read to or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable sized, read/written to, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here's some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on google scholar you can access the pdf without even being inside carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there's 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, thursday or friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

COMP 3000 Essay 1 2010 Question 11

2010-10-14T02:33:42Z

Smcilroy: added conclusion

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has been able to keep up with demand with matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object Based Storage Devices (OSD) address these issues because of how they are designed. Object storage uses objects that consist of data and meta-data that describes the object. Objects are accessed with defined methods such as read and write, and each carries a unique ID. The devices manage all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, and assured data integrity through a unique hash key for each object, along with management and business intelligence benefits from rich meta-data, OSD can be seen as a viable way to improve on the standard architectures of storage area networks (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read or write blocks on the disk. This means that the goal of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left completely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high level constructs like files to low level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.
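The translation step described above can be illustrated with a minimal Python sketch. It is not taken from any real filesystem: the 512-byte block size, the function name <code>blocks_for_write</code>, and the per-file table <code>file_block_map</code> (logical block index to physical disk block number) are all assumptions made for illustration.

```python
BLOCK_SIZE = 512  # assumed block size in bytes; real devices vary


def blocks_for_write(file_block_map, offset, length, block_size=BLOCK_SIZE):
    """Split a write of `length` bytes at file `offset` into per-block pieces.

    file_block_map is a hypothetical per-file table: index i holds the
    physical disk block that stores logical block i of the file.
    Returns a list of (disk_block, start_within_block, byte_count) tuples.
    """
    pieces = []
    pos = offset
    end = offset + length
    while pos < end:
        logical = pos // block_size              # which block of the file
        start = pos % block_size                 # where inside that block
        count = min(block_size - start, end - pos)
        pieces.append((file_block_map[logical], start, count))
        pos += count
    return pieces
```

With a file whose logical blocks 0, 1 and 2 live in disk blocks 90, 17 and 42, a 30-byte write at offset 500 touches the last 12 bytes of disk block 90 and the first 18 bytes of disk block 17. With a block interface, all of this bookkeeping belongs to the filesystem.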

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Overview of Object-Based Storage ==
'''Anyone feel free to expand on this section'''

Unlike block-based storage, whose design reaches back to the 1950s, object-based storage research goes back to the 1990s. See for example the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of the interface presenting the filesystem with blocks to read from and write to, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical blocks of storage. These objects also have meta-data and access controls immediately associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object based storage with block based storage that the design of objects is more suited to the needs of today's filesystems than that of blocks.
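To make the contrast with a block interface concrete, here is a toy in-memory sketch in Python. It is not the T10 OSD command set: the class name <code>ToyObjectStore</code>, its method names, and the 4-byte block size are hypothetical, chosen only to show the device (rather than the filesystem) allocating blocks and keeping attributes alongside each object.

```python
import itertools
import time


class ToyObjectStore:
    """Hypothetical device: callers see objects, never block numbers."""

    BLOCK_SIZE = 4  # deliberately tiny so the mapping is easy to follow

    def __init__(self, num_blocks):
        self._blocks = [bytes(self.BLOCK_SIZE)] * num_blocks
        self._free = list(range(num_blocks))
        self._objects = {}  # object id -> {"blocks", "size", "attrs"}
        self._ids = itertools.count(1)

    def create(self, **attrs):
        """Create an empty object; attributes are stored with the object."""
        oid = next(self._ids)
        self._objects[oid] = {
            "blocks": [],
            "size": 0,
            "attrs": dict(attrs, created=time.time()),
        }
        return oid

    def write(self, oid, data):
        """Replace the object's contents; the device picks the blocks."""
        obj = self._objects[oid]
        needed = -(-len(data) // self.BLOCK_SIZE)  # ceiling division
        if needed > len(self._free) + len(obj["blocks"]):
            raise OSError("device full")
        self._free.extend(obj["blocks"])  # release the old blocks first
        obj["blocks"] = [self._free.pop() for _ in range(needed)]
        for i, blk in enumerate(obj["blocks"]):
            chunk = data[i * self.BLOCK_SIZE:(i + 1) * self.BLOCK_SIZE]
            self._blocks[blk] = chunk.ljust(self.BLOCK_SIZE, b"\0")
        obj["size"] = len(data)

    def read(self, oid):
        obj = self._objects[oid]
        raw = b"".join(self._blocks[b] for b in obj["blocks"])
        return raw[:obj["size"]]

    def get_attrs(self, oid):
        return dict(self._objects[oid]["attrs"])

    def delete(self, oid):
        obj = self._objects.pop(oid)
        self._free.extend(obj["blocks"])
```

A caller would write <code>oid = store.create(owner="alice")</code>, then <code>store.write(oid, b"hello world")</code>, and later <code>store.read(oid)</code>, without ever learning which blocks hold the data.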

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. This means that the functionality of storage devices must also change to reflect these needs. Firstly, the scale of data being stored, both personally and by organizations, has gone up by orders of magnitude. Today personal hard drives routinely store terabytes of data, and massive networks store even more. In fact, "a survey of over one thousand ASNP members indicates that 20% of them manage over 100 terabytes of data" (Seagate Research, 2005).[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf] Data has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it becomes more important to ensure the data's integrity and security. Block based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object based storage is more suited to address these issues because of how it has been designed.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS storage. They both have their benefits and drawbacks. The key issues are managing metadata and ensuring data access speed as the systems grow.

Most block based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems that scale with the metadata becomes a major issue. At the same time, the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefit of the NAS file system lies in its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, with all the data passing through one point, the NAS head becomes a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation management on different storage system layers, and applications that individually add policy and management metadata, spread metadata throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, allow clients to access the storage directly over fibre cables. The storage management and file system component is connected separately to both the client and the storage; it separates the data channel from the management channel and acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Security must also be addressed, since direct access means the clients must be trusted to access the data.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and scalability with metadata. Each object comes with a set of access rules given to it by the management server and metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and scalability. OSDs ensure data integrity with mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The block containing the error may not even hold any data. This usually happens during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. An OSD does this by managing metadata and maintaining internal copies of that metadata. Hence, OSDs have knowledge of their object layout even when one or more groups of objects are on different OSDs. In this way OSDs know what space is being used or unused and can scan and correct errors without losing data. In the event of a failure in recovering a file or a number of files, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files in an efficient manner.

OSDs have another powerful feature. Each object file has an associated hash key that is derived from the contents of the file. The file can therefore be verified for accuracy, to ensure the contents remain the same, and for integrity, to ensure the data has not been corrupted. The hash key can also be used in data management to flag duplicate data. [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf]
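A minimal Python sketch of content-derived hash keys follows. The cited source does not name a hash algorithm, so the use of SHA-256 is an assumption, and the class <code>HashedStore</code> with its <code>put</code> and <code>verify</code> methods is hypothetical.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Derive a key from the object's contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class HashedStore:
    """Hypothetical store that indexes each object by its content hash."""

    def __init__(self):
        self._by_key = {}

    def put(self, data: bytes):
        """Store data; return (key, is_duplicate)."""
        key = content_key(data)
        is_duplicate = key in self._by_key  # same contents seen before
        self._by_key.setdefault(key, data)
        return key, is_duplicate

    def verify(self, key: str) -> bool:
        """Recompute the hash; a mismatch means the stored bytes changed."""
        return content_key(self._by_key[key]) == key
```

Storing the same bytes twice returns the same key and flags the second copy as a duplicate, while any later change to the stored bytes makes <code>verify</code> return <code>False</code>.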

=== Security ===

Security threats can be thought of as falling into four quadrants: external, internal, accidental and malicious. Block based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SAN has traditionally run on fibre channels, although this is a trend that is changing. [http://en.wikipedia.org/wiki/Storage_area_network] 
For the sake of security, running a SAN on fibre channels helps isolate its network, as they do not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures in SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated and are not allowed to communicate outside their respective zone. LUN masking is similar to zoning but differs in the type of device that enforces it: switches utilize zoning while disk array controllers use LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical units identified by logical unit numbers (LUNs), hence the term LUN masking. [http://en.wikipedia.org/wiki/Fibre_Channel_zoning]

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS runs on has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSD devices handle security requests directly. The set of protocols used by OSD enables it to cover the four quadrants of security threats outlined above. Clients can access an OSD device by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object. [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD, whether external or internal.
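The idea of a credential that the device itself can validate can be sketched in Python as follows. This is only an illustration and not the actual OSD security protocol. It assumes that a security manager and the OSD share a secret key, that a capability also carries a rights string, and that HMAC-SHA256 is used as the integrity check. The function names <code>issue_capability</code> and <code>osd_allows</code> are hypothetical.

```python
import hashlib
import hmac


def _encode(osd_name, partition_id, object_id, rights):
    # Toy serialization of the capability fields (assumed, not the real format).
    return "|".join([osd_name, str(partition_id), str(object_id), rights]).encode()


def issue_capability(secret, osd_name, partition_id, object_id, rights):
    """Security manager side: name one object and sign the fields."""
    fields = (osd_name, partition_id, object_id, rights)
    tag = hmac.new(secret, _encode(*fields), hashlib.sha256).hexdigest()
    return fields + (tag,)


def osd_allows(secret, capability, osd_name, partition_id, object_id, op):
    """OSD side: check the signature, then the target object and operation."""
    *fields, tag = capability
    expected = hmac.new(secret, _encode(*fields), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return False  # forged or altered capability
    cap_osd, cap_part, cap_obj, rights = fields
    same_object = (cap_osd, cap_part, cap_obj) == (osd_name, partition_id, object_id)
    return same_object and op in rights
```

In this sketch a client holding a read-only capability for object 42 cannot write to it, cannot reuse the capability for another object, and cannot upgrade its own rights without invalidating the signature. A misconfigured machine that presents no valid capability is simply refused by the device.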

== Conclusion ==
Although object storage is relatively new compared to block storage, work has progressed steadily in universities and on standards such as the ANSI T10 SCSI OSD standard. But there remain challenges to its adoption in the industry. One of these is that it is currently needed only in high-end business solutions, which keeps it from reaching smaller businesses.[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf] But as newer features are added and the standards mature we will see increased adoption.

It is clear, however, that changes need to occur as storage grows and finer levels of management are needed for data storage. Object-based storage has evolved to fit these needs where block-based storage has stagnated. Better tools for managing data using the rich metadata of objects, the combined security and data transfer speeds of NAS and SAN, and integrity controls for backups and redundancies will make object storage an attractive choice for storage administrators in the future.

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html> [Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] [http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf Seagate]

[7] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[8] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[9] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[10] [http://en.wikipedia.org/wiki/Storage_area_network Storage Area Network]

[11] [http://en.wikipedia.org/wiki/Fibre_Channel_zoning Fibre Channel zoning]

[12] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

[13] Michael Factor, Kalman Meth, Dalit Naor, Ohad Rodeh, Julian Satran, 2005. Object storage: The future building block for storage systems. In 2nd International IEEE Symposium on Mass Storage Systems and Technologies, Sardinia [online] IBM Available at: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3959&rep=rep1&type=pdf> [Accessed 13 October 2010].

COMP 3000 Essay 1 2010 Question 11

2010-10-14T01:44:38Z

Smcilroy: added a small paragraph to integrity

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has been able to keep up with demand with matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Object Based Storage Devices (OSD) address these issues because of how they are designed. Object storage uses objects that consist of data and meta-data that describes the object. Objects are accessed with defined methods such as read and write, and each carries a unique ID. The devices manage all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, and assured data integrity through a unique hash key for each object, along with management and business intelligence benefits from rich meta-data, OSD can be seen as a viable way to improve on the standard architectures of storage area networks (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read or write blocks on the disk. This means that the goal of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left completely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high level constructs like files to low level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Overview of Object-Based Storage ==
'''Anyone feel free to expand on this section'''

Unlike block-based storage, whose design reaches back to the 1950s, object-based storage research goes back to the 1990s. See for example the work of Gibson et al. in "A Cost-Effective, High-Bandwidth Storage Architecture", Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, 1998. The fundamental idea of an object based storage device is to have the storage device itself handle a layer of abstraction on top of the block. Instead of the interface presenting the filesystem with blocks to read from and write to, the interface presents the filesystem with "objects" which it can read from, write to, create, or destroy. Objects can be variable sized, and the device itself handles mapping them onto physical blocks of storage. These objects also have meta-data and access controls immediately associated with them. This allows the filesystem to work at a higher level of abstraction. This is important because the needs placed on filesystems have changed, and we will see as we compare object based storage with block based storage that the design of objects is more suited to the needs of today's filesystems than that of blocks.

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. This means that the functionality of storage devices must also change to reflect these needs. Firstly, the scale of data being stored, both personally and by organizations, has gone up by orders of magnitude. Today personal hard drives routinely store terabytes of data, and massive networks store even more. In fact, "a survey of over one thousand ASNP members indicates that 20% of them manage over 100 terabytes of data" (Seagate Research, 2005).[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf] Data has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it becomes more important to ensure the data's integrity and security. Block based storage, as we will see, has difficulty dealing with these priorities because of limitations inherent in its design. Object based storage is more suited to address these issues because of how it has been designed.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS. They both have their benefits and drawbacks. The key issues are managing metadata and ensuring data access speed as the systems grow.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems that scale with this metadata becomes a major issue, while at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefits of NAS come from its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, routing all data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation is managed at different layers of the storage system, and applications individually add policy and management metadata that is spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, allow clients to access the storage directly over fibre cables. The storage management and file system server is connected separately to both the client and the storage, which separates the data channel from the management channel, and it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Direct access also opens up a host of security issues, as the clients must be trusted to access the data.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and better scalability of metadata. Each object comes with a set of access rules given to it by the management server. Metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and hampers scalability. OSDs ensure data integrity through mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The faulty block may not even contain any data. Such errors usually surface during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. An OSD does this by managing metadata as well as maintaining internal copies of its metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects are on different OSDs. In this way OSDs know which space is used or unused and can scan and correct errors without losing data. If a file or a number of files cannot be recovered, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files in an efficient manner.

OSDs have another powerful feature. Each object has an associated hash key that is derived from the contents of the object. The hash can be recomputed later to verify that the contents remain the same and that the data has not been corrupted. It can also be used in data management to flag duplicate data.[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf]
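
As a rough illustration of how a content hash supports both verification and duplicate detection, consider the sketch below. The choice of SHA-256 and the class and method names are assumptions for the example; the cited sources do not specify which hash function or interface a particular OSD uses.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Hash key derived purely from the object's contents."""
    return hashlib.sha256(data).hexdigest()


class HashedObjectStore:
    """Hypothetical store that records a content hash at write time."""

    def __init__(self):
        self._data = {}  # object id -> stored bytes
        self._keys = {}  # object id -> hash recorded when written

    def put(self, oid, data):
        self._data[oid] = data
        self._keys[oid] = content_key(data)

    def verify(self, oid):
        # Recompute the hash and compare it with the recorded one.
        return content_key(self._data[oid]) == self._keys[oid]

    def duplicates_of(self, oid):
        # Objects with the same hash key hold the same contents.
        key = self._keys[oid]
        return [o for o, k in self._keys.items() if k == key and o != oid]
```

If the stored bytes of an object are altered without going through <code>put</code>, <code>verify</code> returns <code>False</code>, which is how silent corruption would be detected in this model.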

=== Security ===

Security threats can be thought of as having four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel, although this trend is changing.[http://en.wikipedia.org/wiki/Storage_area_network]
For the sake of security, running a SAN on Fibre Channel helps isolate its network, since it does not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures for SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated, and their members are not allowed to communicate outside their respective zone. LUN masking is similar to zoning; the two differ in the type of device that enforces them. Switches implement zoning while disk array controllers implement LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[http://en.wikipedia.org/wiki/Fibre_Channel_zoning]
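
The two checks can be modelled as simple set lookups. The sketch below is a toy model, not real switch or array configuration syntax; every zone, host, and array name in it is made up for illustration.

```python
# Hypothetical fabric configuration: which devices share a zone (switch level).
ZONES = {
    "zone_db": {"host_db1", "host_db2", "array_1"},
    "zone_web": {"host_web1", "array_2"},
}

# Hypothetical LUN masks: which LUNs each host may see on each array
# (controller level). Anything not listed is masked.
LUN_MASKS = {
    ("host_db1", "array_1"): {0, 1},
    ("host_db2", "array_1"): {2},
    ("host_web1", "array_2"): {0},
}


def share_zone(a, b):
    """Zoning: two devices may talk only if some zone contains both."""
    return any(a in members and b in members for members in ZONES.values())


def can_access(host, array, lun):
    """A request must pass the zoning check and then the LUN mask."""
    visible = LUN_MASKS.get((host, array), set())
    return share_zone(host, array) and lun in visible
```

Note that in this model neither check involves the stored data itself: access is decided entirely by the network and controller configuration, which is the limitation the essay contrasts with object-based storage.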

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS device runs has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by OSDs enables them to cover the four quadrants of security threats outlined above. Clients can access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD, whether external or internal.
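
The idea of a capability that the storage device can validate on its own can be sketched with a keyed hash (HMAC). This is a simplified illustration and not the actual OSD security protocol: the function names, the rights string, the field separator, and the assumption that a security manager and the OSD share a secret key are all choices made for the example.

```python
import hashlib
import hmac


def _tag(secret, fields):
    # Keyed hash over the capability fields; only holders of the secret
    # can produce a matching tag.
    message = "|".join(fields).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_capability(secret, osd_name, partition_id, object_id, rights):
    """Security manager side: sign the object tuple plus permitted rights.

    `rights` is a comma-separated string such as "read,write" (assumed format).
    """
    fields = (osd_name, str(partition_id), str(object_id), rights)
    return {"fields": fields, "tag": _tag(secret, fields)}


def osd_allows(secret, capability, osd_name, partition_id, object_id, op):
    """OSD side: validate the credential, then check object and operation."""
    fields = tuple(capability["fields"])
    if not hmac.compare_digest(_tag(secret, fields), capability["tag"]):
        return False  # forged or tampered capability
    target = (osd_name, str(partition_id), str(object_id))
    return fields[:3] == target and op in fields[3].split(",")
```

A client that alters the rights or the object ID in a capability cannot recompute the tag without the secret, so the OSD rejects the request. A misconfigured client holding a valid capability is still limited to the one object and the operations it names.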

== Conclusion ==
'''Note: overall conclusions?'''
In conclusion, object based storage devices...

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html>
[Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] [http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf Seagate]

[7] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[8] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[9] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[10] [http://en.wikipedia.org/wiki/Storage_area_network Storage Area Network]

[11] [http://en.wikipedia.org/wiki/Fibre_Channel_zoning Fibre Channel zoning]

[12] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

COMP 3000 Essay 1 2010 Question 11

2010-10-14T00:46:05Z

Smcilroy: fleshed out Scalability section

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=

== Introduction ==

Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has been able to keep up with demand with matching increases in storage capacity. Unfortunately, the interfaces between clients and storage devices have remained unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever"[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while ensuring the security and data access speed of traditional storage solutions.

Enter object-based storage devices (OSDs). Object storage uses objects that consist of data and meta-data that describes the object. They are accessed with defined methods such as read and write and carry a unique ID. They manage all necessary low-level storage, space management, and security functions.[http://developers.sun.com/solaris/articles/osd.html] This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, ensured integrity of data through a unique hash key for each object, and benefits in management and business intelligence from rich meta-data, OSD can be seen as a viable alternative that improves on the standard architectures of storage area network (SAN) and network-attached storage (NAS).

== Overview of Block-Based Storage ==

Hard disks as a storage medium date back to the 1950s with the introduction of the IBM 350 disk storage unit.[http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html] Hard disks store data in blocks, which are fixed-length series of bytes. Since early devices like the IBM 350, the interface that the operating system uses to communicate with the hard disk has remained mostly the same.[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722] This interface simply allows the operating system to read or write blocks on the disk. This means that the task of abstracting stored data into related groups or into human-understandable constructs such as objects or files is left completely to the operating system's filesystem. For example, when the filesystem wants to write data to a file, it must translate that request into the blocks on the disk to write to. In this way, the scope of a filesystem extends from high-level constructs like files to low-level constructs like blocks. This wide scope is necessary because the simple interface presented to the filesystem must be abstracted up to the complex expectations of a user.

Multiple standards exist to implement this interface. The small computer system interface (SCSI) standards, which have been around in one form or another since the late 1970s, are popular with industry. Parallel ATA, another standard which was designed in the 1980s, continues today in the form of Serial ATA (SATA). However, even though these standards have been around for a long time, "the logical interface, or the command set, has seen only minor additions"[http://developers.sun.com/solaris/articles/osd.html](Bandulet). This means that the functionality that the command set allows has also remained mostly the same, since the functionality must be built on top of these commands.

== Changing Storage Needs ==
'''Note: Just getting the ball rolling on this section. Anyone else is welcome to pick it up and expand'''

Storage needs have changed a lot since the 1950s, when the first hard disks were developed, and the 1970s, when the interface became standardized. Firstly, the scale of data being stored, both personally and by organizations, has gone up by orders of magnitude. Today personal hard drives routinely store terabytes of data, and massive networks store even more. In fact, "a survey of over one thousand ASNP members indicates that 20% of them manage over 100 terabytes of data" (Seagate Research, 2005).[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf] Data has also become more sensitive. Personal information, such as credit card numbers and financial information, is stored in large databases. Sensitive corporate and governmental information is stored similarly. Since the value of data has gone up, it has become more important to ensure the data's integrity and security. Block-based storage, as we will see, has difficulty dealing with these priorities. Object-based storage is better suited to address these issues because of how it has been designed.

== Comparison of object and block based stores ==
=== Scalability ===
Today's storage systems consist of two main technologies, SAN and NAS. They both have their benefits and drawbacks. The key issues are managing metadata and ensuring data access speed as the systems grow.

Most block-based storage systems contain many layers of metadata. There are also various types of virtualized systems that contain metadata to deal with device diversity or remapping of blocks for archiving or duplication. Building systems that scale with this metadata becomes a major issue, while at the same time the current speeds of block-based storage need to be maintained.

NAS is a file system that coordinates the interface between file blocks and the clients' access to files. This is done through a single NAS head which usually has thousands of gigabytes of storage behind it.[http://articles.techrepublic.com.com/5100-22_11-5841266.html] All data traffic must flow through this single access point. The benefits of NAS come from its ability to set block access, manage security, prevent unauthorized access to files and use metadata to map blocks into files for the client. However, routing all data through one point creates a bottleneck. Another issue is managing the metadata. Metadata is shared among separate metadata servers remote from the hosts. Space allocation is managed at different layers of the storage system, and applications individually add policy and management metadata that is spread throughout the system. As a result, the metadata becomes very hard to manage.

SANs, on the other hand, allow clients to access the storage directly over fibre cables. The storage management and file system server is connected separately to both the client and the storage, which separates the data channel from the management channel, and it acts as the mediator between the client and the storage blocks. This eliminates the bottleneck. Although SAN filesystems have the benefit of shared access for scalability, coordinating this shared access leads to scalability problems of its own. File systems must coordinate allocation of blocks. For clients to share read-write access, they must coordinate usage of data blocks through metadata. Direct access also opens up a host of security issues, as the clients must be trusted to access the data.

Object storage provides the ability to operate a SAN setup with direct access to data while offering better security and better scalability of metadata. Each object comes with a set of access rules given to it by the management server. Metadata is associated and stored directly with each data object and is automatically carried between layers and across devices. Space allocation and management metadata are the responsibility of the storage device.[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf] This allows metadata layers to be folded, reducing server overhead and processing, and allows for larger clusters of storage compared with traditional block-based interfaces.

=== Integrity ===
Block-based file systems in archive solutions usually have no built-in mechanisms for assuring data integrity. A common best practice is to conduct frequent backups, which adds to the complexity of using file systems for archiving and hampers scalability. OSDs ensure data integrity through mechanisms that operate differently from those of block store systems.

One of the major problems with storage at the block level is that if there is an error in a block, it is almost impossible to determine what part of the file system is affected. The faulty block may not even contain any data. Such errors usually surface during a backup procedure or when a controller is organizing data.

OSDs provide a level of abstraction that hides the fact that a disk device has blocks. It no longer matters to the file system manager what kind of disk drive is being used; it only has to manage objects. An OSD does this by managing metadata as well as maintaining internal copies of its metadata. Hence, an OSD has knowledge of its object layout even when one or more groups of objects are on different OSDs. In this way OSDs know which space is used or unused and can scan and correct errors without losing data. If a file or a number of files cannot be recovered, traditional systems may have to do a complete file system restore. However, an OSD's awareness of its object layout enables it to recover data specific to a byte range and thus restore files in an efficient manner.

=== Security ===

Security threats can be thought of as having four quadrants: external, internal, accidental and malicious. Block-based stores have a variety of ways of handling security, but there are basic concepts that SAN and NAS technologies use to secure data.

SANs have traditionally run on Fibre Channel, although this trend is changing.[http://en.wikipedia.org/wiki/Storage_area_network]
For the sake of security, running a SAN on Fibre Channel helps isolate its network, since it does not communicate over TCP/IP connections. However, since the SAN devices themselves do not restrict access, it is up to the network infrastructure and host system to handle security.

Zoning and LUN masking are typical security measures for SAN systems. Zoning allocates a certain amount of storage to clients. These zones are isolated, and their members are not allowed to communicate outside their respective zone. LUN masking is similar to zoning; the two differ in the type of device that enforces them. Switches implement zoning while disk array controllers implement LUN masking. A disk array controller is a device which manages the physical disk drives and presents them as logical unit numbers (LUNs), hence the term LUN masking.[http://en.wikipedia.org/wiki/Fibre_Channel_zoning]

NAS has its own vulnerabilities but, as with SAN, it is only as secure as the network it operates on. NAS security is conceptually simpler than SAN security. NAS environments can administer security tasks as well as control disk usage quotas. The proprietary operating system a NAS device runs has access control configurations, much like those of traditional OSs, that can prevent unauthorized access to data.

Unlike NAS and SAN systems, OSDs handle security requests directly. The set of protocols used by OSDs enables them to cover the four quadrants of security threats outlined above. Clients can access an OSD by providing "cryptographically secure credentials", called capabilities, which specify a tuple (OSD name, partition ID, object ID) to identify the object.[http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf] This can prevent accidental or even malicious access to an OSD, whether external or internal.

== Conclusion ==
'''Note: overall conclusions?'''
In conclusion, object based storage devices...

==References==

[1] Dell Product Group, 2010. Object Storage A Fresh Approach to Long-Term File Storage. [online] Dell Available at: <http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf> [Accessed 13 October 2010].

[2] Christian Bandulet, 2007. Object-Based Storage Devices. [online] Oracle Available at: <http://developers.sun.com/solaris/articles/osd.html>
[Accessed 13 October 2010].

[3] [http://www-03.ibm.com/ibm/history/exhibits/storage/storage_350.html IBM 350 Disk Storage Unit]

[4] M. Mesnier, G. R. Ganger, and E. Riedel. Object-Based Storage. IEEE Communications Magazine, 41(8), August 2003.

[5] [http://developers.sun.com/solaris/articles/osd.html Object-Based Storage Devices Christian Bandulet, July 2007]

[6] [http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf Seagate]

[7] [http://articles.techrepublic.com.com/5100-22_11-5841266.html Foundations of Network Storage]

[8] [http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Dell Object Storage Overview]

[9] [http://en.wikipedia.org/wiki/Storage_area_network Storage Area Network]

[10] [http://en.wikipedia.org/wiki/Fibre_Channel_zoning Fibre Channel zoning]

[11] [http://www.research.ibm.com/haifa/projects/storage/objectstore/papers/OSDSecurityProtocol.pdf IBM OSD Security Protocol Overview]

COMP 3000 Essay 1 2010 Question 11

2010-10-13T23:35:52Z

Smcilroy:

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-13T22:54:07Z

Smcilroy:

== Some Sourcing Issues and Other Stuff ==
Just a reminder, if we're taking direct quotes from a source they need to be in quotation marks and attributed with the author's name and the date (I think) in parentheses at the end, not just a link or footnote reference. There was an issue with this in the first couple sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that that information might be better worked into the "Changing Storage Needs" section, what do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide, it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

I'm going to expand the scalability and integrity sections. Then once the security section is done, I think that just leaves the section on the OSD standard and future plans for the tech. Then in the conclusion we can recap.
--[[User:Smcilroy|Smcilroy]] 22:54, 13 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion

:Also, it would probably be useful for people to be reading over each other's work and making suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There's 11 or 12 sections there, and I think there's six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operates individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* A SAN filesystem is a local network of multiple devices that operate on disk blocks and provide a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know in this way what space is unused
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* A computer system can access an OSD by providing cryptographically secure credentials (a capability) that the OSD can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought i'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security; those seem to be constant themes in what I've read about object stores.

:For the block storage overview, the two current standards for a block-based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if I'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page, even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course anyone else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope I'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it, but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, which are fixed-length sequences of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read from or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable-sized, and they can be read, written, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here are some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on Google Scholar you can access the PDF without even being inside Carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)
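
The difference between the two interfaces described in this section can be sketched in code. The classes `BlockDevice` and `ObjectStore` below are hypothetical in-memory stand-ins, not any real driver or OSD API, and the block size is an assumption. The only point is that with the object interface the caller never sees block numbers, because the mapping lives inside the store.

```python
BLOCK_SIZE = 512  # assumed block size in bytes


class BlockDevice:
    """Block interface: callers read and write fixed-length blocks by number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, n):
        return self.blocks[n]

    def write_block(self, n, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
        self.blocks[n] = data


class ObjectStore:
    """Object interface: callers name objects; the store chooses the blocks."""

    def __init__(self, device):
        self.device = device
        self.free = list(range(len(device.blocks)))
        self.table = {}  # object id -> (list of block numbers, length in bytes)
        self.next_id = 0

    def create(self, data):
        needed = max(1, -(-len(data) // BLOCK_SIZE))  # ceiling division
        if needed > len(self.free):
            raise OSError("not enough free blocks")
        blocks = [self.free.pop(0) for _ in range(needed)]
        for i, block in enumerate(blocks):
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            self.device.write_block(block, chunk.ljust(BLOCK_SIZE, b"\0"))
        object_id = self.next_id
        self.next_id += 1
        self.table[object_id] = (blocks, len(data))
        return object_id

    def read(self, object_id):
        blocks, length = self.table[object_id]
        raw = b"".join(self.device.read_block(b) for b in blocks)
        return raw[:length]  # drop the padding in the last block

    def delete(self, object_id):
        blocks, _ = self.table.pop(object_id)
        self.free.extend(blocks)


store = ObjectStore(BlockDevice(8))
oid = store.create(b"x" * 700)     # variable-sized: spans two blocks
print(len(store.read(oid)))        # 700
store.delete(oid)
print(len(store.free))             # 8
```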

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there are 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up Thursday or Friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like: object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block-based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block-based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie-up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-13T22:46:27Z

Smcilroy:

== Some Sourcing Issues and Other Stuff ==
Just a reminder: if we're taking direct quotes from a source, they need to be in quotation marks and attributed with the author's name and the date (I think) in parentheses at the end, not just a link or footnote reference. There was an issue with this in the first couple of sentences of the scalability section. I've put it in quotes (though I didn't see any authors listed so I just put the company), but I think that information might be better worked into the "Changing Storage Needs" section. What do you guys think?

Also, I think probably sometime today we should divide the rest of the sections up and try to get most of the content in so we have tomorrow for editing and combining the information so that it flows well. Again, any thoughts?

--[[User:Mbingham|Mbingham]] 19:32, 12 October 2010 (UTC)

: Sorry about the citation issue, you're right. I used the quote to emphasize the fact that scalability issues are evident in disk block systems. But now that I read it, it doesn't really transition well into the second paragraph. I don't mind if you move the quote to another section. Other than that, I could just finish up the section about Security. I don't really know who else is actively contributing to this essay though...or at least don't see anyone volunteering to take a topic other than Mbingham, Smcilroy and myself...
:--[[User:Myagi|Myagi]] 15:47, 12 October 2010 (UTC)

:No problem, it's just something to watch out for. I'll integrate it with the other section.
:Dagar has been making edits to the essay as well, he's cleaned up the language in some of the sections and organized the references. Maybe he would like to tackle one of the object specific sections?
:--[[User:Mbingham|Mbingham]] 20:02, 12 October 2010 (UTC)

::I apologize for the delay, this has been an easy thing to neglect during a busy week. What's the proper way to reference with this wiki? --[[User:Dagar|Dagar]] 21:29, 13 October 2010 (UTC)

:::Check out this reference guide; it explains how to reference any material you find online. [http://libweb.anglia.ac.uk/referencing/harvard.htm Harvard System of Reference] --[[User:Smcilroy|Smcilroy]] 22:46, 13 October 2010 (UTC)

I'm going to finish up the Security section if nobody tags it by the end of today. I have a draft written up. The fact that more people aren't tagging the document outline and volunteering responsibilities is kind of unnerving...

--[[User:Myagi|Myagi]] 07:57, 13 October 2010 (UTC)

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage -dagar

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security -Myagi

:*Conclusion

:Also, it would probably be useful for people to read over each other's work and make suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There are 11 or 12 sections there, and I think there are six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/single storage device(s) that operate individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High-latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* A SAN filesystem is a local network of multiple devices that operate on disk blocks and provide a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued on ratifying the OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- This way, OSDs know which space is unused
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data objects
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials (capabilities) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought I'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security; those seem to be constant themes in what I've read about object stores.

:For the block storage overview, the two current standards for a block-based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if I'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page, even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course anyone else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope I'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it, but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, which are fixed-length sequences of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read from or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable-sized, and they can be read, written, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here are some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on Google Scholar you can access the PDF without even being inside Carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there are 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up Thursday or Friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like: object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block-based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is constantly being ratified and now they have an OSD-3 draft. [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block-based storage after stating our thesis and then for the body, we point out how object stores deal with issues of scalability, integrity, security as well as flexibility. And then some kind of nice tie-up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-11T16:29:27Z

Smcilroy:

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

:Sounds like a good idea. Here's a relatively quick list of topics to talk about, based on our discussions and the outline below. Add in any sections anyone thinks are missing and put your name beside areas you want:

:*Overview and history of block-based storage -Mbingham
:*Block based storage standards - SCSI, SATA, ATA/IDE etc -Mbingham
:*Networked storage architectures: SAN and NAS -Smcilroy

:*How storage needs have changed since the development of block-based storage
:(maybe focus on the Internet, massive corporate/government networks, large personal storage, etc)

:*Overview and History of object-based storage
:*Object-based storage standards (ANSI OSD specification)
:*Object-based storage applied to networked storage

:Comparison of object and block based stores focusing on:
::*Scalability -Myagi
::*Integrity -Myagi
::*Security

:*Conclusion

:Also, it would probably be useful for people to read over each other's work and make suggestions, etc. I would also be cool with other people adding stuff to my sections if they have additional info or if there's something I've overlooked. There are 11 or 12 sections there, and I think there are six of us, so we can start off taking maybe 2 sections each, and then if we don't have all the sections covered we can divide them up later. How does that sound?
:--[[User:Mbingham|Mbingham]] 16:45, 10 October 2010 (UTC)

:Good plan, I took Scalability and Integrity comparisons of object and block stores.
:--[[User:Myagi|Myagi]] 13:26, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/single storage device(s) that operate individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High-latency protocols that clog LANs, using TCP/IP
- Not suitable for data transfer intensive apps

* A SAN filesystem is a local network of multiple devices that operate on disk blocks and provide a file system abstraction
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued on ratifying the OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- This way, OSDs know which space is unused
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data objects
* Security requests handled directly by storage device
* Computer system can access OSD device by providing cryptographically secure credentials (capabilities) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought I'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security; those seem to be constant themes in what I've read about object stores.

:For the block storage overview, the two current standards for a block-based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if I'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page, even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course anyone else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

== Quick Overview ==
So I hope I'm not the only one who was wondering "What are object stores?" when reading the question. I don't think the textbook mentions it, but I didn't read through the filesystems chapter very thoroughly. Here's where some quick googling has got me:

Most storage devices divide their storage up into blocks, which are fixed-length sequences of bytes. The interface that storage devices provide to the rest of the system is pretty simple. It's essentially "Here, you can read from or write to blocks, have fun". This is block-based storage.

Object-based storage is different. The interface it presents to the rest of the system is more sophisticated. Instead of directly accessing blocks on the disk, the system accesses objects. Objects are like a level of abstraction on top of blocks. Objects can be variable-sized, and they can be read, written, created, and deleted. The device itself handles mapping these objects to blocks and all the issues that come with that, rather than the OS.

Here are some papers that give an overview of object-based storage:

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1612479 Object Storage: The Future Building Block for Storage Systems]

[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1222722 Object-Based Storage]

I think if you just look those up on Google Scholar you can access the PDF without even being inside Carleton's network.

--[[User:Mbingham|Mbingham]] 23:56, 1 October 2010 (UTC)

== Some more links ==
I haven't been reading many academic papers on the subject so those links will be very useful.

If I may add to this. I read articles on object storage here:

[http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf Object Storage Overview]

and

[http://www.snia.org/education/tutorials/2010/spring/file/PaulMassiglia_File_Systems_Object_Storage_Devices.pdf File Systems for OSD's]

I can add that metadata is much richer in an object store context. Searching for files and grouping related files together is much easier with the context information that metadata supplies for objects. I'm beginning to read:

[http://www.seagate.com/docs/pdf/whitepaper/tp_536.pdf The advantages of OSD's]

--[[User:Myagi|Myagi]] 10:39, 5 October 2010 (UTC)

I'm going to write a version of my essay out over the long weekend with headings and references and put it up on the wiki. I'd like to know who and how many people are working on this essay but dunno if that's possible. We'll see what we do from there I guess? I was thinking we just homogenize all of the information we write into one unified essay.

--[[User:Myagi|Myagi]] 10:42, 6 October 2010 (UTC)

:I think there are 6 people in our group, though there might only be 5. I'll be working on this over the long weekend too. I was thinking maybe we should try to get a rough outline up, Thursday or Friday. Since Prof Somayaji mentioned that this should have the format of an essay, maybe we could start with what our main argument is?

:I was thinking something like: object stores are becoming more attractive because the demands on filesystems have changed, but the interface has not been updated to accommodate these changes. Then we could go into an explanation of block-based storage, how it fails to meet the needs placed on modern FSs, then how object stores solve these problems. What do you think?

:--[[User:Mbingham|Mbingham]] 01:55, 7 October 2010 (UTC)

:You don't need to write your own independent essay on the wiki. Let's just add info as it comes along. I'll be completely without internet access this weekend, but I'll try to bring some background reading with me. Expect lots of edits from me starting Monday night/Tuesday morning.
:--[[User:Dagar|Dagar]] 12:59, 7 October 2010 (UTC)

:Sounds good! I think that's a good idea for a thesis statement and we should have a concrete one by Thurs/Fri. Although I'm not absolutely clear about the interface not being updated? I think the object store SCSI standard is still being actively revised, and there is now an OSD-3 draft: [http://www.t10.org/drafts.htm#OSD_Family T10 OSD Working Drafts]. But then again I'm probably misunderstanding something...
:--[[User:Myagi|Myagi]] 10:08, 7 October 2010 (UTC)

::I didn't mean that the object interface hadn't been updated, I meant that the block interface hasn't been updated to reflect the changing requirements put on storage. Since the block interface is still largely the same as it was decades ago (read/write to blocks) it is unable to handle the new requirements. Object stores look attractive because they are designed to deal with issues like scalability, integrity, security, etc. Sorry for the confusion, I hope it makes more sense now, haha.
::--[[User:Mbingham|Mbingham]] 15:44, 7 October 2010 (UTC)

:I gotcha, thanks for explaining! I'd say that would be a great thesis statement then: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes. We can work from there. I think we can address the inadequacies of block-based storage after stating our thesis, and then for the body we point out how object stores deal with issues of scalability, integrity and security, as well as flexibility. And then some kind of nice tie-up reiterating our thesis.
:--[[User:Myagi|Myagi]] 12:50, 7 October 2010 (UTC)

I might as well put my contribution here. I'm willing to move or change it for the sake of organizing this discussion page.

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

:(moved Myagi's outline to top of page) --[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

Some links that I found while doing the assignment about object storage and its application to SAN systems:
http://dsc.sun.com/solaris/articles/osd.html
http://www.research.ibm.com/haifa/projects/storage/zFS/papers/amalfi.pdf

--[[User:Npradhan|Npradhan]] 23:45, 9 October 2010 (UTC)

== Other ==
-instead of storing filesystems in terms of blocks, you store in terms of objects.

-extents, named extents

-objects fancier because they can move around.

-extra level of abstraction and indirection

-files made of objects, objects made of blocks

Talk:COMP 3000 Essay 1 2010 Question 11

2010-10-10T15:16:39Z

Smcilroy:

== Essay Format and Assigned Tasks ==
So I added an intro and I did it like it was an essay and not a wiki article. Feel free to edit, expand and replace it as you see fit.
Also I think we should just list the topics we want to talk about and then people can put their name beside it and work on it, that way we don't have two people working on the same thing. Then we can edit it all so it fits together in the end. What do you think?
--[[User:Smcilroy|Smcilroy]] 15:16, 10 October 2010 (UTC)

== Initial Outline ==
'''Introduction'''
* Thesis Statement: Object stores are becoming more attractive because the demands on filesystems have changed and the block store interface has not been updated to accommodate these changes.
* What will be discussed
- Current state of block based storage
- Brief overview of object store
- Scalability
- Integrity
- Security

'''Block based storage'''
* NAS is a single storage device that is shared on a LAN
- File level/Single storage device(s) that operate individually
- Clients connect to the NAS head (interface between client and NAS) rather than to the individual storage devices
- Use small, specialized and proprietary operating systems instead of general purpose OSs
- Can enforce security constraints, quotas, indexing
- Example of access: \\NAS\Sharename

Advantages
- Dedicated, feature-rich file sharing
- Network optimized
- Centralized storage
- Less administration overhead
Disadvantages
- Metadata processing has to be handled on the NAS server
- Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck
- Scaling by adding additional NAS devices quickly becomes a management issue because data is isolated on individual NAS islands
- High-latency protocols over TCP/IP that clog LANs
- Not suitable for data-transfer-intensive apps

* A SAN filesystem runs on a local network of multiple devices that operate on disk blocks, and provides a file system abstraction on top of them
- Block level/local network of multiple devices
- Every client computer has its own file system
- A SAN alone does not provide the file abstraction but there is a file system built on top of SANs
- Example of access: D:\, E:\, etc.

Advantages
- High-performance shared disk
- Scalable
- Short I/O paths
- Lots of parallelism
Disadvantages
- Harder to maintain, lots of file systems to manage
- Harder to administer, lots of storage access rights to coordinate

* OSDs close the gap between the scalability of SAN and the file sharing capabilities of NAS
* Block storage has limitations that have become more apparent as demand for scalability and security has grown

'''Overview of OSD'''
* An OSD device deals in objects
- Handles the mapping from object to physical media locations itself
- Tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients
- OSDs are directly connected to clients without the need for an intermediary to handle metadata.
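
As a rough illustration of metadata being tracked as attributes, here is a small Python sketch. Everything in it is an assumption made for the example: the class <code>AttributedStore</code>, its methods, and the attribute names (<code>owner</code>, <code>kind</code>, <code>created</code>, <code>length</code>) are hypothetical and do not come from the OSD specification.

```python
import time


class AttributedStore:
    """Toy object store that keeps attributes next to each object's data."""

    def __init__(self):
        self.objects = {}  # object ID -> {"data": bytes, "attrs": dict}
        self.next_id = 1

    def create(self, data, **attrs):
        oid = self.next_id
        self.next_id += 1
        # The store fills in some attributes itself, such as a creation time.
        attrs.setdefault("created", time.time())
        attrs["length"] = len(data)
        self.objects[oid] = {"data": data, "attrs": attrs}
        return oid

    def get_attr(self, oid, name):
        return self.objects[oid]["attrs"][name]

    def find(self, **wanted):
        """Return the IDs of objects whose attributes match every given value."""
        return sorted(
            oid
            for oid, obj in self.objects.items()
            if all(obj["attrs"].get(k) == v for k, v in wanted.items())
        )
```

Because the attributes live with the objects, a client could ask the store for "all images owned by alice" with <code>find(owner="alice", kind="image")</code> instead of walking a directory tree itself.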

* ANSI ratified version 1.0 of the OSD specification in 2004, defining a protocol for communication with object-based storage devices
* The OSD specification describes:
- a SCSI command set that provides a high-level interface to OSD devices
- how file systems and databases store and retrieve data objects
- work has continued in ratifying OSD-2 and OSD-3 specifications

'''Scalability'''
* Metadata is associated and stored directly with data objects and carried between layers and across devices
* Space allocation delegated to storage device
* Server has reduced overhead and processing, allowing larger clusters of storage

'''Integrity'''
* OSDs have knowledge of their object layout
* Unlike block stores, OSDs can recover data specific to a byte range
- OSDs know which space is unused in this way
- Can scan and correct errors without losing data
* OSDs maintain internal copies of metadata
- User doesn't have to do a complete file system restore for the sake of one or a few unrecoverable files
- OSDs can identify the byte range lost and restore the file efficiently
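
One way a device could identify a lost byte range is to keep a checksum per range of each object. The Python sketch below shows that idea only; it assumes CRC32 checksums over fixed 4-byte ranges, and the function names are made up for this example. It is not a description of how any real OSD does it.

```python
import zlib

CHUNK = 4  # bytes per checksummed range; tiny so the example is easy to follow


def checksum_ranges(data, chunk=CHUNK):
    """Return one CRC32 per fixed-size byte range of the object."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def damaged_ranges(data, checksums, chunk=CHUNK):
    """Return the (start, end) byte ranges whose contents no longer match."""
    bad = []
    for index, expected in enumerate(checksums):
        start = index * chunk
        piece = data[start:start + chunk]
        if zlib.crc32(piece) != expected:
            bad.append((start, min(start + chunk, len(data))))
    return bad
```

Only the ranges reported by <code>damaged_ranges</code> would need to be restored from a backup or replica, rather than the whole file system.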

'''Security'''
* Suited for network based storage
* Associate security attributes directly with data object
* Security requests handled directly by storage device
* A computer system can access an OSD device by providing cryptographically secure credentials (a capability) that the OSD device can validate
- This can prevent malicious access from unauthorized requests or accidental access from misconfigured machines
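
The capability idea can be sketched in Python with an HMAC. This is a simplified illustration, not the credential format of the OSD standard: it assumes a secret shared between whoever issues capabilities and the storage device, and the function names and the <code>object_id:rights</code> message layout are invented for the example.

```python
import hashlib
import hmac


def issue_capability(secret, object_id, rights):
    """Issuer side: sign the object ID and the granted rights."""
    message = f"{object_id}:{rights}".encode()
    tag = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"object_id": object_id, "rights": rights, "tag": tag}


def validate(secret, capability, object_id, operation):
    """Device side: recompute the tag and check the request against it."""
    message = f"{capability['object_id']}:{capability['rights']}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, capability["tag"])
        and capability["object_id"] == object_id
        and operation in capability["rights"].split(",")
    )
```

A client that edits the rights in its capability, or a machine that was never given one, cannot produce a matching tag without the secret, so the device rejects the request.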

'''Conclusion'''
* Reiteration of thesis statement

--[[User:Myagi|Myagi]] 18:15, 7 October 2010 (UTC)

Hey Myagi, I thought I'd move your outline to its own section at the top of the page so it's more visible. I hope you don't mind. If you do, feel free to revert this edit.

--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

: It's all good.
:--[[User:Myagi|Myagi]] 10:00, 8 October 2010 (UTC)

:This outline looks pretty good to me. I like the three focus points of scalability, integrity and security; those seem to be constant themes in what I've read about object stores.

:For the block storage overview, the two current standards for a block-based interface seem to be SCSI and SATA. SCSI seems to be used more in enterprise storage and SATA more in personal storage (someone correct me if I'm wrong here). We might also want to take a look at SAN and NAS. I need to do some more reading, haha.

:Also, I think we might as well start putting up some stuff on the article page. Even just a few sentences per section. I can start on that tomorrow or maybe Saturday. Of course anyone else is welcome to as well.

:--[[User:Mbingham|Mbingham]] 02:31, 8 October 2010 (UTC)

COMP 3000 Essay 1 2010 Question 11

2010-10-10T15:05:19Z

Smcilroy: added an introduction

=Question=

Why are object stores an increasingly attractive building block for filesystems (as opposed to block-based stores)? Explain.

=Answer=
Each year we are faced with growing storage needs as the world's information increases exponentially and businesses increasingly choose to archive and retain all the data they produce. The storage industry has kept up with demand through matching increases in storage capacity. Unfortunately, the interface between clients and storage devices has remained largely unchanged since the 1950s. The dominant storage mechanism is still block-based storage technology. This has been sufficient for meeting most needs of modern businesses, but as we enter an age where "store everything, forever" is the common mantra of storage administrators and unstructured data with little meta-data is the norm, we have to look for technology that can provide better scalability, business intelligence, and management while preserving the security and data access speed of traditional storage solutions.

Enter object-based storage. Object storage uses objects that consist of data and the meta-data that describes it. Each object carries a unique ID and is accessed with defined methods such as read and write. The object storage device (OSD) itself manages all necessary low-level storage, space management, and security functions. This storage technology has the potential to address some of the problems with block-based storage.

With increased scalability, better security through per-object access control, and data integrity ensured by a unique hash key for each object, along with benefits in management and business intelligence from rich meta-data, OSD can be seen as a viable alternative that improves on the standard SAN and NAS architectures.
=References=