COMP 3000 Essay 1 2010 Question 12
Question
There have been multiple attempts to have operating systems use databases or database-like stores. What have been some of the major past attempts at this? What was their fate? Why? Key examples (not exhaustive): WinFS, ReiserFS, PalmOS, Newton OS, BeOS
Answer
There have been many attempts at creating file systems that use database-like stores. While the idea is an interesting one, database stores are just not ready for the consumer market.
Traditionally, databases are used in applications where a project focuses on accessing large amounts of data quickly and efficiently, such as banking systems, telecommunications, and web servers. A personal computer did not traditionally need as much storage, and is organized in an easy-to-navigate tree structure. However, the recent shift towards object-oriented programming styles, along with the tremendous increase in the amount of data that can be stored on a home computer, has led to the idea of object stores, or file systems that function as databases of objects [1].
There have been various implementations of object stores in the real world, each chosen for features that made sense in its application. For PalmOS and NewtonOS, the security and modularity that object stores provided were an attractive fit for mobile devices. Microsoft was drawn to object stores because of the possible speed increases, and so it developed WinFS. ReiserFS seems to have started as a proof of concept, which found its place with Linux.
Object stores have caught on in a few niche applications, mostly scientific. However, the technology is in its infancy, and developers are still working through engineering issues that arise while implementing the concept. Combined with the management issues that surround most attempts at creating an object store, it is clear that object stores are a long way from becoming a common file system.
**For the last two paragraphs did you mean database-like stores?**
What about something like this ?
There have been many attempts at creating file systems that use database-like stores. While the idea is an interesting one, database stores are just not ready for the consumer market. Traditionally, databases are used in applications where a project focuses on accessing large amounts of data quickly and efficiently, such as banking systems, telecommunications, web servers and more. A personal computer did not traditionally need as much storage, and is organized in an easy-to-navigate tree structure. However, the recent shift towards object-oriented programming styles, along with the tremendous increase in the amount of data that can be stored on a home computer, has led to the idea of object stores, and file systems that function as databases.
There have been various operating systems that use a database as a file system, each chosen for features that made sense in that application. We will look at five of them. Palm OS used a database mainly for efficiency. WinFS was a file system for Windows that also used a database for efficiency. Newton OS was originally meant for desktop machines, but became one of the best operating systems for PDAs, with very good security. BeOS was a new operating system that used a database as its file system for efficiency and compatibility reasons. ReiserFS was more of an experiment that turned into a new file system using a database. We will first examine the trials of Palm OS.
Palm OS
History
PalmOS was developed from Graffiti, a handwriting-recognition program which was believed to be doomed from the very beginning[2]. Graffiti’s fast and accurate handwriting recognition was thought to be useless because many companies did not see a purpose for it; no company wanted to make the hardware for it[3]. Palm then decided to make its own hardware with Graffiti as the OS, which became what is known as Palm OS today.
Database Structure
Palm OS does not use a relational or XML database[4]. Instead, Palm OS has two different database types: record databases and resource databases. The two are similar in that both have headers that store information about the database, and both are stored in the storage heap. They differ in how the information is stored, called, handled, and named. A record database is used to store data, and a resource database contains applications.
A record database is a collection of records (blocks of memory), each of which can store up to 64KB. Each record carries information that is unique to it: the location of the record, an ID, and an attribute field, which contains the delete, dirty, busy, and secret bits [book ref].
The resource database stores application code, name, icon, forms, alerts, menus, strings, and all other elements of the application [book ref]. Applications can be referenced by an ID number and a type (four-character constant).
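The two structures described above can be pictured with a small Python sketch. This is only an illustration of the fields named in the text, not the Palm OS API: the class names, the field names, and the numeric values of the attribute bits are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntFlag

MAX_RECORD_BYTES = 64 * 1024  # the 64KB per-record limit described above


class RecordAttr(IntFlag):
    """Attribute bits named in the text; the numeric values are made up."""
    DELETE = 0x1
    DIRTY = 0x2
    BUSY = 0x4
    SECRET = 0x8


@dataclass
class Record:
    """One entry of a record database (hypothetical layout)."""
    unique_id: int
    location: int                      # assumed: offset of the data in the storage heap
    data: bytes
    attrs: RecordAttr = RecordAttr(0)  # no bits set by default

    def __post_init__(self):
        if len(self.data) > MAX_RECORD_BYTES:
            raise ValueError("record exceeds the 64KB limit")


@dataclass
class Resource:
    """One entry of a resource database, referenced by type and ID."""
    res_type: str   # four-character type constant
    res_id: int
    data: bytes

    def __post_init__(self):
        if len(self.res_type) != 4:
            raise ValueError("resource type must be four characters")


# A private, modified memo record and a hypothetical form resource.
memo = Record(unique_id=1, location=0, data=b"buy milk",
              attrs=RecordAttr.DIRTY | RecordAttr.SECRET)
form = Resource(res_type="tFRM", res_id=1000, data=b"")
```

Keeping the four status bits in a single flag field mirrors the text's description of one attribute per record that holds several bits.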
Problem
Record and resource databases are stored in the storage (database) heap, which in turn resides in RAM (Random Access Memory). Each database heap had a limit of 64KB, and since a record had to be small enough to fit within a single heap, memory was hard to manage. For example, suppose a record is 50KB and there are three heaps of 64KB each, for a total of 192KB. If each heap is half full, there is 96KB of free memory overall, yet the 50KB record cannot be stored, because no single heap has 50KB free. [book ref] This problem arose in versions before OS 3.0 because the storage area was divided into multiple small heaps: free space existed in total, but it was scattered across heaps that were each partly occupied, so no one heap might have room for the record. Palm OS 3.0 and later solved this problem by making a single large storage heap instead of multiple smaller heaps.
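The arithmetic in the example above can be checked with a short Python sketch. The function name and the list-of-free-space representation are assumptions for illustration only; the point is that a record must fit inside one heap, so free space in different heaps cannot be pooled.

```python
HEAP_SIZE_KB = 64  # per-heap limit from the text


def can_store(record_kb, free_per_heap_kb):
    """A record fits only if a single heap has enough free space for all of it."""
    return any(free >= record_kb for free in free_per_heap_kb)


# Before Palm OS 3.0: three 64KB heaps, each half full.
split_heaps = [HEAP_SIZE_KB // 2] * 3      # 32KB free in each heap
total_free_kb = sum(split_heaps)           # 96KB free overall
fits_when_split = can_store(50, split_heaps)

# Palm OS 3.0 and later: the same free space in one large heap.
single_heap = [total_free_kb]
fits_when_single = can_store(50, single_heap)

print(total_free_kb, fits_when_split, fits_when_single)  # 96 False True
```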
Fate
In 2005, Palm OS was acquired by Access Systems Americas, Inc. Access decided to expand the breadth of the operating system’s offering to include more tools, a better user experience, and increased compatibility across many more devices.[5] In 2007, development of Palm OS shifted to a new operating system called Garnet OS. [6] Its latest release, Garnet version 6, still includes a database-based file system.
WinFS
History
The history of WinFS is relatively long; traces of the project can be found as far back as the mid-1990s with Storage+. Microsoft had the idea of replacing the NTFS file system with a relational, object-oriented file store based on SQL Server 8.0. This new file system was supposed to be implemented in Windows 2003 Server. In 2000, however, Microsoft announced that Storage+ was being abandoned, with Relational File System (RFS) as its successor. RFS was supposed to be included in SQL Server 2000, but never made the cut. One reason for the continual delays was that, in 2000, Oracle announced its own relational file system, Independent Internet File System. Microsoft had to rethink RFS to stay ahead of the competition, which added further delay.[7]
RFS was forgotten until 2002, when the public heard about a new file system that would be present in Longhorn (later renamed Vista). The system would once again be based on its predecessor, RFS, but it would run on top of an NTFS file system. WinFS was included in a few public builds of Vista, but in 2004 it was removed from the beta builds. It was said that it would be available for download after the release, but in 2006 it was cut from Vista for good.
Brief Concept
With today’s data, we are facing a crisis of finding what we want, when we want, at a reasonable speed on our own computer. Data is stored in many different ways. We can recognize the uses of certain files by their file extensions, but the number of file extensions that exist is astonishing, so it is quite difficult to remember each one. Also, the same kind of file may be stored with different extensions, in different databases, which makes finding, relating to, and acting on the file quite difficult. [8]
Microsoft had an idea to solve this problem by using a relational database as a file system, where data would be treated like it should be, as data and nothing else. To understand how WinFS works, we must have a general idea of the relational database. The data in this kind of database is spread into specific tables, like in a normal database, but there are multiple relations between these tables. This gives the programmer the power to search, find and present the result in an efficient way. In the case of WinFS, the main goal was to "Enable people to Find, Relate, and Act on their information"[9].
As already pointed out, data files are very broad nowadays. With that many file formats using complicated data storage methods, the "...current file system does not know how to collect and find information within these new types of data"[10], but with data treated as data inside a database, we could find what we are looking for quite easily.
Another important point of WinFS is the notion of how pieces of data relate to each other. In our current file system we cannot, without doing it manually, add a picture of our good friend Bob and at the same time see all the pictures related to Bob. On top of that, we cannot find the pictures of Bob, the emails received from him, and the related documents, movies, or whatever else we want in a single request. To find them, we have to search for them one by one. If we treat data as data, once again, we can simply search the tables of the database for the pictures, received emails, videos, and documents containing the name Bob, and present them to the user. As we can see, data can be related through keywords; in our example the keyword was the name Bob, but it can be anything.[11]
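The "Bob" example can be sketched with an ordinary relational database. The following uses Python's built-in sqlite3 module purely as an illustration; it is not WinFS or its API, and the table names, column names, and sample rows are all invented for the example.

```python
import sqlite3

# In-memory database standing in for the relational store (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, title TEXT);
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE item_people (item_id INTEGER, person_id INTEGER);
""")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (1, "picture", "beach.jpg"),
    (2, "email", "Re: lunch on Friday"),
    (3, "document", "trip_budget.doc"),
    (4, "picture", "sunset.jpg"),      # not related to Bob
])
conn.execute("INSERT INTO people VALUES (1, 'Bob')")
conn.executemany("INSERT INTO item_people VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1)])


def related_to(name):
    """Return every item linked to a person, whatever its type, in one query."""
    rows = conn.execute("""
        SELECT items.kind, items.title
        FROM items
        JOIN item_people ON item_people.item_id = items.id
        JOIN people ON people.id = item_people.person_id
        WHERE people.name = ?
        ORDER BY items.id
    """, (name,))
    return rows.fetchall()


for kind, title in related_to("Bob"):
    print(kind, title)
```

One request returns the picture, the email, and the document together, which is the kind of cross-type search the paragraph above describes.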
The last design point of WinFS was that it would run on top of NTFS. Basically, WinFS would scan all the data in the NTFS file system and put it into its database. Thus, it would work as a file system, but it would be totally dependent on NTFS.
Fate of WinFS
As we have seen in the history of WinFS, its fate was not the one its designers had hoped for. In 2006, on the team blog, Microsoft announced that it would not ship WinFS as a separate file system package, but would instead fold it into the next version of Microsoft SQL Server, which was SQL Server 2008. "These changes do mean that we are not pursuing a separate delivery of WinFS, including the previously planned Beta 2 release"[12], but instead, they would keep working on it and some of the technology "...may be used by other Microsoft products going forward."[13] So, we can see that WinFS did not succeed as a file system, but some of its technology will continue to live on in Microsoft’s database software.
Why
Microsoft never stated publicly what exactly went wrong. There was much speculation about the design of WinFS, but on the team blog Quentin Clark, Product Unit Manager of WinFS, answered that "[i]n fact, the Beta was coming together really well."[14] He added that the technology used was not easy to build on, so they had to rewrite some parts, but that this would not have caused the end of WinFS. Some also speculate that no serious software used WinFS, and that it did not receive the attention from developers that it needed to get a good start. In an interview with Channel 9, Quentin Clark said, "We were building too much of the house at once. We had guys working on the roof while we were still pouring concrete for the foundation."[15] This suggests that the team had management problems, which likely contributed to the termination of the project.
Newton OS
Brief History of Newton OS
Newton OS was created by Apple and was used in its line of PDAs, which were among the world's first. Newton was originally meant to be an innovative OS that would reinvent personal computing, but it was repurposed for a PDA because of project delays and fear of eating into Macintosh sales [16]. The first iteration of the Newton system came as the MessagePad. The MessagePad sold out its 5,000 units within a few hours of release, despite its price of $800 US [17]. Apple released several different PDAs with Newton OS between 1993 and 1997, which, despite their popularity, were plagued by flaws in the applications of the device [18].
The main reason why Newton PDAs became so well-known was because the Newton OS was the most advanced operating system of any personal computing device of its time. The OS kept the user from accessing the inner workings of the device, keeping users away from creating problems by tampering with the wrong settings, and used a database-based file management that simplified the system to a higher degree than any previous OS [19].
Flaws of Newton OS
Even with the innovation that came from Newton OS and its computing devices, it had several deep-running flaws that caused critics to pan the Newton devices. First and foremost, the Newton devices were known for the failure of their handwriting system, which was supposed to recognize entire words. The Simpsons television show even had a joke about this flaw in one of its episodes [20]. The main problem behind the handwriting system was that it had a hard time recognizing cursive writing, even though Apple insisted that its engineers had ensured it worked correctly [21]. Another major problem with the Newton devices was that they were too large for most pockets. Since Newton devices were expected to be carried like a wallet or a cellphone, most people found them too big for daily use [22][23]. Earlier Newton devices were also found to be very slow due to the virtualization of NewtonScript and the lack of sufficient RAM [24]. In the end, the Newton devices were fated to fail.
**I dont think the sentence with the simpsons part is a good idea because it is not likely for essays to have something like that**
Fate
Although the Newton OS was revolutionary when it was released, it was doomed to fail due mostly to impracticality. In early 1997, Newton Inc. was created as a subsidiary company of Apple [25], but after a relatively short run, Newton Inc. was reabsorbed into Apple [26]. In December of 1997, Apple effectively ceased development efforts on the Newton OS [27], and in February 1998, Apple announced the discontinuation of the Newton OS development [28].
Why
After his return as CEO of Apple, Steve Jobs canceled or restructured many of Apple's failing products to refocus Apple's energy on more successful endeavors, like the iPod and Macintosh computers. Not surprisingly, Newton OS was one of these products [29]. Although Newton was canceled, the development of the OS and its devices has had an impact on other Apple products. Mac OS X 10.2 has handwriting recognition software called Inkwell that uses an external tablet to recognize words, but like the Newton handwriting software, it requires you to write each letter individually [30][31]. Pixo, the company that created the operating system for the iPod, was founded by two of the developers that worked on the Newton OS [32][33]. Pixo was subsequently acquired by Apple after the shipping of the first iPods [34].
BeOS
Brief History
BeOS was released by Be, Inc., which was founded in 1990 by Jean-Louis Gassée and Steve Sakoman. BeOS was a brand new operating system that ran on its own hardware, called the BeBox. The system gave better performance on digital media work because it made better use of system resources. It was initially written for AT&T Hobbit hardware, but after AT&T decided to stop producing the Hobbit processor, the system was modified to run on PowerPC hardware.
Operating System
File System
The file system used by BeOS is called BFS, a custom file system developed by the same team as BeOS. BFS is a 64-bit journaling file system which logs all data before storing it. [35] This journaling increases both the stability and security of BeOS by preventing system crashes from jeopardizing the file system’s integrity, and it offers quicker recovery from crashes. However, this comes at the expense of rapid data access, since a journaling file system is based on random memory access. Uniquely, BFS was designed to be a high-performance file system that stores all data in a database but also allows users to view it in a more traditional hierarchical way. [36] It also boosts flexibility by allowing users to store data in any format of their choosing.
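The general idea of journaling (log the change first, apply it afterwards, and replay the log after a crash) can be shown with a toy Python sketch. This is not BFS code and does not reflect its on-disk format; the class, its method names, and the simulated crash flag are assumptions made for the example.

```python
class JournaledStore:
    """Toy store that records each write in a journal before applying it."""

    def __init__(self):
        self.journal = []   # entries that have been logged but not yet applied
        self.data = {}      # stands in for the main storage area

    def write(self, key, value, crash_before_apply=False):
        self.journal.append((key, value))   # step 1: log the change
        if crash_before_apply:
            return                          # simulated crash: storage untouched
        self._apply()                       # step 2: apply it to storage

    def _apply(self):
        for key, value in self.journal:
            self.data[key] = value
        self.journal.clear()

    def recover(self):
        """After a crash, replay whatever is still in the journal."""
        self._apply()


store = JournaledStore()
store.write("a.txt", "hello")
store.write("b.txt", "world", crash_before_apply=True)
missing_before_recovery = "b.txt" not in store.data
store.recover()
print(missing_before_recovery, store.data)
```

Because the interrupted write is still in the journal, recovery only needs to replay a short log instead of checking the whole store, which is where the quicker crash recovery comes from.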
Since BFS is a custom file system developed solely for BeOS, its success is intrinsically tied to that of its operating system. BeOS poses a threat to BFS both because of its inability to reach the masses and because of the potential memory errors caused by its unfiltered multithreading capability. Because system reliability is placed in the hands of the programmer, BeOS's multithreading capability is a potential cause for concern, especially when combined with a slow-access file system.
Fate
The fate of BeOS: Microsoft's exclusive licensing agreements with hardware manufacturers made it harder for BeOS to enter the market.
ReiserFS
History
ReiserFS is the product of Namesys, a Californian company started by Hans Reiser [37]. ReiserFS was first introduced in version 2.4.1 of the Linux kernel, and was the default file system on several Linux distributions, most notably SUSE. ReiserFS's successor is Reiser4, which was released in 2004[38]. However, in 2008, Namesys dissolved and commercial production of ReiserFS and Reiser4 halted [39]. Since the removal of commercial support, ReiserFS has become less popular, probably due to the many bugs that have no hope of being resolved.
Concept
ReiserFS is driven by a few major concepts. The first, and probably the main, idea is the ability to handle many files that are smaller than a block of storage on the storage media. This is accomplished through the use of B+ trees [40]. Instead of balancing in order to keep the height of the tree fairly stable (as an AVL tree balances), ReiserFS balances so that the height of the storage tree is constant. This type of balanced tree reduces overhead by reducing the number of internal nodes needed. Each node in the B+ tree maps to a block on the storage device, and each node can hold multiple objects, each of which has a unique key[41].
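The benefit of letting one block-sized node hold several small objects can be estimated with a short Python sketch. It is a simplified model rather than the ReiserFS algorithm: the 4096-byte block size, the first-fit packing strategy, and the function names are assumptions, and per-object key and header overhead is ignored.

```python
import math

BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_one_per_file(sizes):
    """Blocks used when every file gets its own block(s)."""
    return sum(max(1, math.ceil(size / BLOCK_SIZE)) for size in sizes)


def blocks_packed(sizes):
    """Blocks used when small objects share block-sized nodes (first-fit)."""
    free = []  # free bytes left in each node allocated so far
    for size in sizes:
        if size > BLOCK_SIZE:
            raise ValueError("this sketch only packs objects up to one block")
        for i, room in enumerate(free):
            if room >= size:
                free[i] -= size
                break
        else:
            free.append(BLOCK_SIZE - size)  # no node had room: start a new one
    return len(free)


small_files = [100] * 40   # forty 100-byte files
print(blocks_one_per_file(small_files), blocks_packed(small_files))  # 40 1
```

Under these assumptions, forty tiny files that would each occupy a full block fit into a single node when packed together.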
Another concept in ReiserFS is the use of unified name spaces[42]. Unified name spaces are simply a more refined definition of an object store, in which all stored data is made up of objects that are both 'files' and 'directories'. Objects are stored in such a way that you can use directories to quickly access different types of objects, then further traverse the directories in order to access specific objects, and go even further to access an object's attributes. This maps extremely well to different programming styles, especially object-oriented programming. Lookup of data can also be done using attributes of the data, much like in WinFS[43]. This type of access would more often be used by the end user.
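A rough way to picture a unified name space is an object that has both content (like a file) and named children (like a directory), with attributes stored as further children. The Python sketch below only illustrates that idea; the class, its methods, and the sample mail objects are hypothetical and are not taken from ReiserFS.

```python
class Obj:
    """An object that acts as both a 'file' (content) and a 'directory' (children)."""

    def __init__(self, content=None, **children):
        self.content = content
        self.children = dict(children)

    def resolve(self, path):
        """Walk a slash-separated path down to an object or one of its attributes."""
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node

    def find(self, attr, value, prefix=""):
        """Yield paths of objects whose attribute `attr` has content `value`."""
        for name, child in self.children.items():
            path = f"{prefix}/{name}" if prefix else name
            hit = child.children.get(attr)
            if hit is not None and hit.content == value:
                yield path
            yield from child.find(attr, value, path)


root = Obj(mail=Obj(
    msg1=Obj("Hi, lunch on Friday?", subject=Obj("Lunch"), sender=Obj("bob")),
    msg2=Obj("Budget attached.", subject=Obj("Budget"), sender=Obj("alice")),
))

print(root.resolve("mail/msg1").content)           # the message body
print(root.resolve("mail/msg1/subject").content)   # an attribute of the message
print(list(root.find("sender", "bob")))            # lookup by attribute
```

The same path syntax reaches a directory, an object inside it, and an attribute of that object, and the find method shows the attribute-based lookup mentioned above.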
Flaws and Fate
While ReiserFS performs very well in terms of speed [44], it has many bugs that can lead to instability of files. One problem is that link() and unlink() are not synchronous in ReiserFS, which can lead to data corruption[45]. Another is that, if ReiserFS's tree structure becomes unusable for any reason, a rebuild risks further corrupting the file system [46]. There are also issues with the journaling which can lead to corruption [47]. Issues like these are unacceptable in any file system, as the risk of unstable data is a serious drawback. Considering that the root of these issues is often a synchronization problem, ReiserFS is a very poor choice for many modern, multi-core systems, as the increased need for synchronization would increase the level of instability.
In addition to the stability issues, ReiserFS has some design issues that largely explain why it is no longer used today. Since ReiserFS was designed to handle numerous small files, its behavior is inconsistent when trying to scale a ReiserFS system, again stemming from synchronization issues. In addition, moving from the bug-ridden ReiserFS v3 to v4 required a reformat, as Reiser4 was rewritten from scratch. This prompted SUSE to drop ReiserFS as its default file system, as they preferred a tried and tested system (ext3) over a new, yet-untested file system in their distribution [48].
Interestingly enough, the SUSE drop coincided eerily well with Hans Reiser being sent to jail. Left without an owner, Namesys lost most corporate sponsors, eventually only receiving funding from the DARPA initiative. With Namesys (mostly) gone, ReiserFS no longer has commercial support, and so the stability and scalability bugs are taking a fair amount of time to be resolved. This combined with low visibility that stems from no longer being used in any major Linux distributions largely explains why ReiserFS can be considered a dead project.
Conclusion
Looking at these object-based file systems, it is easy to see that, despite object stores being an interesting and potentially useful idea, they are just not ready yet for the desktop. For certain applications, object stores are a great alternative to the traditional file system: systems designed to handle large amounts of data for programming projects, such as those used for models in physics, and small, modular systems such as PalmOS and NewtonOS. However, no one has yet developed a highly flexible design that would be useful on the average PC, as we can see from the failure of WinFS and ReiserFS.
The increasing importance of digital information in today's world also means that people are extremely reluctant to switch a tried and true system for newer and possibly unstable technology. In the case of object stores, what few implementations are available to the public for the desktop are unstable, and therefore an unattractive choice.
It is important to note that object stores are not the only alternative file system competing to improve upon traditional block storage[49], so there is no guarantee that object stores will ever become a widely used technology, as they may not mature quickly enough to beat out other technologies.
Reference
[1] Palm OS programming: the Developer’s guide by Neil Rhodes, Julie McKeehan