COMP 3000 Essay 1 2010 Question 12
Question
There have been multiple attempts to have operating systems use databases or database-like stores. What have been some of the major past attempts at this? What was their fate? Why? Key examples (not exhaustive): WinFS, ReiserFS, PalmOS, Newton OS, BeOS
Answer
There have been many attempts at creating file systems that use database-like stores. While the idea is an interesting one, database stores are not yet ready for the consumer market. Traditionally, databases are used in applications that need to access large amounts of data quickly and efficiently, such as banking systems, telecommunications, and web servers. A personal computer traditionally did not need as much storage, and its files are organized in an easy-to-navigate tree structure. However, the recent shift towards object-oriented programming styles has led to the idea of object stores, or file systems that function as databases of objects [1]. Object stores have caught on in a few niche applications, mostly scientific. For the most part, however, object stores are a long way from becoming a common base for a file system. To examine why, we can look at a few systems in which object stores have been used.
Palm OS
History
Palm OS grew out of Graffiti, handwriting-recognition software that was believed to be doomed from the very beginning. Although Graffiti was fast and accurate, it was thought to be useless because many companies did not see a purpose for it, and none wanted to make hardware for it. Palm then decided to make its own hardware built around Graffiti, which led to what is known as Palm OS today.
Database
Palm OS does not use a relational or XML database [2]. Instead it has two types of database, record and resource, because the two are used to store different things. A record database stores application data, while a resource database stores applications and other things that do not usually change at runtime. The two have similarities: both have headers that store information about the database, and both are kept in the storage heap. They differ in how their contents are stored, called, handled, and named.
Record Database
A record database is a collection of records (blocks of memory). Each record can hold at most 64KB of data. For each record, the database keeps information unique to that record: its location, a unique ID, and an attribute field containing the delete, dirty, busy, and secret bits [3].
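The per-record bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the class names, the bit values, and the `add_record` helper are assumptions made for the example and are not the Palm OS Data Manager API.

```python
from dataclasses import dataclass, field
from enum import IntFlag

MAX_RECORD_SIZE = 64 * 1024  # the 64KB per-record limit described in the text


class RecordAttr(IntFlag):
    # Bit values are illustrative only, not the real Palm OS constants.
    DELETE = 0x1
    DIRTY = 0x2
    BUSY = 0x4
    SECRET = 0x8


@dataclass
class RecordEntry:
    unique_id: int                    # ID unique within the database
    data: bytes                       # stands in for the record's location in memory
    attrs: RecordAttr = RecordAttr(0)


@dataclass
class RecordDatabase:
    name: str
    records: list = field(default_factory=list)
    next_id: int = 1

    def add_record(self, data: bytes) -> RecordEntry:
        # Hypothetical helper: reject records over the limit, mark new ones dirty.
        if len(data) > MAX_RECORD_SIZE:
            raise ValueError("record exceeds 64KB")
        entry = RecordEntry(self.next_id, data, RecordAttr.DIRTY)
        self.next_id += 1
        self.records.append(entry)
        return entry
```

A caller would create a `RecordDatabase`, add byte strings with `add_record`, and test flags with expressions such as `RecordAttr.DIRTY in entry.attrs`.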
Resource Database
In a resource database that holds an application, one resource contains the code, another contains the application’s name, another the application’s icon, and the rest contain the forms, alerts, menus, strings, and other elements of the application [4]. For each resource, the database stores an ID number and a type (a four-character constant). A resource is retrieved by its type and ID number.
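Lookup by type and ID can be pictured as a table keyed on that pair. The sketch below is a minimal illustration under that assumption; the `ResourceDatabase` class and the type codes `"code"`, `"name"`, and `"icon"` are placeholders chosen for the example, not the actual Palm OS resource types or API.

```python
class ResourceDatabase:
    """Toy resource store keyed by (four-character type, ID number)."""

    def __init__(self):
        self._resources = {}

    def add(self, res_type: str, res_id: int, data: bytes) -> None:
        # The text describes the type as a four-character constant.
        if len(res_type) != 4:
            raise ValueError("resource type must be exactly four characters")
        self._resources[(res_type, res_id)] = data

    def get(self, res_type: str, res_id: int) -> bytes:
        # Raises KeyError if no resource has this type and ID.
        return self._resources[(res_type, res_id)]


# Placeholder contents for a hypothetical application.
app = ResourceDatabase()
app.add("code", 1, b"\x00\x01\x02")
app.add("name", 1000, b"Memo Pad")
app.add("icon", 1000, b"<bitmap bytes>")
```

Two resources may share an ID number as long as their types differ, which is why both parts of the key are needed.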
Problems
Record and resource databases are stored in the storage (database) heap, which resides in RAM (Random Access Memory). A database heap was limited to 64KB, and every record had to be small enough to fit inside one, which made memory hard to manage. A problem arose in versions before OS 3.0 because the storage area was divided into several small heaps: free space could be scattered across heaps that were each partly occupied by data, so no single heap might have enough room for a new record even when the total free memory was sufficient.
Example: a record is 50KB in size, and there are three database heaps that can hold 64KB each, for a total of 192KB of memory. If each of the three heaps is half full, there is 96KB of free memory in total, but the 50KB record cannot be stored because no single heap has more than 32KB free.
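The arithmetic of this example can be checked with a short Python sketch. The `can_store` function is a hypothetical helper that models only the rule that a record must fit entirely inside one heap; it is not how Palm OS allocates memory.

```python
KB = 1024
HEAP_SIZE = 64 * KB


def can_store(record_size, free_per_heap):
    # A record must fit entirely inside one heap; it cannot span heaps.
    return any(free >= record_size for free in free_per_heap)


# Three 64KB heaps, each half full, as in the example.
free_per_heap = [HEAP_SIZE // 2] * 3
total_free = sum(free_per_heap)          # 96KB free in total
record = 50 * KB

fits_in_split_heaps = can_store(record, free_per_heap)   # False: only 32KB free per heap
fits_in_single_heap = can_store(record, [total_free])    # True: same free space, one heap
```

The same 96KB of free space is enough when it is contiguous in one heap and useless for this record when it is split three ways.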
Fate
In Palm OS 3.0, the solution to this problem was to create a single large storage heap instead of multiple smaller heaps. Palm OS has since been renamed Garnet OS, and its databases were successful enough that they are still in use today.
WinFS
History Pre-WinFS
The history of WinFS is relatively long: traces of the project go back to the mid-1990s with Storage +. Microsoft's idea was to replace the NTFS file system with relational, object-oriented file storage based on SQL Server 8.0. This new file system was supposed to be implemented in Windows 2003 Server. In 2000, however, Microsoft announced that Storage + was being dropped, with the Relational File System (RFS) as its successor. RFS was supposed to be included in SQL Server 2000 but never made the cut. One reason for the continual delays was that in 2000 Oracle announced its own relational file system, Independent Internet File System. Microsoft had to rethink RFS to stay ahead of the competition, which added further delay. [5]
RFS was forgotten until 2002, when the public heard about a new file system that would be present in Longhorn (later renamed Vista). The system would once again be based on its predecessor, RFS, but it would run on top of an NTFS file system. WinFS was included in a few public builds of Vista, but in 2004 it was removed from the beta builds. It was said that it would be available as a download after the release, but in 2006 it was cut from Vista for good.
Brief concept of WinFS
With today’s data, we face a crisis: finding what we want, when we want it, at a reasonable speed on our own computer. The number of file extensions in use is astonishing; we cannot name half of them, let alone say what they do. We store data in many different ways, in simple files with different extensions and in different kinds of databases, which makes finding, relating, and acting on that data quite difficult. [6]
Microsoft's idea for solving this problem was to use a relational database as the file system, where data would simply be data. To understand how WinFS works, we need a general idea of how a relational database works. The data in this kind of database is spread across specific tables, and there are relations between the tables. This gives the programmer the power to search, find, and present results efficiently. In the case of WinFS, the main goal was to "Enable people to Find, Relate, and Act on their information." [7] As already pointed out, data files are very diverse nowadays, and with so many file formats using complicated storage methods, our "current file system does not know how to collect and find information within these new types of data." With data treated as data inside a database, we could find what we are looking for quite easily.
Another important point of WinFS is the notion of how data items relate to each other. In our current file system we cannot, unless we do it manually, add a picture of our good friend Bob and at the same time find all the other pictures related to Bob. On top of that, we cannot find the pictures of Bob, the emails received from him, documents, movies, and whatever else we might want in a single request; we have to search for them one by one. If we treat data as data, we can simply search the tables of the database for pictures, received emails, videos, and documents associated with the name Bob, and present them to the user. Data items can thus be related to each other by a keyword; in our example it was the name Bob, but it can be anything. What if data could also trigger actions that follow specific rules? This is exactly what WinFS wanted to implement: "WinFS Rules are a built-in component of the system that allows you to tell the system how to work with, sort, and deliver your data". Rules could also make use of other applications on the system, so if we received a picture from our friend Bob, we could automatically transfer it elsewhere. [8]
The last point WinFS was aiming for was to run on top of NTFS. Basically, WinFS would scan all the data in the NTFS file system and put it into its database. It would thus work as a file system, but it would be totally dependent on NTFS.
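The "everything related to Bob in one request" idea can be illustrated with an ordinary relational database. The sketch below uses Python's built-in `sqlite3` module; the table names, columns, and sample rows are invented for the example and do not reflect the actual WinFS schema or API.

```python
import sqlite3


def build_store():
    # In-memory database with a hypothetical three-table schema.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, title TEXT);
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE item_person (item_id INTEGER, person_id INTEGER);
    """)
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
        (1, "picture", "beach.jpg"),
        (2, "email", "Re: lunch"),
        (3, "document", "budget.doc"),
        (4, "video", "party.avi"),
    ])
    conn.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Bob"), (2, "Alice")])
    # Relations: which items involve which person.
    conn.executemany("INSERT INTO item_person VALUES (?, ?)",
                     [(1, 1), (2, 1), (3, 2), (4, 1)])
    return conn


def find_related(conn, name):
    # One query returns every kind of item related to the given person.
    return conn.execute("""
        SELECT items.kind, items.title
        FROM items
        JOIN item_person ON items.id = item_person.item_id
        JOIN people ON people.id = item_person.person_id
        WHERE people.name = ?
        ORDER BY items.id
    """, (name,)).fetchall()


conn = build_store()
bob_items = find_related(conn, "Bob")
```

Here `bob_items` holds the picture, the email, and the video in a single result, whereas a traditional file system would need a separate search for each kind of file.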
Fate of WinFS
As we have seen in the history of WinFS, its fate was not the one intended. In 2006, Microsoft announced on the team blog that it would not ship WinFS as a separate file system package but would instead fold it into the next version of MS SQL Server, SQL Server 2008. "These changes do mean that we are not pursuing a separate delivery of WinFS, including the previously planned Beta 2 release."[9] Instead, the team would keep working on the technology, and some of it "may be used by other Microsoft products going forward."[10] So WinFS did not become a file system, but some of its technology went into Microsoft’s database software.
Why
Microsoft never said publicly what exactly went wrong. There was much speculation about problems with the design of WinFS, but on the team blog Quentin Clark, Product Unit Manager of WinFS, answered: "No. In fact, the Beta was coming together really well."[11] He went on to say that the technology used was not easy to build on, so some parts had to be rewritten, but that this would not have caused the end of the project. Others speculate that no serious software used WinFS and that it did not receive the attention from developers that it needed for a good start. In an interview with Channel 9, Quentin Clark said "We were building too much of the house at once. We had guys working on the roof while we were still pouring concrete for the foundation."[12] This suggests that the team may have had management problems, which led to the termination of the project.
BeOS
Working on it...
Newton OS
Brief History of Newton OS
Newton OS was created by Apple and was used in its line of PDAs, which were among the world's first. Newton was originally meant to be an innovative OS that would reinvent personal computing, but it was turned into a PDA platform because of fears that it would eat into Macintosh sales and because of project delays [13]. The first iteration of the Newton system came as the MessagePad. The MessagePad sold out its 5,000 units within a few hours of release, despite its price of $800 US [14]. Apple released several different PDAs running Newton OS between 1993 and 1997, which, despite their popularity, were plagued by flaws in the devices' applications [15].
The main reason Newton PDAs became so well known was that Newton OS was the most advanced operating system of any personal computing device of its time. The OS kept the user from accessing the inner workings of the device, which prevented users from creating problems by tampering with the wrong settings, and it used database-based file management that simplified the system to a higher degree than any previous OS [16].
Flaws of Newton OS
Despite the innovation that came with Newton OS and its devices, the platform had several deep-running flaws that caused critics to pan it. First and foremost, the Newton devices were known for the failure of their handwriting system, which was supposed to recognize entire words. The Simpsons television show even made a joke about this flaw in one of its episodes [17]. The main problem with the handwriting system was that it had a hard time recognizing cursive writing, even though Apple insisted that its engineers ensure that it worked correctly [18]. Another major problem was that the devices were too large for most pockets. Since Newton devices were expected to be carried like a wallet or a cellphone, most people found them too big for daily use [19][20]. Earlier Newton devices were also found to be very slow because of the virtualization of NewtonScript and a lack of RAM [21]. In the end, the Newton devices were fated to fail.
Fate
Although the Newton OS was revolutionary when it was released, it was doomed to fail due mostly to impracticality. In early 1997, Newton Inc. was created as a subsidiary company of Apple [22], but after a relatively short run, Newton Inc. was reabsorbed into Apple [23]. In December of 1997, Apple effectively ceased development efforts on the Newton OS [24], and in February 1998, Apple announced the discontinuation of the Newton OS development [25].
Why
After his return as CEO of Apple, Steve Jobs canceled or restructured many of Apple's failing products to refocus the company's energy on more successful endeavors, such as Macintosh computers and, later, the iPod. Not surprisingly, Newton OS was one of these products [26]. Although Newton was canceled, the development of the OS and its devices has had an impact on other Apple products. Mac OS X 10.2 includes handwriting recognition software called Inkwell, which uses an external tablet to recognize words but, like the Newton handwriting software, requires you to write each letter individually [27][28]. Pixo, the company that created the operating system for the iPod, was founded by two of the developers who worked on Newton OS [29][30]. Pixo was subsequently acquired by Apple after the first iPods shipped [31].
ReiserFS
History
ReiserFS is the product of Namesys, a Californian company started by Hans Reiser [32]. ReiserFS was first introduced in version 2.4.1 of the Linux kernel and has been the default file system on several Linux distributions, most notably SUSE for a short while. ReiserFS's successor is Reiser4, which was released in 2004 [33]. However, in 2008 Namesys dissolved, and commercial support for ReiserFS and Reiser4 ended [34]. Since the removal of commercial support, ReiserFS has become less popular, probably because of the many bugs that have no hope of being resolved.
Concept
ReiserFS is driven by a few major concepts. The first and probably main idea is the ability to handle many files that are smaller than a block of storage on the storage medium. This is accomplished through the use of B+ trees [35]. Instead of rebalancing to keep the height of the tree roughly even (as an AVL tree does), ReiserFS balances so that the height of the storage tree is constant, meaning that every leaf is at the same depth. This type of balanced tree reduces overhead by reducing the number of internal nodes needed. Each node in the B+ tree maps to a block on the storage device, and each node can hold multiple objects, each of which has a unique key [36].
Another concept in ReiserFS is the use of unified name spaces [37]. A unified name space is simply a more refined definition of an object store, in which all stored data is made up of objects that are both 'files' and 'directories'. Objects are stored in such a way that you can use directories to quickly access different types of objects, traverse the directories further to reach specific objects, and go further still to reach an object's attributes. This maps extremely well to different programming styles, especially object-oriented programming. Lookup of data can also be done using attributes of the data, much like in WinFS [38]. This type of access would more often be used by the end user.
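The benefit of letting one block hold several small objects can be shown with a rough calculation. The Python sketch below is not the ReiserFS B+ tree algorithm; it simply compares "one file per block" with a first-fit packing of small files into shared blocks, and the 4096-byte block size and the sample file sizes are assumptions made for the example.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_one_file_per_block(sizes):
    # Each file is rounded up to a whole number of blocks.
    return sum(-(-size // BLOCK_SIZE) for size in sizes)


def blocks_packed(sizes):
    # First-fit packing: put each small file into the first block with room.
    # A simplification of storing several objects in one tree node.
    free = []  # remaining free bytes in each block used so far
    for size in sizes:
        if size > BLOCK_SIZE:
            raise ValueError("this sketch only handles files smaller than a block")
        for i, room in enumerate(free):
            if room >= size:
                free[i] -= size
                break
        else:
            free.append(BLOCK_SIZE - size)
    return len(free)


# Hypothetical workload: forty 100-byte files and ten 1000-byte files.
sizes = [100] * 40 + [1000] * 10
unpacked = blocks_one_file_per_block(sizes)   # 50 blocks
packed = blocks_packed(sizes)                 # 4 blocks
```

For this workload the packed layout needs 4 blocks instead of 50, which is the kind of saving that motivates storing multiple small objects per node.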
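A toy model can make the idea of objects that are both 'files' and 'directories' more concrete. In the Python sketch below, every node has content (its file side) and named children (its directory side), and an object's attributes are simply further children. The `Node` class, the `resolve` function, and the sample paths are all hypothetical and are not taken from ReiserFS.

```python
class Node:
    """An object that is both a file (content) and a directory (children)."""

    def __init__(self, content=b""):
        self.content = content
        self.children = {}

    def child(self, name):
        # Return the named child, creating an empty one if it does not exist.
        return self.children.setdefault(name, Node())


def resolve(root, path):
    # Walk the path one component at a time, as in an ordinary directory tree.
    node = root
    for part in path.strip("/").split("/"):
        if part:
            node = node.children[part]
    return node


# Hypothetical name space: a mail message whose attributes are its children.
root = Node()
msg = root.child("mail").child("msg1")
msg.content = b"Hello, see you at lunch."
msg.child("from").content = b"bob@example.com"
msg.child("subject").content = b"Re: lunch"

body = resolve(root, "/mail/msg1").content
sender = resolve(root, "/mail/msg1/from").content
```

The same path syntax reaches the message itself and, one level deeper, its attributes, which is the traversal the paragraph above describes.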
Flaws and Fate
Reference
[1] Neil Rhodes and Julie McKeehan, Palm OS Programming: The Developer’s Guide.