Difference between revisions of "DistOS 2015W Session 5"

Revision as of 23:47, 3 February 2015

Cloud Distributed Operating System

It is a distributed OS running on a set of computers that are interconnected by a network. It unifies the different computers into a single system.

The OS is based on 2 patterns: 1. Message Based OS 2. Object Based OS

Its structure is based on the object-thread model. The system consists of a set of objects, each defined by a class. Objects respond to messages: sending a message to an object causes the object to execute the corresponding method and then reply.

It has active objects and passive objects:

1) Active objects have one or more processes associated with them and can communicate with the external environment.
2) Passive objects have no processes in them.

The contents of a Cloud object are long-lived. They exist forever and can survive system crashes and shutdowns.

Another important part of the Cloud DOS is its threads.

A thread is a logical path of execution that traverses objects and executes the code in them.

Note: A Cloud thread is not bound to a single address space. Several threads can enter an object simultaneously and execute concurrently.

The nature of the Cloud object prohibits a thread from accessing any data outside the current address space in which it is executing.


Interaction between Objects and Threads

1) Inter-object interfaces are procedural.
2) Invocations work across machine boundaries.
3) Objects in Cloud unify the concepts of persistent storage and memory to create an address space, thus making programming simpler.
4) Control flow is achieved by threads invoking objects.
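The interaction between objects and threads can be sketched in Python. This is only an illustrative toy, not Cloud code: the names `CloudObject`, `Counter`, `invoke` and `run_threads` are hypothetical, a Python thread stands in for a Cloud thread, and the lock is an assumption about how an object might protect its data when several threads enter it at once.

```python
import threading


class CloudObject:
    """Toy passive object: data plus entry points, with no process of its own."""

    def __init__(self):
        # Assumption: the object guards its own data against concurrent threads.
        self._lock = threading.Lock()

    def invoke(self, entry_point, *args):
        # Procedural interface: a thread enters the object, runs one method
        # and returns with the result.
        method = getattr(self, entry_point)
        return method(*args)


class Counter(CloudObject):
    def __init__(self):
        super().__init__()
        # In Cloud this state would be persistent; here it is ordinary memory.
        self.value = 0

    def add(self, n):
        with self._lock:
            self.value += n
            return self.value


def run_threads(counter, n_threads=4, n_calls=100):
    """Let several threads enter the same object concurrently."""

    def body():
        for _ in range(n_calls):
            counter.invoke("add", 1)

    threads = [threading.Thread(target=body) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

The object itself never runs anything; all control flow comes from the threads that invoke it, which is the point of item 4 above. Invocation across machine boundaries and persistence are not modelled here.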

Cloud Environment

1) Integrates a set of homogeneous machines into one seamless environment.
2) There are three logical categories of machines: compute server, user workstation and data server.



Plan 9

Plan 9 is a general-purpose, multiuser, mobile computing environment physically distributed across machines. Work on Plan 9 began in the late 1980s. The aims of the system were: 1) to build a system that is centrally administered 2) to be cost-effective by using cheap modern microcomputers. The distribution itself is transparent to most programs. This transparency is made possible by two properties: 1) a per-process-group name space 2) uniform access to all resources by representing them as files.

It is quite similar to Unix, yet different. The commands, libraries and system calls are similar to those of Unix, so a casual user may not be able to distinguish between the two. The problems in Unix were too deep to fix, but many of its ideas were carried over. Problems that Unix addressed badly were tackled again. Old tools were dropped, and others were polished and reused.

What actually distinguishes Plan 9 is its organization.

Plan 9 is divided along the lines of service function.

  • CPU servers and terminals use the same kernel
  • Users may choose to run programs locally or remotely on CPU servers
  • Users can choose whether they want distributed or centralized operation.

The design of Plan 9 is based on 3 principles:

1) Resources are named and accessed like files in a hierarchical file system.
2) A standard protocol, 9P, is used to access these resources.
3) The disjoint hierarchies provided by different services are joined together into a single private hierarchical file name space.

Another concept in Plan 9 is the virtual name space. When a user boots a terminal or connects to a CPU server, a new process group is created. Processes in the group can either add to or rearrange their name space using two system calls: mount and bind.

  • Mount is used to attach a new file system to a point in the name space.
  • Bind is used to attach a kernel-resident file system to the name space and also to rearrange pieces of the name space.

Plan 9 provides a mechanism to customize one's view of the system in software rather than in hardware. It is built for the traditional system, but it can be extended to other resources.

Parallel Programming

Parallel programming has two aspects:

  • The kernel provides a simple process model and carefully designed system calls for synchronization.
  • The programming language supports concurrent programming.

Implementation of Name Spaces

User processes construct name spaces using three system calls: mount, bind and unmount.

  • Mount attaches a tree served by a file server to the current name space.
  • Bind duplicates a piece of the existing name space at another point.
  • Unmount allows components to be removed.
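The three calls can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not the Plan 9 implementation: the class `NameSpace`, its methods and the nested-dictionary "file trees" are all hypothetical, and features of the real system calls such as union directories are left out.

```python
def _split(path):
    """Turn '/usr/bin' into ('usr', 'bin'); '/' becomes the empty tuple."""
    return tuple(p for p in path.split("/") if p)


class NameSpace:
    """Toy private name space: a table from mount points to file trees."""

    def __init__(self, table=None):
        self.table = dict(table or {})

    def fork(self):
        # A new process group gets its own copy, so later changes stay private.
        return NameSpace(self.table)

    def mount(self, tree, point):
        # Attach a tree served by some file server at `point`.
        self.table[_split(point)] = tree

    def bind(self, existing, point):
        # Make an existing piece of the name space visible at another point.
        self.table[_split(point)] = self.resolve(existing)

    def unmount(self, point):
        del self.table[_split(point)]

    def resolve(self, path):
        parts = _split(path)
        # The longest matching mount point wins.
        for cut in range(len(parts), -1, -1):
            if parts[:cut] in self.table:
                node = self.table[parts[:cut]]
                try:
                    for name in parts[cut:]:
                        node = node[name]
                except (KeyError, TypeError):
                    raise FileNotFoundError(path)
                return node
        raise FileNotFoundError(path)
```

For example, after `ns.mount({"bin": {"ls": "ls-binary"}}, "/")` and `ns.bind("/bin", "/usr/bin")`, the path `/usr/bin/ls` resolves to the same entry as `/bin/ls`. A name space obtained with `fork()` can unmount or rebind entries without affecting the original, which mirrors the per-process-group property described earlier.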


Google File System

It is a scalable file system for large, distributed, data-intensive applications. The design is driven by observations of application workloads and the technical environment, both current and anticipated.

The architecture of the Google File System consists of a single master, multiple chunk-servers and multiple clients. The chunk-servers store file data in units called chunks. Each chunk is identified by a globally unique 64-bit chunk handle assigned by the master at the time of chunk creation. For reliability and availability, each chunk is replicated on multiple chunk-servers. The master maintains all the file system metadata, which includes the name space, the chunk locations and the access control information.

Master and Chunk-Server Communication:
a) To check whether any chunk-server is down.
b) To check whether any file is corrupted.
c) To decide whether to create or delete any chunk.
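One way to picture point a) is a master that records when each chunk-server last reported in. The sketch below is a hedged illustration, not GFS code: the class `MasterMonitor`, its method names, the timeout value and the replica target passed as `want` are all assumptions made for the example.

```python
class MasterMonitor:
    """Toy view of the master's bookkeeping about chunk-servers."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}   # chunk-server id -> time of its last report
        self.chunks_on = {}   # chunk-server id -> set of chunk handles it holds

    def heartbeat(self, server_id, chunk_handles, now):
        # A chunk-server reports that it is alive and which chunks it stores.
        self.last_seen[server_id] = now
        self.chunks_on[server_id] = set(chunk_handles)

    def down_servers(self, now):
        # Servers that have been silent for longer than the timeout.
        return sorted(s for s, t in self.last_seen.items()
                      if now - t > self.timeout_s)

    def under_replicated(self, now, want):
        # Chunks with fewer than `want` replicas on servers that are still up.
        down = set(self.down_servers(now))
        live_count = {}
        for server, handles in self.chunks_on.items():
            for h in handles:
                live_count.setdefault(h, 0)
                if server not in down:
                    live_count[h] += 1
        return sorted(h for h, n in live_count.items() if n < want)
```

With this bookkeeping the master can decide where new replicas need to be created (point c). Corruption detection (point b) is not modelled here.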

Operation of GFS:
a) The client communicates with the master to get the metadata.
b) The client gets the chunk locations from the metadata.
c) The client communicates with one of those chunk-servers to retrieve the data and perform operations on it.
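The three steps of a read can be sketched as follows. This is a minimal in-memory model, not the GFS API: the classes `Master`, `ChunkServer` and `Client` and their methods are hypothetical, the 64 MB default chunk size is taken from the GFS paper rather than from these notes, and reads that cross a chunk boundary are not handled.

```python
CHUNK_SIZE = 64 * 2**20  # 64 MB, the chunk size reported in the GFS paper


class Master:
    """Holds metadata only: file -> chunk handles, handle -> replica locations."""

    def __init__(self):
        self.files = {}       # file name -> list of chunk handles, in order
        self.locations = {}   # chunk handle -> list of chunk-server ids

    def lookup(self, name, chunk_index):
        handle = self.files[name][chunk_index]
        return handle, self.locations[handle]


class ChunkServer:
    """Holds the actual chunk data."""

    def __init__(self):
        self.chunks = {}      # chunk handle -> bytes

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


class Client:
    def __init__(self, master, servers, chunk_size=CHUNK_SIZE):
        self.master = master
        self.servers = servers        # chunk-server id -> ChunkServer
        self.chunk_size = chunk_size

    def read(self, name, offset, length):
        # a) Ask the master for metadata about the chunk that covers `offset`.
        index, within = divmod(offset, self.chunk_size)
        handle, replicas = self.master.lookup(name, index)
        # b) The metadata gives the chunk locations; pick one replica.
        server = self.servers[replicas[0]]
        # c) Fetch the data from that chunk-server, not from the master.
        return server.read(handle, within, length)
```

Note that file data never passes through the master in this sketch; the master only answers the metadata question, and the client then talks to a chunk-server directly.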