DistOS 2014W Lecture 16

From Soma-notes

Public Resource Computing


Outline for upcoming lectures

All the papers to be covered in upcoming lectures have been posted on the wiki. These papers will be more difficult than the ones we have covered so far, so we should allot more time to studying them and come to class prepared. We may drop the format of discussing the papers in groups; instead, everyone would ask questions about what they did not understand in the paper, which should let us discuss the technical details better. The professor will not be teaching the next class; instead, our TA will discuss the two papers on how to conduct a literature survey, which should help with our projects.


Project proposal

There were 11 proposals, of which the professor found 4 ready to be accepted and graded them 10/10. The professor has emailed everyone feedback on their project proposal so that we can incorporate the comments and resubmit by this coming Saturday (the extended deadline). The deadline was extended so that everyone can work out the flaws in their proposal and get the best grade (10/10). Project presentations will be held on April 1st and 3rd. People who got 10/10 should be ready to present on Tuesday, as they are ahead and better prepared; there should be 6 presentations on Tuesday and the rest on Thursday. Undergraduates will have their final exam on April 24th, which is also the due date for the final project report.

Public Resource Computing

The papers assigned for reading were on SETI@home and BOINC. BOINC is the platform SETI is built upon; the readings list other projects on the same platform, such as Folding@home. What is public resource computing, and how does it compare to the file systems we have studied so far in this course?

In a public resource computing system, the problem is divided into work units. People voluntarily install a client on their machines, and the client runs the program on the work unit assigned to it. In return, volunteers earn credits, and those with the most credits are showcased as major contributors. The client's GUI shows people how much of their resources (processor cycles etc.) they have devoted to the cause. When a client finishes the work unit it was working on, it sends the result to the server.
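The cycle described above can be sketched in a few lines of Python. This is only an illustration, not BOINC's actual protocol: the names `make_work_units`, `Server`, `fetch_work`, `report_result` and `run_client`, the fixed unit size, the stand-in computation, and the one-credit-per-unit rule are all assumptions made for the example.

```python
from collections import defaultdict, deque


def make_work_units(samples, unit_size):
    """Split a large data set into fixed-size, independent work units."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


class Server:
    """Hypothetical central server: hands out work units and records credit."""

    def __init__(self, work_units):
        self.pending = deque(enumerate(work_units))  # (unit_id, data) pairs
        self.results = {}                            # unit_id -> result
        self.credits = defaultdict(int)              # client name -> credit

    def fetch_work(self):
        """Return the next (unit_id, data) pair, or None when no work is left."""
        return self.pending.popleft() if self.pending else None

    def report_result(self, client, unit_id, result):
        """Store a client's result and award one credit (an assumed rule)."""
        self.results[unit_id] = result
        self.credits[client] += 1

    def leaderboard(self):
        """Clients sorted by credit, highest first."""
        return sorted(self.credits.items(), key=lambda kv: kv[1], reverse=True)


def run_client(server, client, max_units):
    """A volunteer client: fetch a unit, compute on it, send the result back."""
    for _ in range(max_units):
        work = server.fetch_work()
        if work is None:
            break
        unit_id, data = work
        server.report_result(client, unit_id, max(data))  # stand-in computation


server = Server(make_work_units(list(range(10)), unit_size=2))
run_client(server, "alice", max_units=3)
run_client(server, "bob", max_units=5)
print(server.leaderboard())
```

Note that the clients never talk to each other; all coordination goes through the server, which matches the centralized design discussed in these notes.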

Comparison with the file systems we have covered so far:

1) The use cases have been turned on their head. In the file systems we have covered so far, people want access to files stored in a networked system; here, a system wants access to people's machines in order to use their processing power.
2) The other file systems were about many clients sharing data; here it is more about sharing processing power. In Folding@home, the system can store some of its data on the client's storage, but that is not the main focus of public resource computing.
3) It is nothing like systems such as OceanStore, where there is no centralized authority. In BOINC, the master/slave relationship between the centralized server and the clients installed on users' machines is still visible; in that sense it is more like GFS, which also had a centralized metadata server.
4) Public resource systems resemble botnets, except that people install the clients with consent. There is no need for communication between clients (it is not a peer-to-peer network). Clients could be made to communicate peer to peer, but that would be a security risk because clients are not trusted in the network.

Reliability - How does SETI address fault tolerance? It uses replication for reliability: each work unit is assigned to multiple clients, and the results returned to the server can be compared to find outliers and so detect malicious users. That, however, only addresses fault tolerance on the client side.

SETI has a centralized server, which can go down. When it does, an exponential backoff mechanism pushes the clients back and makes them wait before trying to send their results again. Even so, when the server comes back up, many clients may try to reach it at once and crash it again, in effect a DDoS brought on by the server's own inadequacies. The exponential backoff approach is similar to the one used in handling TCP congestion.

Public resource computing does not need to be highly reliable, because it is used by scientists and researchers who can bring the system back up and start again if it goes down. A few measures are nevertheless discussed for SETI, such as read-only data backups. Compare this with highly reliable systems like Ceph or OceanStore, which can recover data when nodes crash.
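The two mechanisms above, comparing replicated results and backing off exponentially, can be sketched as follows. This is a minimal illustration under stated assumptions, not SETI's or BOINC's actual code: `validate`, `backoff_delay`, the `quorum` parameter, and the base and cap values are hypothetical, and the random jitter in the delay is an addition of this sketch aimed at the "everyone returns at once" problem, not something these notes attribute to SETI.

```python
import random
from collections import Counter


def validate(results, quorum):
    """Accept the value reported by at least `quorum` clients, else None.

    `results` maps client name -> reported value for one work unit.
    Returns (accepted_value, outliers); outliers are clients that disagreed.
    """
    if not results:
        return None, []
    value, votes = Counter(results.values()).most_common(1)[0]
    if votes < quorum:
        return None, []  # no agreement yet: the unit needs more replicas
    outliers = sorted(c for c, v in results.items() if v != value)
    return value, outliers


def backoff_delay(attempt, base=1.0, cap=3600.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    The upper bound doubles on every failure, up to `cap`. Picking a random
    point below the bound (an assumption of this sketch) spreads clients out
    so they do not all come back at the same moment.
    """
    return rng.uniform(0, min(cap, base * 2 ** attempt))


value, outliers = validate({"a": 42, "b": 42, "c": 7}, quorum=2)
print(value, outliers)
```

A client that repeatedly shows up in `outliers` is either faulty or malicious; what to do with it (ignore it, withhold credit) is a policy choice outside this sketch.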

Skype was modeled like a public resource computing network (before Microsoft took it over): the network would choose supernodes to act as routers, picking the machines with higher reliability and more processing power. After Microsoft's takeover, the supernodes were centralized and supernode election was removed from the system.

SETI is an example of an "embarrassingly parallel" workload: the problem naturally divides into work units that can be computed in parallel without any need to consolidate the results. It is called "embarrassingly parallel" because little to no effort is required to distribute the workload in parallel, so you do not get much praise for doing it. Another example of an "embarrassingly parallel" workload from the systems we have covered so far is web indexing on GFS. Any file system we have discussed that does not trust its clients could be modeled to work as a public resource computing system.
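To make the "embarrassingly parallel" property concrete, here is a small sketch in which each work unit is processed with no reference to any other. The function `analyze` and the sample data are made up for illustration, and a thread pool merely stands in for volunteer machines.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(unit):
    """Stand-in for per-unit work: it reads only its own unit."""
    return sum(x * x for x in unit)


work_units = [[1, 2], [3, 4], [5, 6]]

# No unit depends on another, so they can run in any order, on any worker.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(analyze, work_units))

# Running the same units one after another gives the same answers.
serial = [analyze(u) for u in work_units]
print(parallel == serial)
```

Because the units are independent, adding workers requires no change to `analyze` itself, which is why this kind of workload suits volunteer machines that never communicate with each other.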