Difference between revisions of "DistOS 2015W Session 9"

From Soma-notes

Revision as of 20:42, 9 March 2015

BOINC

  • Public Resource Computing Platform
  • Gives scientists the ability to use large amounts of computational resources donated by volunteers.
  • The clients do not connect directly with each other; instead they talk to a central project server (for SETI@Home, located at Berkeley)
  • The goals of BOINC are:
  • 1) Reduce the barriers of entry to public resource computing
  • 2) Share resources among autonomous projects
  • 3) Support diverse applications
  • 4) Reward participants
A BOINC project is identified by a single master URL,
which serves as the project's home page as well as a directory of its servers.

SETI@Home

  • Uses public resource computing to analyze radio signals in search of extraterrestrial intelligence
  • Needs a good-quality radio telescope to collect the signals and a large amount of computational power to analyze them, which was unavailable locally
  • It has not yet found extraterrestrial intelligence, but it has established the credibility of public resource computing projects, whose computing power is donated by the public
  • Uses BOINC as the backbone of the project
  • Uses a relational database to store information on a large scale, and a multi-threaded server to distribute work to clients

MapReduce

  • A programming model presented by Google to do large scale parallel computations
  • Builds on the Map() and Reduce() functions found in functional programming languages
  • Map (Filtering)
  • Takes a function and applies it to all elements of the given data set
  • Reduce (Summary)
  • Accumulates results from the data set using a given function
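
The Map and Reduce steps above can be illustrated with a minimal single-machine sketch in Python. This is not Google's implementation: the helper names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the word-count task is only an assumed example. A real MapReduce system runs these phases in parallel across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(records, map_fn):
    """Apply map_fn to every record and collect the (key, value) pairs it emits."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs


def shuffle(pairs):
    """Group all values that share a key (done by the framework between Map and Reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reduce_fn):
    """Accumulate the values for each key into a single result using reduce_fn."""
    return {key: reduce(reduce_fn, values) for key, values in grouped.items()}


def word_count(lines):
    """Hypothetical example job: count how often each word occurs."""
    pairs = map_phase(lines, lambda line: [(word.lower(), 1) for word in line.split()])
    return reduce_phase(shuffle(pairs), lambda a, b: a + b)


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog"]))
```

Because the map function looks at one record at a time and the reduce function looks at one key at a time, both phases can be split across machines without the user writing any coordination code.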

Naiad

  • A programming model similar to MapReduce but with streaming capabilities, so that results are available almost instantaneously
  • A distributed system for executing data-parallel, cyclic dataflow programs, offering high throughput and low latency
  • Aims to provide a general-purpose system that fulfills these requirements and also supports a wide variety of high-level programming models.
  • Real Time Applications:
  • Batch iterative Machine Learning:

Vowpal Wabbit (VW), an open-source distributed machine learning library, performs each iteration in three phases: each process updates its local state; the processes independently train on their local data; and the processes jointly compute a global average, which is an AllReduce operation.
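
The three phases can be sketched as a single-process simulation in Python. This is neither VW nor Naiad code: the one-parameter linear model, the learning rate, and the names local_train, all_reduce_average and run_iterations are assumptions made for illustration, and the "processes" are simply entries in a list.

```python
def local_train(weight, shard, lr=0.1):
    """One gradient step on local data for the assumed model y = weight * x."""
    grad = sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)
    return weight - lr * grad


def all_reduce_average(values):
    """AllReduce with averaging: every process ends up holding the global mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def run_iterations(shards, n_iters=50, lr=0.1):
    """Simulate one worker per data shard, all starting from weight 0.0."""
    global_weights = [0.0] * len(shards)
    for _ in range(n_iters):
        # Phase 1: each process updates its local state from the last global average.
        local_weights = list(global_weights)
        # Phase 2: processes independently train on their own local data.
        local_weights = [local_train(w, shard, lr)
                         for w, shard in zip(local_weights, shards)]
        # Phase 3: processes jointly compute the global average (AllReduce).
        global_weights = all_reduce_average(local_weights)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical shards drawn from y = 3x.
    shards = [[(1.0, 3.0), (2.0, 6.0)], [(1.0, 3.0), (0.5, 1.5)]]
    print(run_iterations(shards))
```

The loop over iterations is what makes the computation cyclic: each round feeds the averaged result back into the next one, which is the kind of dataflow Naiad is designed to execute.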