DistOS 2015W Session 9: Difference between revisions
Ericadamski (talk | contribs) |
Revision as of 00:42, 10 March 2015
BOINC
- Public resource computing platform
- Gives scientists access to large amounts of computing resources donated by volunteers.
- Clients do not connect directly to each other; instead they talk to a central server located at Berkeley
- The goals of BOINC are:
- 1) Reduce the barriers to entry
- 2) Share resources among autonomous projects
- 3) Support diverse applications
- 4) Reward participants
A BOINC project is identified by a single master URL,
which serves as its homepage as well as the directory of its servers.
SETI@Home
- Uses public resource computing to analyze radio signals in search of extraterrestrial intelligence
- Needs a good-quality telescope to capture radio signals and a large amount of computational power, which was not available locally
- It has not yet found extraterrestrial intelligence, but it has established the credibility of public resource computing projects, whose computing resources are donated by the public
- Uses BOINC as the backbone of the project
- Uses a relational database to store information at large scale, and a multi-threaded server to distribute work to clients
MapReduce
- A programming model presented by Google to do large scale parallel computations
- Uses the Map() and Reduce() functions from functional-style programming languages
- Map (Filtering)
- Takes a function and applies it to all elements of the given data set
- Reduce (Summary)
- Accumulates results from the data set using a given function
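The two functions above can be illustrated with a minimal single-machine sketch. It uses Python's built-in `map` and `functools.reduce` rather than Google's distributed MapReduce system; the helper names `square` and `add` and the sample data are made up for illustration.

```python
from functools import reduce


def square(x):
    # Function applied independently to every element (the "map" step).
    return x * x


def add(accumulator, x):
    # Function used to fold the mapped values into one result (the "reduce" step).
    return accumulator + x


values = [1, 2, 3, 4]

# Map: apply the function to all elements of the data set.
mapped = list(map(square, values))

# Reduce: accumulate the mapped results, starting from 0.
total = reduce(add, mapped, 0)

print(mapped, total)
```

Because each call to `square` is independent of the others, the map step is the part a distributed system could spread across many machines.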
Naiad
- A programming model similar to MapReduce but with streaming capabilities so that data results are almost instantaneous
- A distributed system for executing data parallel cyclic dataflow programs offering high throughput and low latency
- Aims to provide a general-purpose system that meets these requirements and also supports a wide variety of high-level programming models.
- Real-time applications:
- Batch iterative machine learning:
Vowpal Wabbit (VW), an open-source distributed machine learning library, performs each iteration in three phases: each process updates its local state; the processes independently train on their local data; and the processes jointly compute a global average, which is an AllReduce.
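The three-phase iteration described above can be sketched as a single-process simulation. This is not VW's or Naiad's actual API: the function names, the `weight`/`step` state layout, the learning rate, and the toy data shards are all assumptions chosen only to show the update, train, and average pattern.

```python
def update_local_state(state):
    # Phase 1 (assumed meaning): each process advances its own bookkeeping.
    return {"weight": state["weight"], "step": state["step"] + 1}


def train_locally(state, samples, learning_rate=0.5):
    # Phase 2: each process trains on its own shard only.
    # Toy model: move the weight toward the mean of the local samples.
    target = sum(samples) / len(samples)
    new_weight = state["weight"] + learning_rate * (target - state["weight"])
    return {"weight": new_weight, "step": state["step"]}


def all_reduce_average(states):
    # Phase 3: every process contributes its weight and receives the global average.
    average = sum(s["weight"] for s in states) / len(states)
    return [{"weight": average, "step": s["step"]} for s in states]


def run_iteration(states, shards):
    states = [update_local_state(s) for s in states]
    states = [train_locally(s, shard) for s, shard in zip(states, shards)]
    return all_reduce_average(states)


# Two simulated processes, each holding its own shard of the data.
shards = [[1.0, 3.0], [5.0, 7.0]]
states = [{"weight": 0.0, "step": 0} for _ in shards]
states = run_iteration(states, shards)
print(states)
```

In a real deployment the averaging step is a collective communication across machines; here a plain list stands in for the set of processes.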