DistOS 2021F 2021-11-09
Notes
Lecture 15
----------
- experiences
- proposal
- midterm update
- participation

Spanner
- a big, distributed (semi-)relational database
- very consistent
- supported SQL
  - all of the query parts
  - management, maybe not so much?
- big deal, because of usability
  - developers know SQL
  - want transactions, helpful for consistency across tables

In distributed systems, we're always making tradeoffs between functionality, scalability, and complexity
- normally we just think about functionality vs. scalability (SQL vs. NoSQL)
- but add complexity and you can get functionality & scalability at the same time

Spanner is proprietary to Google; others have made their own versions (CockroachDB).

The tradeoff also shows up in TensorFlow
- for "machine learning" - what is it really for?
- working with n-dimensional arrays (i.e., tensors)
- and we can do neural networks if we can do fast tensor processing

Is this just the same thing as MapReduce? What's different?
- not embarrassingly parallel!
- tasks have to communicate as they run, not just at the end (i.e., during reduce)

Modern machine learning is based on large, mutable models
- MANY parameters (weights in the neural network)

Basic idea of a neural network
- input, hidden, and output nodes
- input nodes are connected to layers of hidden nodes
- hidden nodes are connected to output nodes
- weights on connections between nodes determine how values are transformed as they propagate along those connections

So here, take an input tensor and transform it a bunch of times until you get an output tensor. All that "deep" learning means is that it is a neural network with many, many layers of hidden nodes.

The cool part about TensorFlow is that you don't care about the hardware
- your data model can be efficiently mapped onto a wide variety of architectures
- a big change from past efforts in supercomputing
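The "transactions for consistency across tables" point above can be sketched concretely. Spanner itself is proprietary, so as a stand-in this uses Python's stdlib sqlite3; the tables, names, and quantities are all made up for illustration - the point is just that both tables change together or not at all.

```python
import sqlite3

# Hypothetical two-table schema: recording an order and decrementing stock
# must succeed or fail together -- the consistency that transactions buy you.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER);
    INSERT INTO inventory VALUES ('widget', 5);
""")

def place_order(conn, item, qty):
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)",
                         (item, qty))
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? "
                "WHERE item = ? AND stock >= ?",
                (qty, item, qty))
            if cur.rowcount == 0:
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False

place_order(conn, "widget", 3)  # both tables updated
place_order(conn, "widget", 9)  # rolled back: neither table changes
```

Without the transaction, the failed second order would leave an orders row with no matching stock decrement; a distributed SQL system like Spanner gives the same guarantee across machines.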
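The "transform an input tensor through weighted layers" idea can be sketched in plain Python (no TensorFlow needed to see the structure); the layer sizes and weight values here are arbitrary, made up for illustration.

```python
# Minimal neural-network forward pass: each layer is a weight matrix.
# A value propagates along each connection scaled by that connection's
# weight, then passes through a nonlinearity (ReLU here).

def matvec(weights, vec):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def relu(vec):
    return [max(0.0, x) for x in vec]

def forward(layers, x):
    """Push an input tensor (here, a 1-D vector) through every layer."""
    for weights in layers:
        x = relu(matvec(weights, x))
    return x

# A tiny "deep" network: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs.
# More hidden layers stacked here is all that "deep" means.
layers = [
    [[0.2, -0.1, 0.4], [0.5, 0.3, -0.2], [-0.3, 0.8, 0.1], [0.1, 0.1, 0.1]],
    [[0.3, -0.5, 0.2, 0.7], [0.6, 0.1, -0.4, 0.2],
     [-0.2, 0.4, 0.5, -0.1], [0.1, -0.3, 0.2, 0.4]],
    [[0.5, -0.2, 0.3, 0.1], [-0.4, 0.6, 0.2, 0.3]],
]
output = forward(layers, [1.0, 0.5, -0.5])
```

What TensorFlow adds is expressing exactly this computation as a dataflow graph of tensor operations, so the runtime can map it efficiently onto CPUs, GPUs, or clusters without the model code caring which.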