Notes
Lecture 15
----------
- experiences
- proposal
- midterm update
- participation
Spanner
- a big, distributed (semi-)relational database
- very consistent
  - supports SQL
- all of the query parts
- management, maybe not so much?
- big deal, because of usability
- developers know SQL
- want transactions, helpful for consistency
across tables
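To see what a cross-table transaction buys you, here's a minimal sketch using SQLite from the Python standard library as a stand-in (Spanner's actual API, SQL dialect, and distribution machinery are very different; the `accounts`/`ledger` tables are made up for illustration):

```python
# Sketch: both tables change atomically, or neither does.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE ledger   (account_id INTEGER, delta INTEGER);
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(conn, src, dst, amount):
    # "with conn" opens a transaction; it commits on success and
    # rolls back every statement if anything raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                     (amount, dst))
        conn.execute("INSERT INTO ledger VALUES (?, ?), (?, ?)",
                     (src, -amount, dst, amount))

transfer(conn, 1, 2, 30)
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The point is the guarantee, not the engine: developers get to reason about multi-table updates as a single all-or-nothing step, which is exactly what NoSQL stores typically gave up.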
in distributed systems, we're always making tradeoffs
between functionality, scalability, and complexity
- normally we just think about functionality vs scalability
(SQL vs NoSQL)
- but add complexity and you can get functionality & scalability at the same time
Spanner is proprietary to Google, others have
made their own versions (CockroachDB)
Tradeoff also shows up in TensorFlow
- for "machine learning"
- what is it really for?
- working with n-dimensional arrays (i.e. tensors)
- and we can do neural networks if we can do fast
tensor processing
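Concretely, a 2-D tensor is just a matrix, and the core operation is things like matrix multiply. A pure-Python sketch of the math (TensorFlow's whole job is dispatching this kind of operation to fast hardware kernels; nested lists here are just for illustration):

```python
# Multiply a (rows x inner) matrix by an (inner x cols) matrix.
def matmul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

# (2x3) @ (3x2) -> (2x2)
out = matmul([[1, 2, 3],
              [4, 5, 6]],
             [[7, 8],
              [9, 10],
              [11, 12]])
```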
Is this just the same thing as MapReduce?
- what's different?
  - not embarrassingly parallel!
- have to communicate between tasks as they run,
not just at the end (i.e., during reduce)
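A toy sketch of why training isn't embarrassingly parallel: on every step, each worker computes a gradient on its data shard, and then all workers must exchange (here: average) those gradients before anyone can take the next step. The loss and shard values are made up; the averaging stands in for the per-step communication that MapReduce only does once, during reduce.

```python
# Data-parallel gradient descent on loss (w - x)^2, one shard per worker.
def train(shards, w=0.0, lr=0.1, steps=20):
    for _ in range(steps):
        # each "worker" computes its local gradient
        grads = [2 * (w - x) for x in shards]
        # communication barrier: all-reduce before ANY worker continues
        w -= lr * sum(grads) / len(grads)
    return w

w = train([1.0, 2.0, 3.0])  # converges toward the shard mean, 2.0
```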
Modern machine learning is based on large, mutable models
- MANY parameters (weights in the neural network)
Basic idea of a neural network
- input, hidden, and output nodes
- input nodes are connected to layers of hidden nodes
- hidden nodes are connected to output nodes
- weights on connections between nodes determine
how values are transformed as they propagate along connections between nodes
So here, take an input tensor, transform it a bunch of times until you get an output tensor
All "deep" learning means is that it's a neural network
with many, many layers of hidden nodes
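Putting the pieces above together, here's a forward pass through a tiny "deep" network: the input vector goes through several hidden layers, and the weights on each layer's connections determine how values are transformed as they propagate. The weights and sizes are made up for illustration (real networks have vastly more parameters, and trained rather than hand-picked weights):

```python
# One hidden-layer transformation: each output node is a weighted sum
# of the inputs, followed by a ReLU nonlinearity.
def relu(v):
    return [max(0.0, x) for x in v]

def layer(weights, inputs):
    # one output per row of weights
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def forward(layers, inputs):
    for weights in layers[:-1]:
        inputs = relu(layer(weights, inputs))  # hidden layers
    return layer(layers[-1], inputs)           # output layer (no ReLU)

net = [
    [[1.0, -1.0], [0.5, 0.5]],   # hidden layer 1: 2 -> 2
    [[1.0, 1.0], [-1.0, 2.0]],   # hidden layer 2: 2 -> 2
    [[1.0, 1.0]],                # output layer:   2 -> 1
]
out = forward(net, [2.0, 1.0])
```

Adding "depth" is just appending more weight matrices to `net`; the per-layer computation never changes, which is why fast tensor processing is all you need.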
The cool part about TensorFlow is you don't have to care about the hardware
- your data model can be efficiently mapped onto a wide
variety of architectures
- big change from past efforts in supercomputing