DistOS 2018F 2018-11-28
Readings
"Serverless Computing"
- Wikipedia article on Serverless Computing
- Wikipedia article on Google App Engine
- Wikipedia article on AWS Lambda
- Google App Engine
- AWS Lambda Developer Guide
- serverless documentation
Notes
The dream: put an app on the cloud and have it scale up automatically, without having to worry about the implementation.
This is not similar to UNIX at all; it is a very restricted run-time with specific semantics. Kinda the dream and kinda not. What is it all in service of? The web.
In the marketing there are Google App Engine, AWS Lambda and Google Cloud Functions. Google App Engine came first and AWS Lambda is more function-oriented, but they don't look that different. Google App Engine is more for web apps and AWS Lambda is supposed to handle functions, but they are all RESTful so they look like web clients. You can send events and streams of data, so it is more than the web, but it uses web interfaces. The IO is the web and that is the key thing: the file system has been replaced by web URLs, RPCs have been replaced by web requests, and it is all tied together by being kinda stateless. HTTP was a stateless protocol. That is how all of this is built and how the web won: simple and stateless. State only had to be added in as needed, so it is scalable yet functional. You can hack on functionality but not scalability, otherwise it will fail badly.
What is the architecture? What is the programming model? A language run-time environment that makes it easy to handle incoming requests: a function that will be called when the request is made, routed through who knows how many machines. If the function needs to access some sort of state, it accesses a database using standard libraries. So the code starts running on an incoming request and manipulates state. That is it, that is your model. With AWS Lambda and Google App Engine, code does not run if there is nothing to do; it is purely event driven. Bind a function to incoming requests and it runs when they arrive. You don't control the part of the system that decides where and when your code runs. All you control is what you do in response to the requests. Other languages are now supported, but in each one you define a function that responds to events. There is no main loop, only code running in response to events.
Use a data store like SQL and standard routines to access it. Upload a file to the system and it is stored as a blob in the database. Not a hierarchical key-value store; more of a database than a file system. That is the programming model: a model for a web application.
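A minimal sketch of that programming model, in Python. The handler(event, context) shape follows the style AWS Lambda uses for Python functions; the table name, the "name" field in the event, and the in-process SQLite connection are assumptions made for illustration only (a real platform would give you a networked, managed data store).

```python
import json
import sqlite3

# Stand-in for the external data store (assumption: SQLite in memory is used
# here only so the sketch runs on its own; it is not what a provider offers).
DB = sqlite3.connect(":memory:")
DB.execute("CREATE TABLE IF NOT EXISTS visits (name TEXT PRIMARY KEY, hits INTEGER)")


def handler(event, context=None):
    """Called once per incoming request. There is no main loop."""
    name = event.get("name", "anonymous")  # hypothetical request field
    # All state lives in the data store, never in the function itself.
    DB.execute("INSERT OR IGNORE INTO visits (name, hits) VALUES (?, 0)", (name,))
    DB.execute("UPDATE visits SET hits = hits + 1 WHERE name = ?", (name,))
    DB.commit()
    (hits,) = DB.execute(
        "SELECT hits FROM visits WHERE name = ?", (name,)
    ).fetchone()
    # The response looks like a web response: a status code and a body.
    return {"statusCode": 200, "body": json.dumps({"name": name, "visits": hits})}
```

Calling handler({"name": "alice"}) twice returns a body whose visit count goes up by one each time, even though the function itself keeps nothing between calls.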
When data becomes available, your code gets a call, retrieves the data, processes it and stores the results in a database. Does this model make sense for a long-running batch job? No, this is more for IO-heavy or dynamic websites and other business logic for an application. Most of the stuff we do now does not require a lot of cycles.
This is the closest we have to the ideal of a platform for building distributed applications. Nothing academic about this. How are they running something like Node.js on the back-end? Containers (to share hardware), VMs (to allocate the hardware) and orchestration. Spin up an instance to service requests until it is loaded, then spin up another one, etc. It is just the web, the same technology that has been in development for the last 20 years. That is the weird thing: nobody designed this. This is stuff that was lying around, and it got built by putting the pieces together: APIs, isolation, etc. So it is a complete mess. Kubernetes as a technology is very complicated and overkill for most organizations. So some folks deal with the pain of setting it up and then they sell it. You don't want to manage thousands of machines when you only need one; then when you need more, someone else can take care of that. But the environment being provided is a web server talking to a database. How do they do it? We have talked about it. Everything is stateless between these services. Have something else you want to do? Make it look like a web application: log an event and return a response, but instead of sending it to the app that generated the event, store it in a database or generate an alert.
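A rough sketch of the "service requests until loaded, then spin up another one" idea, in Python. The Pool class, its capacity parameter and the dispatch method are hypothetical names for illustration; real orchestrators also scale down, track request completion and use many more signals than a simple in-flight count.

```python
class Pool:
    """Toy dispatcher: fill existing instances, start a new one when all are full."""

    def __init__(self, capacity):
        self.capacity = capacity  # max in-flight requests per instance (assumed)
        self.instances = []       # in-flight request count for each instance

    def dispatch(self):
        """Route one request and return the index of the instance that got it."""
        for i, load in enumerate(self.instances):
            if load < self.capacity:
                self.instances[i] += 1
                return i
        # Every instance is loaded (or none exist yet): spin up another one.
        self.instances.append(1)
        return len(self.instances) - 1
```

With a capacity of 2, five requests in a row land on instances 0, 0, 1, 1, 2. Nothing exists until the first request arrives, which matches the "code does not run if there is nothing to do" property.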
That wasn't the goal or the plan; no sane computer scientist would have built it this way, but when you put it together, it does work. Looking at the pieces individually it is understandable; put it all together and it is crazy. Is this proprietary? What is offered differs in the details but is fundamentally the same. All the languages are open source and all the APIs are well known, and people are building copies with their own stuff on top. If you tie yourself to API services there might be some lock-in, but if it is open source, someone else may have used the same foundation and built more stuff on top, and you can move your code to them.
Legal framework: with encryption you can make things private.
WebAssembly: JavaScript was the language of the web, then came Node.js. This sucks; we want to run all languages with a sandbox model and machine code. WebAssembly is assembly code for the web: a target for a compiler that runs within a sandbox, can integrate with the DOM and JavaScript, and can share data back and forth, running at near-native speed but isolated. This is how everything is going to run, period: server, client, everywhere. Because we want sandboxing in the browser and on the server, since it makes containers better. Better than the stupid stuff being done with Linux and the POSIX API to isolate things. POSIX is going away; it will be there under the surface and for hosting, but we will not be talking about it. Take Go as a language: Go is all statically linked. A Go binary is just a self-contained thing, statically linked, since they are thinking like containers: reliable and reproducible, so containerized from the get-go. A container contains one binary, really simple, with no dependencies. Just the Linux kernel binary interface. These and WebAssembly will likely merge in the future. What will be preserved is WebAssembly, because we don't need POSIX; it mostly gets in the way. We just need to support the web technologies. Which is weird since UNIX was successful, but that will push it on its way out, pushed so far down that it is hard to see.
Wrapping of state so that it can just move around. A container might be downloaded to a laptop or phone to run: bundles of state, but a cached version of what is elsewhere. Why store permanent state on your own device? The copy might be on a server or in the cloud; it does not matter as long as it is there.
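One way to picture "a cached version of what is elsewhere" is a write-through cache, sketched here in Python. The CachedState class and its method names are hypothetical, and a plain dict stands in for the server or cloud copy; this is an illustration of the idea, not how any particular platform does it.

```python
class CachedState:
    """Local device state that is only a cache of the authoritative remote copy."""

    def __init__(self, remote):
        self.remote = remote  # authoritative copy (a dict stands in for the cloud)
        self.cache = {}       # disposable local copy

    def get(self, key):
        # Fetch from the remote copy only on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.remote[key]
        return self.cache[key]

    def put(self, key, value):
        # Write through: the remote copy is updated first, then the cache.
        self.remote[key] = value
        self.cache[key] = value

    def evict(self):
        # The device can drop everything; nothing permanent lived here.
        self.cache.clear()
```

After evict(), every value can still be read back with get(), because the permanent copy was never on the device in the first place.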
Management of state is what is painful, so it is getting abstracted and pushed away; state management is becoming some sort of service. So this is fascinating, and there is a lot of complexity underneath it.
We find ways to punish when trust is violated, but we need to trust, otherwise we have a lot of waste and a lot of trouble. Cryptocurrencies try not to trust, but they have shifted trust elsewhere and now there are lots of problems.
Test:
Question 2: f4 encrypts files, and people called it untrusted. No, it's not; it is completely trusted because they have all the keys and all the data. Encryption just makes it easier to delete. AFS did not trust clients, and OceanStore, Farsite and BOINC have the largest level of distrust. Otherwise, everything else does trust everything.