DistOS 2023W 2023-01-30
==Discussion Questions==
* Why do RPCs require data to be copied?
* Why is transparency important? Do you think it is still important?
* How does SUN's RPC compare to that described by Nelson?
* Why did SUN make its own RPC mechanism?
* What are alternatives to RPC?
* Do you think RPCs are good for security? Why or why not?
==Notes==
<pre>
Lecture 7
---------
Why do RPCs require data to be copied?
Why is transparency important? Do you think it is still important?
How does SUN's RPC compare to that described by Nelson?
Why did SUN make its own RPC mechanism?
What are alternatives to RPC?
Do you think RPCs are good for security? Why or why not?
Where did Nelson work? Xerox PARC!
- was involved with Smalltalk, etc.
What is the main alternative to RPC?
- message passing
- because that's how networks work
What is RPC inspired by?
- procedure calls in ALGOL-like languages
So what is a procedure call?
- gather arguments
- transfer control
- receive return values
Note that when you call a procedure (function), the caller doesn't do anything
until the call finishes
- so an RPC turns something that is inherently concurrent into something that is sequential
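A minimal Python sketch of the three steps above (not from the lecture):
a procedure call rebuilt from two messages. A thread and two queues stand in
for the second machine and the network, and the names callee_loop, call, and
PROCEDURES are made up for illustration. The caller gathers its arguments,
sends them, and then does nothing until the reply message arrives.

```python
import queue
import threading


def add(a, b):
    return a + b


# Hypothetical table of procedures the callee is willing to run.
PROCEDURES = {"add": add}


def callee_loop(requests):
    """Serve request messages until a None sentinel arrives."""
    while True:
        msg = requests.get()
        if msg is None:
            break
        name, args, reply_box = msg
        # Run the procedure and send its return value back as a message.
        reply_box.put(PROCEDURES[name](*args))


def call(requests, name, *args):
    """Procedure-call semantics built on top of message passing."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((name, args, reply_box))  # gather arguments, transfer control
    return reply_box.get()                 # block until the return value arrives


requests = queue.Queue()
worker = threading.Thread(target=callee_loop, args=(requests,))
worker.start()

result = call(requests, "add", 2, 3)

requests.put(None)  # tell the callee to stop
worker.join()
print(result)
```

The two sides could run concurrently, but call() forces them to take turns,
which is exactly the sequential behaviour described above.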
Why do procedure calls make sense locally?
- code for a single-threaded process runs on only one core at a time
  (there used to be only one core in total)
- so there's no problem with the caller waiting while the callee runs, because
  that's how you share the CPU (they take turns)
  (a natural way of multiplexing the CPU between different parts of a program)
- also, caller and callee share an address space, instruction set, data formats, etc., so it is easy to share arguments and pass data around
- names for code and data are obvious (memory addresses)
So when we move to a network, do the assumptions behind local procedure calls hold?
- not really
- note that there are always at least two CPUs, and the caller's CPU has nothing to do while the callee's CPU is working (other than run other tasks)
- and there is no shared memory, so you have to copy data between systems that may be
  fundamentally different
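A small Python sketch of why the data has to be copied (an illustration, not
SunRPC's actual wire format): with no shared memory, arguments are flattened
into bytes with an explicit byte order so that a machine with different native
conventions can read them. The helper names and the layout (a 4-byte count
followed by 4-byte signed integers, all big-endian) are assumptions made for
this example.

```python
import struct


def marshal_ints(values):
    """Copy a list of ints into a flat big-endian byte string: count, then items."""
    return struct.pack(f">I{len(values)}i", len(values), *values)


def unmarshal_ints(data):
    """Rebuild a fresh list from the byte string produced by marshal_ints."""
    (count,) = struct.unpack_from(">I", data, 0)
    return list(struct.unpack_from(f">{count}i", data, 4))


def remote_append_zero(wire_bytes):
    """Stands in for the callee: it only ever sees its own copy of the data."""
    values = unmarshal_ints(wire_bytes)
    values.append(0)
    return marshal_ints(values)


local = [1, -2, 3]
request = marshal_ints(local)
reply = unmarshal_ints(remote_append_zero(request))

print(len(request))  # 4-byte count + three 4-byte ints
print(reply)         # the callee's modified copy
print(local)         # the caller's list is untouched
```

A local callee could have appended to the caller's list in place; the remote
one cannot, so any change has to be copied back in the reply.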
So why do RPC?
- programmer convenience/ease of use
- that's what "transparency" is: the programmer doesn't know they are
  calling code on another system
SOAP (XML RPC), gRPC, protobuf?, GraphQL
- why so many implementations?
- why don't we just use what SUN built?
Did you notice how Sun's RPC mechanism worked at a network level?
- note that in practice nobody used DES for SunRPC... because of export controls
- but there is a fundamental aspect of SunRPC that doesn't play well with modern networks (the portmapper)
- first you'd connect to the portmap daemon, which would tell you which port a service lived on; then you'd connect to the service
- this isn't just firewall-unfriendly: firewalls were built to block portmap-based RPC
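The two-step lookup above can be sketched in Python as an in-process
simulation (no real sockets, and not the real portmap protocol). The Host
class, the rpc_call helper, the program number 0x20000001, and the starting
port 32768 are all hypothetical; only port 111 for the portmapper is the real
well-known value. The allowed_ports argument plays the role of a firewall with
a fixed list of open ports.

```python
PORTMAP_PORT = 111  # the portmapper's well-known port


class Host:
    """In-process stand-in for one server machine."""

    def __init__(self):
        self.registry = {}      # (program, version) -> port
        self.listeners = {}     # port -> handler function
        self.next_port = 32768  # hypothetical start of the dynamic port range

    def register(self, program, version, handler):
        """A service starts up, gets whatever port is free, and tells the portmapper."""
        port = self.next_port
        self.next_port += 1
        self.registry[(program, version)] = port
        self.listeners[port] = handler
        return port

    def connect(self, port, request, allowed_ports=None):
        """Deliver one request to a port, unless the 'firewall' blocks that port."""
        if allowed_ports is not None and port not in allowed_ports:
            raise ConnectionRefusedError(f"port {port} is blocked")
        if port == PORTMAP_PORT:
            return self.registry.get(request)
        return self.listeners[port](request)


def rpc_call(host, program, version, request, allowed_ports=None):
    # Step 1: ask the portmapper which port the service lives on.
    port = host.connect(PORTMAP_PORT, (program, version), allowed_ports)
    if port is None:
        raise LookupError("service not registered")
    # Step 2: connect to the service itself on that port.
    return host.connect(port, request, allowed_ports)


host = Host()
host.register(0x20000001, 1, str.upper)  # hypothetical program number

reply = rpc_call(host, 0x20000001, 1, "hello")

try:
    rpc_call(host, 0x20000001, 1, "hello", allowed_ports={PORTMAP_PORT})
    blocked = False
except ConnectionRefusedError:
    blocked = True

print(reply, blocked)
```

In this sketch a fixed allow-list cannot name the service's port ahead of
time, because the port is only known after asking the portmapper.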
From the beginning, RPC-based services were a horror from a security perspective
- is this incidental or inherent to RPC?
I say inherent because RPC is meant to be transparent to programmers
- so how does the programmer know they have to enforce security?
- we don't enforce security on every function call!
Security is all about maintaining good boundaries between systems
- RPC works against this goal
How do you define an interface using RPC?
- you just say what functions should be made available over RPC
- even if those functions don't sanitize their inputs
We go through these cycles
- new ways to do RPC
- new ways to block RPC
Whenever you see a "firewall friendly" way of doing RPC, it just means that it is bypassing the security boundary of the firewall without regard to security
A key way of securing RPC mechanisms (beyond encryption, authentication, & access control) is to use rich type systems to encode all the invariants that code assumes about its inputs (so input validation can be done automatically)
- but this is insufficient in practice because you always miss invariants, or
  some are just hard to encode in the form of types
RPCs are just bad at doing input validation against a malicious party
(but are good at stopping regular errors)
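A hedged Python sketch of that last point (the withdraw procedure, the
dispatch helper, and the balances table are invented for illustration, not
taken from any real RPC framework). The dispatcher validates argument types
automatically from the function's annotations. That stops a regular error
such as passing a string for an amount, but the invariant "amount is positive
and no larger than the balance" is not encoded in the types, so a malicious
value goes straight through.

```python
from typing import get_type_hints

balances = {"alice": 100}


def withdraw(account: str, amount: int) -> int:
    """Assumes, but never checks, that 0 < amount <= balance."""
    balances[account] -= amount
    return balances[account]


def dispatch(func, args):
    """Check argument names and types against func's annotations, then call it."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    if set(args) != set(hints):
        raise TypeError("unexpected or missing arguments")
    for name, expected in hints.items():
        if type(args[name]) is not expected:
            raise TypeError(f"{name} must be {expected.__name__}")
    return func(**args)


# A regular error: the wrong type is caught automatically.
try:
    dispatch(withdraw, {"account": "alice", "amount": "ten"})
    rejected = False
except TypeError:
    rejected = True

# A malicious request: well-typed, but it violates the unencoded invariant.
new_balance = dispatch(withdraw, {"account": "alice", "amount": -1000000})

print(rejected, new_balance)
```

A richer type (say, a bounded positive integer) would close this particular
hole, but as noted above, some invariant usually gets missed.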
</pre>
Latest revision as of 04:09, 31 January 2023