DistOS 2023W 2023-01-30

From Soma-notes
Latest revision as of 04:09, 31 January 2023

Discussion Questions

  • Why do RPCs require data to be copied?
  • Why is transparency important? Do you think it is still important?
  • How does Sun's RPC compare to that described by Nelson?
  • Why did Sun make its own RPC mechanism?
  • What are alternatives to RPC?
  • Do you think RPCs are good for security? Why or why not?

Notes

Lecture 7
---------


Where did Nelson work? Xerox PARC!
 - PARC was also where Smalltalk, etc. came from

What is the main alternative to RPC?
 - message passing
 - because that's how networks work
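
For contrast with RPC, here is a minimal sketch of explicit message passing. It assumes a JSON-lines message format and uses socket.socketpair() to stand in for two machines; both choices are made up for illustration and do not come from any particular system.

```python
import json
import socket

# Two connected endpoints standing in for two machines (an assumption made
# so the sketch runs in a single process).
client, server = socket.socketpair()

# The client explicitly builds and sends a request message.
request = {"op": "add", "args": [2, 3]}
client.sendall(json.dumps(request).encode() + b"\n")

# The server explicitly receives, decodes, and answers it.
msg = json.loads(server.makefile("rb").readline())
result = sum(msg["args"]) if msg["op"] == "add" else None
server.sendall(json.dumps({"result": result}).encode() + b"\n")

# The client chooses when to read the reply; it could do other work first.
reply = json.loads(client.makefile("rb").readline())
client.close()
server.close()
```

Nothing here looks like a function call: the send and the receive are separate, visible steps, which is how the network actually behaves.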

What is RPC inspired by?
 - procedure calls in ALGOL-like languages


So what is a procedure call?
 - gather arguments
 - transfer control
 - receive return values

Note that when you call a procedure (function), the caller doesn't do anything
until the call finishes
 - turning something that is inherently concurrent into something that is sequential
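
The three steps, and the fact that the caller waits, can be seen in a small Python sketch (the function names are made up for illustration):

```python
trace = []

def callee(x, y):
    trace.append("callee runs")
    return x + y

def caller():
    args = (2, 3)                    # 1. gather arguments
    trace.append("caller calls")
    value = callee(*args)            # 2. transfer control; the caller is suspended here
    trace.append("caller resumes")   # only reached after the callee returns
    return value                     # 3. the return value has been received

result = caller()
```

The trace is strictly ordered: the caller records nothing between the call and the return.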

Why do procedure calls make sense locally?
 - code for a single-threaded process is running on only one core
   (used to be, there was only one core total)
 - so there's no problem with the caller waiting while the callee runs, because
   that's how you share the CPU (they take turns)
   (natural way of multiplexing the CPU between different parts of a program)
 - also, caller and callee share an address space, instruction set, data formats, etc., so it is easy to share arguments and pass data around
    - names for code and data are obvious (memory addresses)

So when we move to a network, do the assumptions behind local procedure calls hold?
 - not really
 - note that there are always at least two CPUs, and the caller's CPU has nothing to do while the callee's CPU is working (other than unrelated tasks)
 - and no shared memory, so you have to copy data between systems that may be
   fundamentally different
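
A rough sketch of what "copying data" means in practice: arguments are flattened into bytes in an agreed layout and rebuilt on the other side. The wire format below is hypothetical (it is not XDR, the representation Sun RPC uses); it only assumes both ends agree on big-endian byte order and field sizes.

```python
import struct

# Hypothetical wire format: a 4-byte signed int, an 8-byte float, then a
# length-prefixed UTF-8 string, all in network (big-endian) byte order.
HEADER = "!idI"

def marshal(count: int, ratio: float, name: str) -> bytes:
    raw = name.encode("utf-8")
    return struct.pack(HEADER, count, ratio, len(raw)) + raw

def unmarshal(data: bytes):
    size = struct.calcsize(HEADER)
    count, ratio, length = struct.unpack(HEADER, data[:size])
    name = data[size:size + length].decode("utf-8")
    return count, ratio, name

wire = marshal(7, 0.5, "nelson")
values = unmarshal(wire)
```

A pointer could not be sent this way at all: a memory address means nothing in the receiver's address space, so whatever it points to has to be copied too.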

So why do RPC?
 - programmer convenience/easy to use
 - that's what "transparency" is: the programmer doesn't have to know they are
   calling code on another system
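
A minimal sketch of transparency, assuming a made-up JSON request format and an in-process stand-in for the network (a real stub would use a socket). All names here are hypothetical.

```python
import json

# Server side: the real implementation, imagined to live on another machine.
def _remote_add(a, b):
    return a + b

_SERVER_PROCEDURES = {"add": _remote_add}

def _server_handle(request_bytes: bytes) -> bytes:
    request = json.loads(request_bytes)
    result = _SERVER_PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

# Stand-in for the network: a real stub would send the bytes over a socket.
def _transport(request_bytes: bytes) -> bytes:
    return _server_handle(request_bytes)

# Client stub: same signature as a local function, but it marshals the
# arguments, ships them off, waits, and unmarshals the reply.
def add(a, b):
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()
    reply = _transport(request)
    return json.loads(reply)["result"]

total = add(2, 3)  # the call site looks exactly like a local call
```

The convenience is real, but so is the cost: nothing at the call site hints at latency, partial failure, or an untrusted peer.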

SOAP (XML-based RPC), gRPC (which uses protobuf for serialization), GraphQL
 - why so many implementations?
 - why don't we just use what Sun built?

Did you notice how Sun's RPC mechanism worked at a network level?
 - note that in practice nobody used DES for SunRPC...because of export controls
 - but there is a fundamental aspect of SunRPC that doesn't play well with modern networks (the portmapper)
   - a client first connects to the portmap daemon, which tells it which port a service lives on; the client then connects to the service on that port
   - this isn't just firewall-unfriendly: firewalls were built to block portmap-based RPC
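
The two-step lookup can be modeled in a few lines. This is an in-memory toy, not the real portmap protocol: the program numbers and ports for the portmapper (100000 on port 111) and NFS (100003 on port 2049) are the conventional ones, while the mountd port and the firewall model are assumptions made for illustration.

```python
PORTMAP_PORT = 111

registrations = {
    100000: PORTMAP_PORT,  # the portmapper itself
    100003: 2049,          # nfs
    100005: 35721,         # mountd, assigned at startup (made-up value)
}

def portmap_getport(program: int) -> int:
    """Step 1: ask the portmapper where a program listens (0 = not registered)."""
    return registrations.get(program, 0)

def firewall_allows(port: int, open_ports: set) -> bool:
    """A static, port-based firewall only knows a fixed list of ports."""
    return port in open_ports

def call_service(program: int, open_ports: set) -> str:
    if not firewall_allows(PORTMAP_PORT, open_ports):
        return "blocked at portmapper"
    port = portmap_getport(program)
    if port == 0:
        return "program not registered"
    # Step 2: connect to whatever port the portmapper returned.
    if not firewall_allows(port, open_ports):
        return f"blocked at port {port}"
    return f"connected on port {port}"
```

With open ports {111, 2049}, NFS connects but mountd is blocked, because its port is not known in advance. A port-based rule set cannot express "allow this service", only "allow this port".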

From the beginning, RPC-based services were a horror from a security perspective
 - is this incidental or inherent to RPC?

I say inherent because RPC is meant to be transparent to programmers
 - so how does the programmer know they have to enforce security?
 - we don't enforce security on every function call!

Security is all about maintaining good boundaries between systems
 - RPC works against this goal

How do you define an interface using RPC?
 - you just say what functions should be made available over RPC
 - even if those functions don't sanitize their inputs
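
A sketch of how little it takes to export a function, and why that is a problem. The decorator, the dispatcher, and the /srv/files directory are all hypothetical; no real RPC framework is being quoted.

```python
import posixpath

EXPORTED = {}

def rpc_export(func):
    """Hypothetical decorator: registering a function is the whole interface definition."""
    EXPORTED[func.__name__] = func
    return func

@rpc_export
def resolve_path(filename):
    # Written with trusted local callers in mind: it never checks for "..".
    return posixpath.normpath(posixpath.join("/srv/files", filename))

def dispatch(proc, args):
    """What an RPC server does with a request: look up the name and call it."""
    return EXPORTED[proc](*args)

friendly = dispatch("resolve_path", ["report.txt"])
hostile = dispatch("resolve_path", ["../../etc/passwd"])
```

The friendly call stays under /srv/files; the hostile one resolves to /etc/passwd. Exporting the function changed who can call it, but nothing about the function changed.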

We go through these cycles
 - new ways to do RPC
 - new ways to block RPC

Whenever you see a "firewall friendly" way of doing RPC, it just means that it bypasses the firewall's security boundary without regard to security

A key way of securing RPC mechanisms (beyond encryption, authentication, and access control) is to use rich type systems to encode all the invariants that code assumes about its inputs (so input validation can be done automatically)
 - but this is insufficient in practice because you always miss invariants, or
   some are just hard to encode in the form of types
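
A small sketch of both halves of this point, assuming a made-up TransferRequest type whose __post_init__ checks play the role of the type system. It catches malformed requests automatically, but an invariant nobody encoded slips through.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransferRequest:
    """Hypothetical request type; __post_init__ stands in for the type system."""
    from_account: str
    to_account: str
    amount_cents: int

    def __post_init__(self):
        if not isinstance(self.from_account, str) or not isinstance(self.to_account, str):
            raise TypeError("account ids must be strings")
        if isinstance(self.amount_cents, bool) or not isinstance(self.amount_cents, int):
            raise TypeError("amount_cents must be an int")
        if self.amount_cents <= 0:
            raise ValueError("amount_cents must be positive")

def decode(fields: dict) -> TransferRequest:
    """Automatic validation: every incoming request must build a TransferRequest."""
    return TransferRequest(**fields)

def rejected(fields: dict) -> bool:
    try:
        decode(fields)
    except (TypeError, ValueError):
        return True
    return False
```

A string amount, a negative amount, or a missing field is rejected. A transfer from an account to itself, or from an account the caller does not own, is accepted, because neither invariant was written down as a type.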

RPCs are just bad at doing input validation against a malicious party
 (but are good at stopping regular errors)