DistOS 2014W Lecture 3

From Soma-notes

The Early Internet (Jan. 14)

Questions to consider:

  1. What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?
  2. What sort of resources were shared? What resources are shared today?
  3. What network architecture did they envision? Do we still have the same architecture?
  4. What surprised you about this paper?
  5. What was unclear?

Group 1

Discussion

The video was mostly a summary of Kahn's paper. One example outlined was distributing processing across different zones of air traffic control. Back then, a "distributed OS" meant something different from what we normally think of now, because when the paper was written, many people would be remotely logging into a single machine. This type of infrastructure is very much like the cloud infrastructure that we talk about and see today.

The Alto paper referenced Kahn's paper, and the Alto designers had the foresight to see that networks such as ARPANET would be necessary. However, there are still some questions that come up in discussion, such as:

  • Would it be useful to have a co-processor responsible for maintaining shared resources even today? Would this be like the IMPs of ARPANET?

Today, computers are usually so fast that it doesn't really seem to matter. This is still interesting to ruminate on, though.

Questions

What were the purposes envisioned for computer networks?

The main purposes envisioned were:

  • Big computation
  • Storage
  • Resource sharing

Essentially, being able to "have a library on a hard disk".

How do those compare with the uses they are put to today?

Today, those goals have largely been achieved, but we mostly see communication-based uses such as instant messaging and email.

What sort of resources were shared?

The main resources being shared were databases and CPU time.

What resources are shared today?

Storage is the main resource being shared today.

What network architecture did they envision?

The network architecture would make use of packet-switching, with a checksum and an acknowledgement on each packet. The IMPs served as both the network interfaces and the routers.

Do we still have the same architecture?

Although packet-switching definitely won, we do not have exactly the same architecture now: IP carries only a header checksum and no acknowledgements, whereas TCP provides an end-to-end checksum and acknowledgements. Kahn went on to learn from the errors of ARPANET in designing TCP/IP. Also, the jobs of the network interface and the router have since been decoupled.
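As a concrete illustration of the end-to-end checksum idea mentioned above, here is a minimal sketch of the ones'-complement Internet checksum (RFC 1071), which both the IPv4 header and TCP use. This is illustrative code, not something from the papers:

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words (RFC 1071), as used by
    the IPv4 header checksum and TCP's end-to-end checksum."""
    if len(data) % 2:            # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # 16-bit big-endian word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF                         # complement of the sum

# A sender stores the complement; a receiver that sums the data
# *including* the checksum field gets 0 when nothing was corrupted.
packet = b"\x45\x00\x00\x1c"
csum = internet_checksum(packet)
print(hex(csum))  # → 0xbae3
```

The ones'-complement fold is what lets the receiver verify a packet with the same routine it uses to generate the checksum.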

What surprised you about this paper?

Everything about this paper was surprising. How were they able to do all of this? A network interface card and router were the size of a fridge! Some general things of note:

  • High-level languages
  • Bootstrapping protocols, bootstrapping applications
  • Primitive computers
  • Desktop publishing
  • The logistics of running a cable from one university to another
  • How old the idea of distributed operating system is

What was unclear?

Many of the more technical specifications were unclear, but we mostly skipped over those.

Group 2

What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?

The main purpose of early networks was resource sharing. Abstractions were used for transmission, and message reliability was a by-product. The underlying idea remains the same today.

Specialized hardware/software and information sharing, a superset of simple resource sharing.

The ad-hoc reliable transmission was essentially TCP without saying so, and it remains largely unchanged today.

What sort of resources were shared? What resources are shared today?

What network architecture did they envision? Do we still have the same architecture?

What surprised you about this paper?

What was unclear?

Group 3

What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?

The purposes envisioned for computer networks were:

  • Improving reliability of services, due to redundant resource sets
  • Resource sharing
  • Usage modes:
    • Users can use a remote terminal, from a remote office or home, to access those resources.
    • Would allow centralization of resources, to improve ease of management and do away with inefficiencies
  • Allow specialization of various sites, rather than each site trying to do it all
  • Distributed simulations (notably air traffic control)

Information-sharing is still relevant today, especially in research and large simulations. Remote access has mostly devolved into a specialized need.

What sort of resources were shared? What resources are shared today?

The main resources being shared were computing resources (especially expensive mainframes) and data sets.

What network architecture did they envision? Do we still have the same architecture?

They envisioned a primitive layered architecture with dedicated routing functions. Some of the various topologies were:

  • star
  • loop
  • bus

It was also primarily packet-switched (or message-switched). Circuit-switching was too expensive and had long setup times, while packet-switching did not require committing resources in advance. There was also primitive flow control and buffering, but no congestion control.

This network architecture predated proper congestion control, such as Van Jacobson's slow start. The routing was either ad-hoc or based on something similar to RIP. They anticipated elephant flows, but mice (small flows) would have latency issues. Unlike the modern Internet, there was error control and retransmission at every hop.
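To make the slow-start idea mentioned above concrete, here is an illustrative sketch (not any real stack's implementation) of how a congestion window grows: exponentially per round-trip during slow start, then linearly once it passes the `ssthresh` threshold:

```python
def cwnd_trace(rounds: int, ssthresh: int = 16, cwnd: int = 1) -> list[int]:
    """Illustrative trace of Van Jacobson's slow start followed by
    congestion avoidance (cwnd in segments, one entry per RTT)."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd *= 2    # slow start: exponential growth per RTT
        else:
            cwnd += 1    # congestion avoidance: linear growth per RTT
    return trace

print(cwnd_trace(8))  # → [1, 2, 4, 8, 16, 17, 18, 19]
```

Early ARPANET-era hosts had no such mechanism; the sender's rate was limited only by flow control and buffering, which is why congestion collapse later became a problem.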

The architecture today is similar, but the link layer is very different: use of Ethernet and ATM. The modern Internet is a collection of autonomous systems, not a single network. Routing propagation is now large-scale and semi-automated (e.g., BGP externally, IS-IS and OSPF internally).

What surprised you about this paper?


What was unclear?

The weird packet format: page 1400 (4 of the PDF): “Node 6, discovering the message is for itself, replaces the destination address by the source address”

Group 4

What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?

Networks were envisioned as providing remote access to other computers, because useful resources such as computing power, large databases, and non-portable software were local to a particular computer, not themselves shared over the network.

Today, we use networks mostly for sharing data, although with services like Amazon AWS, we're starting to share computing resources again. We're also moving to support collaboration (e.g. Google Docs, GitHub, etc.).

What sort of resources were shared? What resources are shared today?

Computing power was the key resource being shared; today, it's access to data. (See above.)

What network architecture did they envision? Do we still have the same architecture?

Surprisingly, yes: modern networks have substantially similar architectures to the ones described in these papers.

Packet-switched networks are now ubiquitous. We no longer bother with circuit-switching even for telephony, in contrast to the assumption that non-network data would continue to use the circuit-switched common-carrier network.

What surprised you about this paper?

We were surprised by the accuracy of the predictions given how early the paper was written — even things like electronic banking. Also surprising were technological advances since the paper was written, such as data transfer speeds (we have networks that are faster than the integrated bus in the Alto), and the predicted resolution requirements (which we are nowhere near meeting). The amount of detail in the description of the 'mouse pointing device' was interesting too.

What was unclear?

Nothing significant; we're looking at these with the benefit of hindsight.

Summary of the discussion from lecture

Anil's view is that even today we can think of computer networks as primarily a resource-sharing platform. For example, when we access the web or search Google, we are making use of the resource sharing facilitated by the Internet (a network of interconnected computer networks). It is not possible to put 20,000 computers in our basements; instead, the Internet facilitates access to computing power and databases built out of hundreds of thousands of computers. In fact, Google and other popular search engines have a local copy of the entire web in their data centers: a centralized copy of a large distributed system, which is a somewhat contradictory phenomenon if you think about it in terms of the design goals of distributed systems.

Another important takeaway from the discussion was that the first player to market with a solution to a niche problem, especially one based on simple mechanisms rather than complex ones, tends to get adopted faster. The classic example is the Internet: ARPANET was an academic research project that was simple, open, and first of its kind, so it was adopted widely and evolved into the Internet as we see it today. This approach is not without drawbacks. Security, for instance, was not factored into the design of ARPANET, since it was intended to be a network between trusted parties, which was fine at the time; but once ARPANET evolved into the Internet, security became an area requiring major focus. In Silicon Valley the focus is likewise on being the first player in a niche market, and to meet that objective simple frameworks and mechanisms are often used. In doing so there is a possibility of leaving out components that turn out to be vital missing links; a recent example is the security flaw in Snapchat that led to user data being exposed.