== Group Discussion on "The Early Web" ==

Questions to discuss:

# How do you think the web might have turned out if it had not developed the way it did?
# What kind of infrastructure changes would you like to make?

=== Group 1 ===
: Relatively satisfied with the present structure of the web. Suggested changes fall in the following areas:
* Make greater use of the potential of protocols.
* More communication and interaction capabilities.
* Changes to how present payment systems are implemented, for example the use of "micro-computation" (a discussion we will return to in future classes) and cryptographic currencies.
* Augmented reality.
* More emphasis on individual privacy.

=== Group 2 ===
A large portion of the web serves content that is overwhelmingly concerned with presentation rather than with structuring content. Tim Berners-Lee himself bemoaned the death of the semantic web.

* Information should be classified in detail
** Organize things on the web. Ex: Yahoo indexers
** Consider the need for a Universal Decimal Classification, an idea of Paul Otlet's.
** In the end it comes down to the semantic web
* Information redundancy
* Information verification

=== Group 3 ===
* What we want to keep
** Linking mechanisms
** Minimum permissions to publish
* What we don't like
** Relying on a single source for a document
** Privacy links for security
* Proposal
** Peer-to-peer and distributed mechanisms for documents
** Reverse links with caching (a distributed cache)
** More availability for the user: what happens when a system fails?
** Key management needs to be considered: is a centralized or a distributed mechanism better?

=== Group 4 ===
* The idea of the web searching for us
* A suggestion that the web would be different had it been implemented by "AI" people
** AI programs searching for data, a notion that Google is already slowly implementing.
* Generate report forums
* An HTML equivalent inspired by AI communication
* Higher semantics, beyond just indexing the data
** Problem: "How to bridge the semantic gap?"
** Search for more data patterns

= Group design exercise — The web that could be =

* “The web that wasn't” mentioned the moans of librarians.
* A universal classification system is needed.
* The training overhead of classifiers (e.g., librarians) is high. See the master's that a librarian would need.
* More structured content, in both classification and organization
* Current indexing works by crude brute-force searching for words, etc., rather than by searching metadata
* Information doesn't have the same persistence; see bitrot and Vint Cerf's talk.
* Too concerned with presentation now.
* Tim Berners-Lee bemoaning the death of the semantic web.
* The problem of information duplication when information gets redistributed across the web. However, we do want redundancy.
* Too much developed by software developers
* Too reliant on Google for web structure
** See search-engine optimization
* Problem of authentication (of the information, not the presenter)
** Too dependent at times on the popularity of a site, almost in a sophistic manner.
** See Reddit
* How do you programmatically distinguish satire from fact
* The web's structure is also “shaped by inbound links but would be nice a bit more”
* Infrastructure doesn't need to change per se.
** The distributed architecture should still stay. Centralization of control of allowed information and access is terrible power. See China and the Middle-East.
** Information, for the most part, in itself, exists centrally (as per-page), though communities (to use a generic term) are distributed.
* Need more sophisticated natural language processing.

= Class discussion =

Focusing on vision, not the mechanism.

* Reverse linking
* Distributed content distribution (glorified cache)
** Both for privacy and redundancy reasons
** Centralized content certification was suggested, but that doesn't address the problem of root of trust and distributed consistency checking.
*** Distributed key management is a holy grail
*** What about detecting large-scale subversion attempts, like in China
* What is the new revenue model?
** What was TBL's revenue model (tongue-in-cheek, none)?
** Organisations like Google monetized the internet, and this mechanism could destroy their ability to do so.
* Search work is semi-distributed. Suggested letting the web do the work for you.
* Trying to structure content in a manner simultaneously palatable to both humans and machines.
* Using spare CPU time on servers for natural language processing (or other AI) of cached or locally available resources.
* Imagine a smushed Wolfram Alpha, Google, Wikipedia, and Watson, and then distributed over the net.
* The document was TBL's idea of the atom of content, whereas nowadays we really need something more granular.
* We want to extract higher-level semantics.
* Google may not be pure keyword search anymore. It is essentially AI now, but we still struggle with expressing what we want to Google.
* What about the adversarial aspect of content hosters, vying for attention?
* People do actively try to fool you.
* Compare to Google News, though that is very specific to that domain. Their vision is a semantic web, but they are incrementally building it.
* In a scary fashion, Google is one of the central points of failure of the web. Even scarier are less technically competent people who depend on Facebook for that.
* There is a semantic gap between how we express and query information, and how AI understands it.
* Can think of Facebook as a distributed human search infrastructure.
* A core service of an operating system is locating information. '''Search is infrastructure.'''
* The problem is not purely technical. There are political and social aspects.
** Searching for a file on a local filesystem should have an unambiguous answer.
** Asking the web is a different thing. “What is the best chocolate bar?”
* Is the web a network database, as understood in COMP 3005, which we consider harmful?
* For two-way links, there is the problem of restructuring data and all the dependencies.
* Privacy issues when tracing paths across the web.
* What about the problem of information revocation?
* Need more augmented reality and distributed and micro payment systems.
* We need distributed, mutually untrusting social networks.
** Now we have the problem of storage and computation, but we also take away some of the monetizable aspect.
* Distribution is not free. It is very expensive in very funny ways.
* The dream of harvesting all the computational power of the internet is not new.
** Startups have come and gone many times over that problem.
* Google's indexers understand many documents on the web quite well. However, Google only '''presents''' a primitive keyword-like interface. It doesn't expose the ontology.
* Organising information does not necessarily mean applying an ontology to it.
* The organisational methods we now use don't use ontologies, but rather are supplemented by them.
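The reverse linking raised in the discussion can be illustrated with a minimal sketch. It assumes a crawler has already produced a map from each page to its outgoing links; the page names and the <code>build_backlinks</code> helper are hypothetical and not part of any system discussed in class.

```python
from collections import defaultdict


def build_backlinks(forward_links):
    """Invert a {page: [targets]} map into {target: sorted pages linking to it}."""
    backlinks = defaultdict(set)
    for source, targets in forward_links.items():
        for target in targets:
            backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}


# Hypothetical three-page web; the names are made up for illustration.
forward_links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": [],
}
backlinks = build_backlinks(forward_links)
print(backlinks["c.html"])  # pages that link to c.html
```

Even in this toy form, renaming or removing a page means touching every entry that mentions it, which is the restructuring and dependency problem noted for two-way links.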

DistOS 2014W Lecture 6

2014-01-24T02:12:37Z

Alp: Convert previous content to more idiomatic markup (see pun on semantic web)

DistOS 2014W Lecture 6

2014-01-24T02:11:07Z

Alp: Snippet from my notes.


DistOS 2014W Lecture 5

2014-01-23T12:36:10Z

Alp: Initial raw note dump, redacted

= Introduction =
; Operating system
: The software that turns the computer you have into the one you want (Anil)

* What sort of computer did we want to have?
* What sort of abstractions did they want to be easy? Hard?
* What could we build with the internet (not just WAN, but also LAN)?
* Most dreams people had of their computers smacked into the wall of reality.

= MOAD review in groups =

* Chorded keyboard unfortunately obscure, partly because the attendees disagreed with the long-term investment of training the user.
* View control → hyperlinking system, but in a lightweight (more like nanoweight) markup language.
* Ad-hoc ticketing system
* Ad-hoc messaging system
** Used on a time-sharing system with shared storage
* Primitive revision control system
* Different vocabulary:
** Bug and bug smear (mouse and trail)
** Point rather than click

= Class review =

* Doug died Jul 2 2013
* Doug himself called it an “online system”, rather than offline composition of code using card punchers as was common in the day.
* What became of the tech:
** Chorded keyboards:
*** Exist but obscure
** Pre-ARPANET network:
*** Time-sharing mainframe
*** 13 workstations
*** Telephone and television circuit
** Mouse
*** “I sometimes apologize for calling it a mouse”
** Collaborative document editing integrated with screen sharing
** Videoconferencing
*** Part of the vision, but more for the demo at the time
** Hyperlinks
*** The web on a mainframe
** Languages
*** Metalanguages
**** “Part and parcel of their entire vision of augmenting human intelligence.”
**** You must teach the computer about the language you are using.
**** They were the use case. It was almost designed more for augmenting programmer intelligence rather than human intelligence.
*** It was normal for the time to build new languages (domain-specific) for new systems. Nowadays, we standardize on one but develop large APIs, at the expense of conciseness. We look for short-term benefits; we minimize programmer effort.
*** Compiler compiler
** Freeze-pane
** Folding—Zoomable UI (ZUI)
*** Lots of systems do it, but not the default
*** Much easier to just present everything.
** Technologies that required further investment got left behind.
* The NLS had little to no security
** There was a minimal notion of a user
** There was a utopian aspect. Meanwhile, the Mac had no utopian aspect. Data exchange was through floppies. Any network was small, local, ad-hoc, and among trusted peers.
** The system wasn't envisioned to scale up to masses of people who didn't trust each other.
** How do you enforce secrecy?
* Part of the reason for lack of adoption of some of the tech was hardware. We can posit that a bigger reason would be infrastructure.
* Differentiate usability of system from usability of vision
** What was missing was the polish, the ‘sexiness’, and the intuitiveness of later systems like the Apple II and the Lisa.
** The usability of the later Alto is still less than commercial systems.
*** The word processor was modal, which is apt to confuse unmotivated and untrained users.
* In the context of the Mother of All Demos, the Alto doesn't seem entirely revolutionary. Xerox PARC raided Doug's team. They almost had a GUI; rather, they had what we today call a virtual console, with a few things above.
* What happens with visionaries that present a big vision is that the spectators latch onto specific aspects.
* To be comfortable with not adopting the vision, one must ostracize the visionary. People pay attention to things that fit into their world view.
* Use cases of networking have changed little, though the means did
* Fundamentally a resource-sharing system; everything is shared, unlike later systems where you would need to share explicitly. The resources shared were ones that fundamentally made sense to share: documents, printers, etc.
* Resource sharing was never enough. '''Information-sharing''' was the focus.

= Alto review =

* Fundamentally a personal computer
* Applications:
** Drawing program with curves and arcs
** Hardware design tools (mostly logic boards)
** Time server
* Less designed for reading than the NLS. More designed around paper. Xerox had a laser printer, and you would read what you printed. Hypertext was deprioritized, unlike in the NLS vision, which had focused on what could not be expressed on paper.
* Xerox had almost an obsession with making documents print beautifully.

DistOS 2014W Lecture 3

2014-01-14T15:47:56Z

Alp: /* Unclear portions */

Questions to consider:
* What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?
* What sort of resources were shared? What resources are shared today?
* What network architecture did they envision? Do we still have the same architecture?
* What surprised you about this paper?
* What was unclear?

==Group 1==
* video was mostly a summary of Kahn's paper
* process migration through different zones of air traffic control
* "distributed OS" meant something different from what we normally think of: many people would log in remotely to a single machine, much like the cloud infrastructure we talk about today
* the Alto paper references Kahn's paper, and the Alto designers had the foresight to see that networks like the ARPANET would be necessary
* would it be useful, even today, to have a co-processor responsible for maintaining shared resources, like the IMPs of the ARPANET? Today, computers are usually so fast that it doesn't really matter.

=== Questions ===

* What were the purposes envisioned for computer networks?
** big computation, storage, resource sharing - "having a library on a hard disk"

* How do those compare with the uses they are put to today?
** those things are being done, but mostly communication like instant messaging, email

* What sort of resources were shared?
** databases, CPU time

* What resources are shared today?
** mostly storage

* What network architecture did they envision?
** they had a checksum and an acknowledgement on each packet
** the IMPs were the network interface and the routers
** packet-switching

* Do we still have the same architecture?
** packet-switching definitely won
** no: IP now neither checksums the payload nor acknowledges packets (IPv4 checksums only its own header), but TCP has an end-to-end checksum and acknowledgement
** Kahn went on to learn from the errors of the ARPANET to design TCP/IP
** the jobs of network interface and router have been decoupled

* What surprised you about this paper?
** everything
** how they were able to do this
** a network interface card and router was the size of a fridge
** high-level languages
** bootstrap protocol, bootstrapping an application
** primitive computers
** desktop publishing
** the logistics of running a cable from one university to another
** how old the idea of distributed operating systems is

* What was unclear?
** much of the more technical specifications, but we mostly skipped over those
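To make the end-to-end checksum mentioned above concrete, here is a minimal sketch of the 16-bit ones' complement checksum described in RFC 1071, the kind TCP uses. It is an illustration only: real TCP also covers a pseudo-header, and the payload below is made up.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Hypothetical even-length payload; the sender appends the checksum.
payload = b"hello world!"
checksum = internet_checksum(payload)
packet = payload + checksum.to_bytes(2, "big")

# The receiver recomputes over payload plus checksum; zero means no error detected.
received_ok = internet_checksum(packet) == 0
corrupted_ok = internet_checksum(b"j" + packet[1:]) == 0
print(received_ok, corrupted_ok)
```

In the ARPANET design discussed here, a check of this kind happened on every hop between IMPs; in TCP/IP it is done once, between the two end hosts.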

==Group 2==
1. The main purpose of early networks was resource sharing. Abstraction for transmission. Message reliability was a by-product. The underlying idea is the same.

2. Specialized hardware/software and information sharing: a superset of sharing.

3. Ad-hoc routing; it was TCP without saying so. Largely unchanged today.

==Group 3==
===Envisioned computer network purposes===
* Improving reliability of services, due to redundant resource sets
* Resource sharing
* Usage modes:
** Users can use a remote terminal, from a remote office or home, to access those resources.
** Would allow centralization of resources, to improve ease of management and do away with inefficiencies
* Allow specialization of various sites, rather than each site trying to do it all
* Distributed simulations (notably air traffic control)

Information-sharing is still relevant today, especially in research and large simulations. Remote access has mostly devolved into a specialized need.

===Resources shared===
* Computing resources (especially expensive mainframes)
* Data sets

===Network architecture===
* A primitive layered architecture
* Dedicated routing functions
* Various topologies:
** star
** loop
** bus
* Primarily (packet|message)-switched
** Circuit-switching too expensive and has large setup times
** Doesn't require committing resources
* Primitive flow control and buffering
* Predates proper congestion control such as Van Jacobson's slow start
* Ad-hoc routing or based on something similar to RIP
* Anticipation of elephants and mice latency issues
* Unlike modern internet, error control and retransmission at every step

The architecture today is similar, but the link layer is very different: use of Ethernet and ATM. The modern internet is a collection of autonomous systems, not a single network. Routing propagation is now large-scale and semi-automated (e.g., BGP externally, IS-IS and OSPF internally).
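The "something similar to RIP" routing noted above is distance-vector routing: each node repeatedly merges its neighbours' cost tables into its own. The sketch below is a simplified, synchronous illustration under assumed link costs; it is not the ARPANET algorithm itself, and it omits what real RIP adds (periodic updates, timeouts, a hop limit, split horizon). The topology and the <code>distance_vector</code> helper are hypothetical.

```python
INF = float("inf")


def distance_vector(links):
    """links: {node: {neighbour: cost}} -> {node: {dest: (cost, next_hop)}}."""
    nodes = list(links)
    table = {n: {n: (0, n)} for n in nodes}
    changed = True
    while changed:  # repeat until no table improves
        changed = False
        for node in nodes:
            for neighbour, link_cost in links[node].items():
                # node "receives" the neighbour's current vector
                for dest, (cost, _) in list(table[neighbour].items()):
                    candidate = link_cost + cost
                    if candidate < table[node].get(dest, (INF, None))[0]:
                        table[node][dest] = (candidate, neighbour)
                        changed = True
    return table


# Hypothetical three-node network with symmetric positive link costs.
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
routes = distance_vector(links)
print(routes["A"]["C"])  # cost and next hop from A to C
```

Here A learns that going through B is cheaper than its direct link to C, without any node ever seeing the whole topology.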

===Surprising aspects===

===Unclear portions===
* Weird packet format: Page 1400 (4 of PDF): “Node 6, discovering the message is for itself, replaces the destination address by the source address”.

==Group 4==

* What were the purposes envisioned for computer networks? How do those compare with the uses they are put to today?

Networks were envisioned as providing remote access to other computers, because useful resources such as computing power, large databases, and non-portable software were local to a particular computer, not themselves shared over the network.

Today, we use networks mostly for sharing data, although with services like Amazon AWS, we're starting to share computing resources again. We're also moving to support collaboration (e.g. Google Docs, GitHub, etc.).

* What sort of resources were shared? What resources are shared today?

Computing power was the key resource being shared; today, it's access to data. (See above.)

* What network architecture did they envision? Do we still have the same architecture?

Surprisingly, yes: modern networks have substantially similar architectures to the ones described in these papers.
Packet-switched networks are now ubiquitous. We no longer bother with circuit-switching even for telephony, in contrast to the assumption that non-network data would continue to use the circuit-switched common-carrier network.

* What surprised you about this paper?

We were surprised by the accuracy of the predictions given how early the paper was written. Also surprising were technological advances since the paper was written, such as data transfer speeds (we have networks that are faster than the integrated bus in the Alto), and the predicted resolution requirements (which we are nowhere near meeting). The amount of detail in the description of the 'mouse pointing device' was interesting too.

* What was unclear?

Nothing significant; we're looking at these with the benefit of hindsight.

DistOS 2014W Lecture 3

2014-01-14T15:43:13Z

Alp: /* Group 3 */
