DistOS-2011W User Controlled Bandwidth: How Social Protocols Affect Network Protocols and Our Need for Speed

From Soma-notes

Abstract

In the past 20 years, advancements in computing have gravitated towards connectivity, specifically the rise of the internet. The last decade (2000–2010) saw a global increase in the number of internet users of over 400%, and as expansion into the Middle East and Africa continues, the total number of people connected to the internet will only grow.[1] As we continue to build computer interfaces that simplify the user experience (iOS, for example), more of the population can make use of computers without knowledge of how a computer works. In particular, terms like bandwidth or latency mean little to the average computer user in terms of the actual performance of their computer.

This paper will review what distributed computing means to an average user, as well as how we can make better use of our networks by providing users with simple-to-use tools for understanding bandwidth and how different applications consume this finite resource. We will begin with an introduction describing typical internet traffic and how users typically interact with the internet. Following the introduction is a look into the need for user-controlled bandwidth and the benefits it affords. Next we will examine a particular case study in which households were given a tool to monitor and adjust their bandwidth usage. The subsequent section will examine some current tools for implementing user-controlled bandwidth. The final section of the paper provides a discussion of the current state of user-controlled bandwidth and what direction it may take in the future.

Introduction

To understand user-controlled bandwidth, it is important to understand how internet traffic “flows”, as well as trends in data movement and user interactions with the internet. A typical internet user does not care about concepts such as packets; they simply run applications, like web browsers, completely unaware of the size and duration of their connection to the internet. Most internet traffic is based on either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP). UDP is not a guaranteed-delivery protocol: it simply sends packets, regardless of congestion in the network, and does not provide any confirmation of receipt to the sender. TCP, alternatively, provides delivery analogous to a stream. TCP has a feedback mechanism to control the sending rate: it sends data as fast as it can until there is congestion, at which point it scales back the rate, and it also confirms delivery of data.[2]
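The TCP feedback loop described above can be illustrated with a toy simulation (a simplified sketch of slow start and congestion avoidance, not any particular TCP implementation): the sending window grows until it exceeds the link's capacity, then backs off.

```python
def simulate_tcp_cwnd(capacity, rounds):
    """Toy model of TCP's feedback loop: the congestion window (cwnd)
    grows exponentially in slow start, linearly in congestion avoidance,
    and backs off when it exceeds the link capacity (a stand-in for loss).
    Returns the cwnd observed in each round."""
    cwnd, ssthresh, history = 1, capacity, []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > capacity:              # congestion detected: back off
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh
        elif cwnd < ssthresh:            # slow start: exponential growth
            cwnd *= 2
        else:                            # congestion avoidance: linear growth
            cwnd += 1
    return history
```

Running this with a capacity of 32 packets produces the familiar sawtooth: the window ramps up quickly, overshoots, halves, then probes linearly.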

Due to its congestion control and guaranteed-delivery properties, TCP is the delivery method used most frequently on the internet.[3] This makes sense for connections that last a long time (greater than 15 minutes) and/or transmit a lot of data, as it allows the congestion control mechanism to provide some form of “fair” sharing of the available bandwidth. However, analysis of actual internet traffic reveals a usage landscape that is not dominated by large, long connections. Most internet streams are very short in duration: 98% last less than 15 minutes, and almost half of those last less than 2 seconds.[4] This means that the majority of streams on the internet are not actually connected long enough to take advantage of TCP’s congestion control mechanisms.

The remaining 2% of TCP streams, on the other hand, last longer than 15 minutes but account for approximately 50% to 60% of all bytes transmitted. This presents a transmission environment in which we need to consider streams both temporally and spatially, as these two dimensions have very different effects on the network.[4] It becomes clear why internet traffic follows these patterns when analysis focuses on how users typically interact with the internet. Although the number of internet users has been growing exponentially, and the number of web pages on the internet has similarly grown exponentially, typical usage of the internet is fairly conservative. The average user will connect to the internet multiple times, but each session is typically short in duration. While connected, users will typically visit a small, well-known group of sites as opposed to exploring new web pages. In fact, users will typically visit a single website multiple times in a single session.[5]
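The temporal/spatial split described above, many short flows versus a few long, byte-heavy ones, can be illustrated with a small hypothetical helper (the function name and the synthetic data below are illustrative, not from the cited study) that classifies flows by duration and reports each group's share of the flow count versus the bytes transferred:

```python
def traffic_profile(flows, long_cutoff_s=15 * 60):
    """Split (duration_s, bytes) flow records at a duration cutoff and
    report what fraction of flows are long, and what fraction of all
    bytes those long flows carry."""
    total_bytes = sum(b for _, b in flows) or 1
    long_flows = [(d, b) for d, b in flows if d >= long_cutoff_s]
    return {
        "long_flow_fraction": len(long_flows) / len(flows),
        "long_byte_fraction": sum(b for _, b in long_flows) / total_bytes,
    }
```

With a synthetic mix of 98 short flows and 2 hour-long bulk transfers, the long flows are 2% of the count yet carry well over half the bytes, mirroring the pattern the traffic analysis reports.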


User Controlled Bandwidth

Is Bandwidth the Correct Measurement for Network Speed?

The average internet user is not concerned with bandwidth or latency; he/she is only interested in how long it will take to complete a task. Typical tasks may include loading a web page, downloading a file, watching a video clip or holding a conversation over voice over IP (VoIP). Yet when we purchase an internet package from an internet service provider (ISP), we only get to choose by bandwidth allotment. By marketing internet connections with terms that a network technician would find useful, but that are useless to an average user, ISPs leave users without a conceptual model for how their internet usage affects their connection “speed”.

Dukkipati and McKeown provide an interesting alternative in Why Flow-Completion Time is the Right Metric for Congestion Control. A user is not concerned with full utilization of their bandwidth, network fairness or throughput; all they care about is how quickly a task may be completed. Dukkipati and McKeown propose measuring connections based on Flow Completion Time (FCT), which could allow for improvements to the way data is transmitted. As mentioned earlier, typical data streams on the internet do not last long enough to make full use of TCP’s congestion control mechanisms, such as “slow start”. Even worse, because of TCP’s slow-start mechanism, many of these short flows have their lifetimes prolonged by multiple orders of magnitude of round-trip times (RTTs), causing unnecessary congestion.[6]

To combat these issues with TCP, Dukkipati and McKeown suggest adopting a Rate Control Protocol (RCP), which would simulate processor sharing at each router. Under such a protocol, each flow would be assigned a single rate, and every flow would have the same rate. This rate would be dynamic, depending on how many flows are present and the input traffic rate. The benefit of RCP is that the router no longer needs to maintain per-flow state or perform per-packet calculations. Furthermore, since there would be no “slow start”, the short streams (which make up 98% of all streams) would no longer have their lifetimes unnecessarily extended.[6]
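As a rough sketch, loosely modeled on the RCP idea rather than the exact update rule in Dukkipati and McKeown's paper (the parameter names and gain constants here are illustrative), a router could periodically recompute one shared per-flow rate from aggregate measurements alone, with no per-flow state:

```python
def rcp_rate(link_capacity_bps, measured_input_bps, queue_bytes, rtt_s,
             est_num_flows, prev_rate, alpha=0.1, beta=1.0):
    """One RCP-style rate update: nudge the single shared per-flow rate
    up when there is spare capacity and down when a queue has built up.
    Only aggregate quantities are used; no per-flow bookkeeping."""
    spare = link_capacity_bps - measured_input_bps   # unused capacity
    queue_drain = queue_bytes * 8 / rtt_s            # rate needed to drain queue
    adjustment = (alpha * spare - beta * queue_drain) / est_num_flows
    return max(prev_rate + adjustment, 1.0)          # never drop to zero
```

Because every flow starts sending at the advertised rate immediately, a new short flow skips the slow-start ramp entirely, which is the mechanism behind RCP's FCT improvement for the 98% of short streams.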

The issue with changing network connection descriptions from bandwidth to Flow Completion Time is that it is very difficult to guarantee an FCT. However, this may not be such an issue, as ISPs routinely advertise bandwidth amounts that are theoretical or that ignore the effect of others sharing the network. Thus, current ISP descriptions of network “speeds” based on bandwidth may be no more accurate than they would be if based on FCT. The benefit of basing them on FCT, however, is that users would be able to understand their “expected” rate when it is expressed as a time rather than a data-size limit.
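A back-of-the-envelope comparison (a sketch, not a calculation from the paper) shows why FCT favors short flows under a rate-based scheme: slow start spends roughly log2(N) round trips ramping up for an N-packet flow, while an assigned rate is usable from the very first RTT.

```python
import math

def fct_slow_start(flow_pkts, rtt_s):
    """Approximate FCT under TCP slow start: the window doubles each
    round (1, 2, 4, ...), so an N-packet flow needs ~log2(N) rounds."""
    rounds, cwnd, sent = 0, 1, 0
    while sent < flow_pkts:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds * rtt_s

def fct_rate_based(flow_pkts, pkts_per_rtt, rtt_s):
    """Idealized RCP-like transfer: the flow sends at its assigned rate
    from the first RTT, so FCT is simply size divided by rate."""
    return math.ceil(flow_pkts / pkts_per_rtt) * rtt_s
```

For example, a 15-packet flow over a 100 ms path takes four round trips (400 ms) under slow start, but a single round trip (100 ms) if it can send all 15 packets at an assigned rate from the start.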

How Can Users Gain Control?

If we are unable to shift the focus of network speed measurement from bandwidth to a completion-time-related measurement, the question becomes: how can networks be presented such that average users will be able to understand the limits of bandwidth? This is an important question not only for general, home-based internet use but also in the scientific community. In the scientific realm, users have a better understanding of what bandwidth is and how it relates to their distributed computing experience, but they need some method to set up and control guaranteed-bandwidth connections for data transfer. These e-science experiments transfer massive amounts of critical data, and bandwidth guarantees are required, as well as proper scheduling to ensure connections are utilized as fully as possible.[8]

To support these experiments, scientists are turning to solutions that allow them to set up dedicated channels with guaranteed bandwidth allowances and lifetimes. These solutions provide the scientists with an interface to register channels with various parameters, and the program will set up the connection and add it to a list of monitored connections across the network. As bandwidth needs change, the scientists can request changes to the channels' parameters, and the tool will accommodate them accordingly.[7] In this manner, connections with known minimum parameters can be established such that a bound on completion time is known.

The programs that support the dynamic creation and maintenance of the connections are implemented similarly to the way an operating system handles processes. Users make requests to a central component that knows the details of all connections. Based on the request parameters (bandwidth, duration of connection, time of day to create, etc.), the tool schedules the connections using a scheduling algorithm. The details of all connections are also known to the users, so they can make their requests based on the known connection schedule. Thus, if a certain experiment requires a larger share of bandwidth than is available in one slot, the scientists may opt to set up a connection during a low-use time to obtain the extra bandwidth. If delaying the creation of the connection to a low-use period is not an option, then the scientists need to negotiate bandwidth needs with their colleagues to obtain the necessary resources. In this manner, when bandwidth bottlenecks occur, it is not the scheduling program that resolves the issue, but rather the social interactions between the users of the network.[7][8]
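A minimal sketch of such an admission check (hypothetical; the cited tools use more sophisticated scheduling algorithms) verifies that a requested channel fits alongside existing reservations at every instant where they overlap:

```python
def can_reserve(reservations, start, end, bandwidth, link_capacity):
    """Check whether a new guaranteed-bandwidth channel fits: at every
    event point in [start, end), the bandwidth of overlapping existing
    reservations plus the new request must stay within link capacity.
    Reservations are dicts with "start", "end", and "bw" keys."""
    points = sorted({start, end,
                     *[t for r in reservations for t in (r["start"], r["end"])]})
    for t in points:
        if not (start <= t < end):
            continue  # event point outside the requested window
        in_use = sum(r["bw"] for r in reservations if r["start"] <= t < r["end"])
        if in_use + bandwidth > link_capacity:
            return False
    return True
```

When this check fails, a real scheduler could suggest the nearest low-use slot; as the paper notes, the final resolution is often social, with users negotiating among themselves rather than the tool deciding.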

This model can be simplified into a general use case for typical internet users in a home environment. For an average user to understand how different types of usage demand differing levels of bandwidth, users need a tool that presents a comprehensive view of their usage and allows them to set which applications/users get “guaranteed” levels of bandwidth. With such a tool, users who do not understand bandwidth would be able to see visual data and form a mental model of bandwidth, and thus exercise control, or set up social protocols among the users of the network, to limit bottlenecks and promote more efficient use of the network.

Case Study: Home Watcher

A tool that gives home internet users a way to view and manage bandwidth consumption is the focus of Who’s Hogging the Bandwidth?: The Consequences of Revealing the Invisible in the Home (Chetty et al.). As more appliances become connected to the internet, managing digital resources like bandwidth is necessary, but first the users must understand how bandwidth is consumed. In Who’s Hogging the Bandwidth?, the research team set up a tool called Home Watcher in several homes with various social make-ups (working parents with teenage children, a house of university students, etc.). The Home Watcher tool provided the users with a visual display of the bandwidth usage of each device connected to the network, expressed as a percentage of the total available bandwidth. The tool also allowed users to limit the bandwidth usage of any device connected to the network, down to a minimum level of 20%.[9] The focus of the study was to examine how providing visibility into bandwidth consumption would alter the usage patterns of household members, as well as how it would affect the knowledge level of the study participants. The study results report that bandwidth management is a complicated socio-technical issue.

Before installing the Home Watcher tool, the researchers interviewed the participants and found multiple concerns, such as:

* People who felt they might be hogging the bandwidth usually felt guilty about how their usage might be affecting other users. At the same time, they felt victimized, as they were just trying to use the internet like everyone else and could not control how much bandwidth they consumed.
* People generally had good ideas about why their internet connections were slow at times (multiple users streaming data, too many people all using the resource, congestion caused by other customers of the same ISP, hardware issues in their own computers, etc.), but they had no way to see or validate any of these claims. Thus, some relationships were strained by “hogging” accusations.
* Most users wanted to see bandwidth usage graphically so that they could check whether their ISP was “shaping” their traffic, or whether some users in the house were in fact hogging larger portions of the bandwidth.[9]

After installing the Home Watcher tool and allowing the households to use it for a few weeks, the researchers again interviewed the participants and analyzed the usage statistics. They found that making bandwidth consumption visible allowed household members to use social interactions to partition the finite bandwidth resource appropriately.[9] For example, household members were willing to change their usage based on knowledge of what levels of bandwidth others required: if playing a game takes up a lot of bandwidth, someone may be willing to play at a different time than when another user needs the bandwidth for work-related activities. Furthermore, increasing the visibility of bandwidth usage actually increased household members’ understanding of what bandwidth is and how it is a shared resource. By the end of the study period, even those who were not the “technical” people of the house felt comfortable engaging in discussions of how bandwidth was being divided. The visual display of bandwidth consumption allowed participants to form a mental model of bandwidth.[9]

An interesting side effect of making bandwidth usage visible was that household members were concerned the usage patterns displayed on the Home Watcher device would cast a negative light on users’ activities. For example, the bandwidth spike from watching a work-related video could be misinterpreted as recreational viewing. This example illustrates how a digital resource like bandwidth can, in certain situations, become part of a person’s identity. Thus, social interactions would help resolve bandwidth-sharing issues the same way social interactions clear up misunderstandings between people.[9]

In addition to the association of bandwidth usage with personal identities, other complex etiquettes emerged regarding when it was acceptable to limit a person’s bandwidth. In many households it was deemed unacceptable to limit bandwidth when someone was on a VoIP call or watching a video, as this would immediately degrade the quality of the service. On the other hand, limiting bandwidth during a file download was perfectly acceptable, as this merely delayed the completion time of the download. These etiquettes reflect the capability of social interactions to facilitate appropriate partitioning of network bandwidth.[9]
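A Home Watcher-style display could be backed by a structure as simple as the following sketch (entirely hypothetical; only the 20% floor is taken from the study's described minimum limit): each device's share of observed traffic, plus any user-set cap clamped to that floor.

```python
def usage_shares(device_bytes, caps=None, floor=0.20):
    """Sketch of a per-device bandwidth view: compute each device's
    share of total observed traffic, and clamp any user-requested cap
    to the minimum floor (20% in the Home Watcher study)."""
    total = sum(device_bytes.values()) or 1
    caps = caps or {}
    view = {}
    for dev, nbytes in device_bytes.items():
        cap = caps.get(dev)
        view[dev] = {
            "share": nbytes / total,
            "cap": None if cap is None else max(cap, floor),
        }
    return view
```

The point of such a view is not enforcement precision but legibility: percentages of a shared whole are what let non-technical household members reason and negotiate about the resource.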

Tools for Bandwidth Management

Discussion

References