A link to the paper: Difference between revisions
AbdelRahman (talk | contribs) |
AbdelRahman (talk | contribs) |
||
Line 177: | Line 177: | ||
For starters: Jurisdiction. This frame work assumes the presence of a globally trustful entity(s) (e.g., government). This entity should act as the Internet's law enforcement, which will be deemed as the primary inspector and also the jurisdiction for regulating all kinds of cyber crimes and misbehavior. This entity may either be centralized or distributed. A centralized entity would be easier to deploy, but it will suffer from a single point of failure. A distributed entity would obviously perform better as it will be able to scale with the growth of the system users as well as conform to diverse regional laws, regulations, customs and traditions. | For starters: Jurisdiction. This frame work assumes the presence of a globally trustful entity(s) (e.g., government). This entity should act as the Internet's law enforcement, which will be deemed as the primary inspector and also the jurisdiction for regulating all kinds of cyber crimes and misbehavior. This entity may either be centralized or distributed. A centralized entity would be easier to deploy, but it will suffer from a single point of failure. A distributed entity would obviously perform better as it will be able to scale with the growth of the system users as well as conform to diverse regional laws, regulations, customs and traditions. | ||
Secondly: GDDB. We assume that a GDDB is deployed, which acts as a "database" for storing ISs. Symmetric key encryption should be used to protect that system as it will only be accessed through two types of users. Routers, which should be able to access this database ONLY for read operations. And the trustful entity (defined in the previous assumption) which should be able to access for read/write operations. Both users must be strictly authenticated, for being able to decrypt the contents or to append. In addition, this distributed system must guarantee almost zero latency in the read operations as it will be heavily relied on for every single hop made by a packet at the Internet intermediate system. A standardization protocol would be required to define the syntax and semantics as well as the nature of the way | Secondly: GDDB. We assume that a GDDB is deployed, which acts as a "database" for storing ISs. Symmetric key encryption should be used to protect that system as it will only be accessed through two types of users. Routers, which should be able to access this database ONLY for read operations. And the trustful entity (defined in the previous assumption) which should be able to access for read/write operations. Both users must be strictly authenticated, for being able to decrypt the contents or to append. In addition, this distributed system must guarantee almost zero latency in the read operations as it will be heavily relied on for every single hop made by a packet at the Internet intermediate system. A standardization protocol would be required to define the syntax and semantics as well as the nature of the way that the GDDB subsystems would communicate with. | ||
Thirdly: Ownership. We assume that every Md should be officially owned by a human. This owner is deemed as the official responsible for that Md, and who would also be accused if his Md found to misbehave or to launch malicious packets. The owning relationship between persons and machines is one-to-many. That is to say, a person can officially own one or more machines but a machine can only be owned by one person. | Thirdly: Ownership. We assume that every Md should be officially owned by a human. This owner is deemed as the official responsible for that Md, and who would also be accused if his Md found to misbehave or to launch malicious packets. The owning relationship between persons and machines is one-to-many. That is to say, a person can officially own one or more machines but a machine can only be owned by one person. |
Revision as of 07:36, 11 April 2011
Title
Proposed titles:
- Requirements for Attribution on the Internet
- Internet Attribution: Between Privacy and Cruciality
Abstract
Present and past situations show a need for improved attribution systems, and arguably, scientific basis for properly functioning attribution systems are not yet defined. Lots of research have been focusing on attributing documents to authors for the sake of securing authorship rights and rapid identification of plagiarism. Many of those were revolving around the notion of using machine learning for linking articles to humans. Others proposed text classification and feature selection as a mean of detecting the author of a document. Unfortunately, not that much research is addressing the problem of lack of robust attribution system over the internet. Authentication, as a mean of attribution, has proved its efficiency but, needless to say, it is not applicable to authenticate every single packet hopping over the intermediate systems. This paper presents limits and advances in the attribution of actions to agents over the internet. It reviews current attribution technologies as well as the limits of those technologies. It also identifies the requirements of a proper attribution system and proposes a distributed (yet cooperative) approach for performing attribution over the internet.
Introduction
Currently the Internet infrastructure provides users partial anonymity. Unfortunately, that anonymity weakens the security for its users, because it incites advance users to exploit that feature. The lack of online identification married with bad intentions entices criminals to commit a number of Cyber Crimes without being caught, crimes which include: fraud, theft, forgery, impersonation, the distribution of Malware (and hence, botnets), traffic tampering, DoS, bandwidth hogging, etc. Consequently, internet attribution is a highly sensitive field that constitutes a cornerstone position within internet security. Needless to say, current solutions don't guarantee sufficient attribution nor are considered always applicable in most of the time, hence, current system suffers the lack of a relatively robust attribution mechanism. In the light of this context, we need better methodologies for reaching an acceptable success level for attributing actions to persons.
In principle, attribution can be defined as the mechanism of binding a system-defined act to an agent. An agent is typically an entity that has the ability to commit what constitutes an act. Within our focus, an agent could either be a person or a machine. It can also be defined as "determining the identity or location of an attacker or an attacker’s intermediary"<ref> [Institute for Defense Analyses, 2003]</ref>. Problems like IP address spoofing, lack of interoperability in intermediate systems, dynamic nature of IP addresses, unawareness of system users with lots of unknown packets sneaking to their machines and poor efficiency of firewalls and IDSs make this determination operation considerably difficult. In addition, some types of attacks are carried out to conceal the real agent behind an act. For instance, malware distribution (and hence the creation of botnets), and stepping stones aim to inflict vagueness around the correct human source behind the scene.
In this paper, we focus on defining what it takes to achieve an acceptably working attribution mechanism over the internet. To do that, we review past research works in attribution and discuss their common limitations as well as flaws and what can be done in common to enhance such schemes. We also argue that the lack of a globally deployed registration system that registers system users and grants them LICENSED access to the system enfeebles proper attribution and motivates illegitimate intrusions and irregular behavior. We show that employing the mentioned system would reduce the incentive of irregular behavior as well as remove the blaze of tempting anonymity, putting attackers under the risk of being easily caught. We also discuss how privacy, as a counter force to attribution, plays a big role in the internet and within its users and propose a framework that achieves relatively robust attribution mechanism and retains the privacy of users.
Much of the research done in literature focuses on attribution that is done for keeping track of authorship, i.e., attributing text to authors. In this paper, we don't question the cruciality of attribution in this field, but rather we address a higher level of attribution of all possible actions to agents, which is sadly deemed slightly obsolete from the current research perspective.
This paper starts by a quick discussion on the dilemma of attribution, resolving the tension between attribution and privacy. Consequently, section 3 argues about the reasons behind the essentiality of implementing proper attribution systems. Section 4 presents a fundamental set of requirements for achieving an acceptable level of attribution over the internet as well as proposes an abstract framework for achieving attribution. In section 5, a review on the currently implemented systems that achieve attribution is presented as well as flaws and points of failure of the surveyed papers. In section 6, the reasons behind the difficulty of achieving a proper attribution system. And finally, a conclusion is presented in section 7.
What is Attribution
The act of attributing, especially the act of establishing a particular person as the creator of a work of art.<ref> The American Heritage® Dictionary of the English Language, Fourth Edition copyright ©2000 by Houghton Mifflin Company. Updated in 2009. Published by Houghton Mifflin Company.</ref>
We are concerned with one particular type of attribution - binding an act to a person. This may include intermediate attributions, for example, an act to an agent (software, device, etc.) and then attribution of an agent to person. Narrowing the problem further, we're only concerned about attribution in large, dynamic networks, like internet. For sake of simplicity, in this paper we're going to reference to "binding an act to a person on the internet" as "attribution", while other types of attributing will be defined separately.
Problem Statement
The anonymity of the Internet makes it virtually impossible most of the time to properly identify who is who online. This is a double-ended sword as it not only provides a high level of privacy, but also makes it hard to identify people with malicious intent and cyber attackers.
Problem Motivation
In todays world there grows a strong need for attribution over the Internet mainly due to increased numbers of cyber attacks since its introduction in the 90’s. Many attackers have succeeded in causing both physical and financial damage to many companies over the Internet and going Scott free; due to the anonymity of the Internet, the attackers cannot be identified.
Scope
In this paper we are addressing the issue of attribution by providing a list of requirements that need to be met in order to have a fully stable and efficient attribution system over the Internet.
Although this is not a technical paper, in order to fully understand some of the concepts and terminology within this paper, a small knowledge of computer science or computer systems will be required.
Background
The problem of attribution is not one that just came up; it has been around fro decades but mostly to address identification issues as it pertained to websites or Internet service providers. A lot of different approaches towards attribution have been taken but mainly just to the extent of what that particular system stems to achieve. This section gives an introduction to three of todays current attribution systems and discusses their pros and cons as they pertain to the type of global attribution we discuss in this paper.
Cookies
Websites will sometimes need to remember information about a visit or a visitor in order to improve viewer experience. Cookies are text files that are created by a web server, and stored by a web browser on the users computer. Cookies are used for many reasons; mainly authentication, remembering shopping cart information, and storing site preference. In actuality, they can be used to store any type of information that can be stored in a text file. When a page is requested the users web browser sends the request with the webserver’s cookie in the header part of the packet. All this is an automated process between the web browser and web server.
If at all the webserver receives a request without a cookie attached to it, it takes it as the first access that the browser is making to the server and sends one as part of the response which will be saved by the browser and resent on the next request. Cookies are usually encrypted in order for data security and information privacy, however they are still subject to the users control as they can be decrypted, modified and even deleted completely. It is also possible for a user to change their browser setting to not accept cookies at all.
Cookies can either have an expiration date or not; this is the date that the browser deletes the cookie. Cookies without an expiration date are deleted when the browser is closed. Some browsers allow you to automatically set how long you want cookies to be stored.
Cookies as an Attribution System
Looking at cookies as the type of attribution system we are looking for over the Internet, we will be able to achieve high precision on identifying computers that access a web server. However, The biggest draw back cookies have is that they can be deleted and manipulated. As such the use of cookies is not an effective attribution system.
IP Addresses
IP or Internet Protocol Addresses are 32 bit numerical identifiers for devices (ie computer, printer, scanner etc..) on a network. The users Internet Service Provider (ISP) provide this number. The Internet Assigned Numbers Authority (IANA) is responsible for managing IP Address space allocation globally. It does this with the help of five Regional Internet registries (RIRs), responsible for allocating IP address blocks to their assigned regions ISPs, who in turn allocates them to their users.
Any device that goes online and communicates using IP needs an IP address. Over the years there has been a growing number of users going online and a number of devices owned by the users to go online. One of the more common examples of this is the increase of Internet ready mobile phones. Our current addressing system used by our current Internet Protocol Version 4 (IPv4) contains only 32bits which means it is only able to uniquely address 232 addresses (4,294,967,296) which is less that the number of people on this planet today. The very last batch of IP addresses was assigned out to the five RIRs early February 2011 [1]. This was foreseen since the 90s, which sprung the development of a new Internet Protocol version, IPv6, which uses a 128bit addressing system began.
IP addresses can either be static or dynamic. Static IP address is an address permanently assigned to a user due to configuration. A dynamic IP address is one in which a new address is assigned at every boot up. A Dynamic Host Configuration Protocol (DHCP) Server is usually responsible for assigning dynamic IP addresses to users. There are two main advantages for dynamic addressing; it eliminates the administrative cost involved with assigning static IP addresses, and it helps solve the issue of limited addressing space by allowing many devices “share” a single address if they go online at different times. Given the limited addressing space the ISPs have to work with, and to save administration costs, most ISPs assign dynamic IP addresses as standard a offer static IP address for a higher fee.
IP Addresses as an Attribution System
Although Internet addresses can be used to attribute packets to its sender, it will fail as an effective attribution system for a few reasons, but mainly that attackers can spoof their IP addresses. Spoofing IP addresses will even foil the efforts of IP trace backs.
Authentication Systems
In order for a website to make sure of the identity of who ever is visiting some pages, it provides an authentication system. This is usually a login name and password either assigned by the web server or chosen by the user. The biggest advantage this has is that now, attribution can be performed across different computers. The task of storing and securing login information is left to the web server, which is subject to attackers hacking into the server to stealing login information.
Login systems are attached to user accounts that sometime require private information in order to be setup. If the web server’s security is not good enough, security breeches may in turn lead to identity theft.
The process behind authentication systems is simple; using a typical web banking authentication system for instance, the process may go as follows. A user requests for a web account or one is automatically assigned to the user. The user sets up a password for accessing the account. When the user now goes to the website he is requested to “identify himself”, the user enters in his personal login information, the web server verifies this information with what it has stored in its database and either grants or denies access to the users personal page.
Authentication systems are only ever used when it involves the users wanting some privacy on the webserver, or when the user wishes to store some form of information on the web server.
Authentication Systems as an Attribution System
Authentication systems are very precise in the identification of people over the Internet and as such used by many companies. However it will have a serious privacy drawback if it were to be used as a global identification system. It will mean that virtually every web server will need to hold enough information about you to be able to identify you as an attacker. This would mean that to even a user randomly search for a cooking recipe online would need to login somehow to access the web server. People generally like the anonymity of surfing the web and a system like this will completely destroy this.
The attribution dilemma
Designing an attribution system is not a trivial task, because, regardless of technologies and/or infrastructure available, one needs to consider controversial question of balancing between strong attribution and privacy. This hypothetical line between attribution and privacy is not straight, and crucially depends on application. For instance, large financial institutions as well as its clients are interested in strong attribution system, which would solve many authorization and authentication problems, as well as will guarantee (to some degree) that agents of transactions are who they claim they are. On the other hand, political dissidents and whistle-blowers do exist primarily because there is no 100% effective attribution system in place and it is possible for them to distribute information (regardless of actual usefulness or goodness of it) and keep their identity secret. It is clear that single universal set of rules cannot satisfy these two cases. It is also clear that, in pretty abstract fashion, privacy is inversely proportional to attribution. While designing an attribution system one needs not only to decide on this ration for some particular case, but rather make this ratio dynamically changed depending on the case.
Assuming this ratio is found, another question is when to decide to use private information to track or punish a person, as to directly intrude their privacy? One might think that this question is a little bit out of the scope of our paper. This is true, however, these and a lot of less obviously related questions should be answered prior to designing, because in such an important thing as protection and privacy, designing of solution should not make too many assumptions and should guarantee something not only to operators of the system, but for users as well. In other words, even though system should be dynamic and adaptable to all potential use cases, it should remain universal to some extent and guarantee some law-related and moral principles.
(here go other questions. will show connection to requirements)
- While designing an attribution system one needs to consider balancing between attribution and privacy.
- Sometimes non-attribution is very crucial,to protect political dissidents and whistle-blowers
- When to decide to track a person and when not to (so as not to intrude privacy)?
- How to make sure attribution is properly achieved?
- Who should attribute who/what and why?
- How far can we trust IP-traceback, stepping stone authentications, link identifications and packet filtering in wedging packets to agents?
- How much can intermediate systems' cooperation contribute to achieving attribution?
- Should there be consequences upon attributing an action(s) to an agent? What are they? (punishment, rewarding, etc)
- How to deal with misleading data sources hiding behind botnets and concealing identities via stepping stones?
Why do we need Attribution
- For identifying purposes
- Web Banking
- eCommerce
- Web advertisements
- For better protection against cyber attacks:
- DoS and DDos
- Forgery and theft
- Sniffing private traffic
- Distributing illegal content/malware
- Sending spam
- Illegal/undesired intrusion
- For marketing purposes (privacy?)
- custom (client-based) content generation
Why is it difficult to achieve attribution?
The main problem I see is that the way Internet is designed makes it possible and relatively easy to act without compromising identity. Moreover, most current solutions are based on the same structure and work within the same scope, thus, can only reduce the number of potentially destructive acts or just deal with the consequences. Of course, no system can prevent 100% of destructive attempts, but some potentially good attribution system should make such attempts highly undesirable and "costly" for an attacker.
- The issue of lack of attribution on the web mostly arises whenever security is compromised. When you're bombarded with spam, or when a system is under a DoS attack attribution becomes a more appealing notion. Getting a balance between security and privacy is tricky, because once attacks are tracked so will all other traffic.
- Depending on the type of sender and receiver, different attribution policy will be requested.
In the ideal world, every action on the internet could be bound to a machine and thus to a person. This is done by examining the source IP printed on each moving packet, locating the geographical location of this IP, consulting the ISP covering the location and identifying the person. If an act requires strict attribution (like checking and sending emails), authentication is used. Here is what goes wrong:
- IP addresses can be spoofed and hence, misleads the geographical location.
- For avoiding that problem, IP traceback can be performed BUT it requires global cooperation of intermediate systems... it is not there!
- IPs are not permanently bound to personnel, so figuring out the person from the IP is not concrete.
- Network users are not aware of all packets sneaking to their machines, which allows for malware distribution and hence, the creation of botnets... misleading attribution!
- Firewalls and packet filters can be used for avoiding that problem, but they are not 100% efficient.
- It is not applicable to authenticate every single action on the internet.
Attacks to prevent correct attribution of actions
- Stepping stone attack: a common way of attributing attacks to anonymity by using multiple public random agents (as stepping stones) to reach the victim in order to conceal the attacking source. <ref name="ref1">S. Staniford-Chen and L. T. Heberlein. Holding intruders accountable on the internet. In SP ’95: Proceedings of the 1995 IEEE Symposium on Security and Privacy, page 39, Washington, DC, USA, 1995. IEEE Computer Society.</ref>
- Forgery
- Identity theft (impersonation)
- Distribution of malware
Requirements for internet attribution system
It is hard to describe some hypothetical attribution system in detail, because there are many issues and complicated dependencies, and a lot of questions to answer or at least to try to answer before one can even think of implementing such a system. In this section we are trying to define high-level requirements for a good attribution system, while definition of good attribution system is not so clear, we take into account everything we have talked above. That is, the following requirements try to define the system in a way that avoids current problems, achieves high degree of attribution and remains realistic.
We have separated those requirements in three sections: general requirements define the idea and overall goal of the system in high level, abstract terms. Deployment requirements set ground rules for deployability that makes sense in such a huge network as internet and human society Practice requirements define the way system works, behaves and interacts with other bodies.
General
First and most obvious requirement for any system is usually put for sake of consistency: main requirement for internet attribution system is simple: it needs to attribute, or, more formally, any potentially destructive act should be traceable to an agent (person and/or organization, group, etc). It is important to consider different natures of agents, because the goal of attribution system is not necessarily to narrow down the search to one particular person, but rather to find the body responsible for an act(s) regardless of their actual structure. It might be one person after all. In other words, even though actions are done mostly by a single person, they are not necessarily the ones who's responsible for a decision to do so. A good real world analogy is an assassin and some body (person or a group) paying him. Good attribution system should not lead to assassin alone, but rather should be designed the way that responsible bodies are the ones to be discovered. Yet, we accept the notion that by the end of the day there is some person or several persons, human brings, responsible for an action. It is essential, because, as practice shows, for example, determining the source of DoS-attack is relatively simple, but most of the time this source is not the one who is responsible, but rather a victim itself.
It is easy to imagine a system in which less crime and misuse is the only acceptable way to do things, and many writers and movie directors exploit this idea in futuristic, science fiction and anti-utopian plots. Unfortunately, applying any of this sort of ideas to real world today is not possible, because a lot of laws and moral principles are already in place, some of which are not perfect, but widely accepted and most have reasons to exist. Attribution system that we're looking for should take legal and moral issues into account, naturally, should not violate and/or contradict any of them. This important requirement comes somewhat together with incremental deployability that we're going to discuss later.
In general, an attribution system should be universal and global, and details of these terms will be discussed later.
Deployment
It is relatively easy to just design a system, it is much harder to design a system, deployment of which need not be instant and massive. Even though a global attribution system will have a lot of pressure on it, internet should not depend on it entirely and in case attribution system goes down, underlying network should still remain functional. In other words, attribution system should be loosely coupled to the system it works in.
As discussed before, (and this could be said about any global system on the internet) such a system should be incrementally deployable, so that smooth, step-by-step, subnetwork-by-subnetwork integration is possible. This is important not only because of virtual impossibility to restart or reconfigure the whole internet at once. This incremental way of embedding an attribution system should be more secure (bugs in software and mistakes in design can be fixed while on a small scale), so that by the end of cycle, when the whole internet is wired, the attribution system is field-tested and analyzed several times by different bodies.
Very important, but controversial subject, is adoption of the system within some set of rules or laws (state laws, government regulations, corporate rules and principles, etc.). System should allow easy adoption for different cases, at the same time it should remain universal and global. It should act like a public tool any group can use, but nobody should be able to misuse it or use in non-legal way. The big decision designers will have to make is the one regarding this line between dynamic adoptability and universality. Luckily, this sort of deepness goes beyond the scope of our paper.
Companies and organizations sometimes loose millions of dolars due to attacks and other cyber-crimes done to them, and some issues can be dealt with by spending more resources (memory, bandwidth on servers, etc), or, in other words, spending more money. The overall cost of setting up and maintaining the attribution system for a particular body (person, organization, network) should be considerably less than average losses under current lack of attribution (e.g. DoS, identity theft, etc).
Practice
Attribution mapping should not be a bijection, in other words action should map to person, but not vice versa. That it, nobody should be able to use the system to answer the question "what person X did/does?". Not only the system should not know the answer, it should not be possible to know the answer; the question "who did act X" is the one should be answered. This can be thought of as part of requirement about not violating current laws and moral principles, but is put as a separate requirement, since it is very important to draw a line between attribution system and surveillance. The goal is not only to make attribution system attribute, but also to make it impossible to use it in other way – for surveillance, spying, etc.
Since this global system operates on the internet, it might not be a great idea to put names of persons into traceability database. It makes much more sense to put some unique IDs for any body, who uses the network, and in case a crime committed, or, in general, it is a case where an agent of some act should be determined, recorded ID will be searched for in police or government database. It should be some trusted entity (government, corporation, police, some public good-like system, etc) that stores the mapping between IDs and real names. This mapping should only be revealed when needed and when there is enough evidence or motivation to do so. Traceability information (namely, unique iDs) should be distributed and it is crucial to make it impossible to collect all the information in one place.
Of course, it is not always the case that some trusted (by everyone) body exists, but generally we have governments and/or agencies we trust. It is important to divide the information between public and trusted body in a way allowing them to cooperate in time of need and not allowing to misuse the system from any side.
Proposed Framework
In this section, we will propose a potential framework and argue that it is able to fulfill the requirements listed in the former section. The proposed framework works under the core principle "An act cannot use network resources nor can it be routed if it is anonymously bound". Firstly, we start by defining some terminologies that will be used within the scope of this section:
- Agent (Ag): the human-device pairing that sits on an end system and keeps transmitting/receiving packets.
- Machines/Devices (Md): any piece of hardware that has access capability. It can either be a PDA, a laptop, a notebook, a PC, an Network Interface Card, or even a mere home made chip that can externally communicate wired or wireless to send or receive digital packets.
- Identification Stamp (IS): a series of bits that binds a human unique identification (iris intricate structure or fingerprint) with a unique feature of an Md. So, for an Md like a Network Interface Card, the MAC address would be that feature. This biding is a particular representation for the official owner of the device, and who is deemed the primary responsible for any outgoing packet launched by his owned device. In other words, it is a unique identifier for an Ag.
- Intermediate System Services (ISS): Services provided by intermediate systems (routers). For e.g., routing (main service), error checking, etc.
- Globally Distributed Database (GDDB): a global DNS-like world-wide distributed storage system with an encrypted LUT that has relatively fast retrieval and update capabilities. It will be used to store ISs.
- Licensing: a process of giving the permission to intermediate systems to provide ISS to all packets that are launched from the agent that is requesting the license. This process is simply adding new ISs to the GDDB.
In principle, every leaping packet has a human owner that is either directly or indirectly responsible for it. Directly responsible when he is running an application that sends requests or initiates communication sessions to another end system. E.g., using the client side of applications supporting the protocols: HTTP, FTP, SIP, RTP, VoIP, etc. Indirect responsibility is when a user is running a system in the background that performs external (over the internet) system calls or is automated for periodic communication or automatic response to incoming requests. E.g., system clock synchronization (NTP), or the server side of the protocols: HTTP, FTP, etc. In addition, indirect responsibility also includes the responsibility of all packets launched by lower layer protocols that are being manipulated by higher layer ones. E.g., when a user sends an HTTP request, TCP sends connection initiation packets for handshaking schemes, ICMP packets that aims to seek and identify that status of a specific host, etc.
The scope of this framework only addresses attribution over the internet and not any other "locally" defined networks underlying the IEEE standard definitions of the topologies PAN, LAN, MAN, or a WAN that do not use the global intermediate systems as their underlying infrastructure for packet delivery.
The following sections show the assumptions for this framework to operate, the methodology of its operation, a list of pros, cons and vulnerabilities of the system and wrap up by a discussion on the proposed framework.
Assumptions
For starters: Jurisdiction. This frame work assumes the presence of a globally trustful entity(s) (e.g., government). This entity should act as the Internet's law enforcement, which will be deemed as the primary inspector and also the jurisdiction for regulating all kinds of cyber crimes and misbehavior. This entity may either be centralized or distributed. A centralized entity would be easier to deploy, but it will suffer from a single point of failure. A distributed entity would obviously perform better as it will be able to scale with the growth of the system users as well as conform to diverse regional laws, regulations, customs and traditions.
Secondly: GDDB. We assume that a GDDB is deployed, which acts as a "database" for storing ISs. Symmetric key encryption should be used to protect that system as it will only be accessed through two types of users. Routers, which should be able to access this database ONLY for read operations. And the trustful entity (defined in the previous assumption) which should be able to access for read/write operations. Both users must be strictly authenticated, for being able to decrypt the contents or to append. In addition, this distributed system must guarantee almost zero latency in the read operations as it will be heavily relied on for every single hop made by a packet at the Internet intermediate system. A standardization protocol would be required to define the syntax and semantics as well as the nature of the way that the GDDB subsystems would communicate with.
Thirdly: Ownership. We assume that every Md should be officially owned by a human. This owner is deemed as the official responsible for that Md, and who would also be accused if his Md found to misbehave or to launch malicious packets. The owning relationship between persons and machines is one-to-many. That is to say, a person can officially own one or more machines but a machine can only be owned by one person.
Finally: IP packets. Our proposed frame work assumes that within the frame format of the IP packets, a header is added by the network layer that includes the "identification stamp" of the packet owner. A packet owner is the person PLUS the machine that are together responsible for launching this packet.
Methodology
Basically, this framework works by stalling the propagation of a packet that is either unattributed or forged with a fake identification stamp. A "fake identification stamp" is defined as:
- Either having a false unique chip identifier that refers to an imaginary device.
- Or having a false unique human identifier that refers to an imaginary human.
- Or having a misleading binding of a human to a machine. i.e., claiming that some machine "X" belongs to some human "Y", but in reality, "Y" is not the owner of "X".
Noticeably, the routers, as primary constituents to the intermediate systems, should refrain from routing any data packets that are not fully attributed. As they are the main driving power behind delivering all malicious or benign packets, they should have great responsibility in achieving highly reliable attribution mechanism.
A description of the system, based on the chronological order, is as follows. First, any newly bought machine or even a home made device, must be licensed from the trustful entity. The trustful entity access the globally distributed database of "identification stamps" and adds the new identification stamp of the agent that asked for license. If a device is not licensed (i.e., its "identification stamp" was not inserted to the distributed database), it doesn't not benefit from ISS.
From the intermediate system's perspective, when a router receives a packet, it verifies its IS. This is done by consulting the GDDB and sending it a copy of the IS found on the packet. If a packet founds to be not having an IS, the packet is prevented from benefiting from ISS and is simply dropped. If the GDDB replies with an invalid IS, again, the packet is dropped. If the GDDB replies with a success, this will mean that the packet's printed IS is verified. Thus, the packet benefits from the ISS and gets routed throughout the way.
Pros, Cons and Vulnerabilities
Obviously, the proposed framework's main focus is to ensure that any leaping packet is moving because it is known who does it belong to. If not, it is prevented from locomotion.
- delays and bottlenecks due to licensing system at the routers for consulting the distributed system. - restrictive assumptions (not easily deployable) - different regulative flavors - Custom content generation (not found) - Public PCs (in labs...), bound to whom? - Full awareness of users with their systems + attribution + attack avoidance + attribution not available to anyone + automated. services are either stopped or continued. + avoids attacks: DDoS, DoS, ... + Privacy V Botnets V attack on the distributed system which would cause whole system failure.
Discussion
Obviously, the proposed system mimics the behavior of the real world in law enforcement and tracing criminals. As the reader might think of, current internet attribution in comparison to the real world attribution is considered a failure. An acceptable form of internet attribution would be considered basically acceptable if it, at least, provide as much attribution as the real world does. Consequently, we argue that our proposed frame work would guarantee at least as much precise results as those in the real world.
License should be periodically renewed.
How well does it achieve the requirements.
Conclusion
The human nature refuses any change at the first sight. In 1769, Jonathan Holguinisburg finalized the invention of the first Cugnot Steam Trolly <ref>ckermann, Erik (2001). World History of the Automobile. SAE Press, p.14. [Online]. Available: http://books.google.com/books?id=yLZeQwqNmdgC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false </ref> which became nowadays automobiles. In 1903, car licensing began in North America, that is 134 years after Holguinishburg invention. Licensing started when people began realizing that a car could act as a lethal weapon, which therefore must be approved by the government to be driven by some person, and must also be formally linked to an owner who is considered the primary responsible for it.
Meanwhile, the Internet is passing through the same phase. People would blindly deny, refuse and object such "wicked" system, but later on, Internet licensing will be part of everyone's life, just like their driving license. Needless to say, the Internet is becoming is becoming more crucial to many applications and in the same time more vulnerable to different types of attacks. Obviously, it is being injected in the "blood" of a vast, yet exponentially growing, number of applications which are time and data sensitive, and which don't leave room for cyber crimes, unauthorized intrusion, traffic tampering, bandwidth hogging, etc. In addition, much of the industry and technology based applications are now build over the Internet as their underlying infrastructure, and cannot tolerate being threatened all the time by a completely anonymous person behind the seen seeking the proper moment to strike. Meanwhile, Internet Attribution is no longer an add-on, but an obligation.
In this paper, we have presented some formal definitions of attribution, why is it crucial to attribute, level of attribution would be considered acceptable and where the roots of difficulty lies behind achieving such level. Moreover, we have proposed a background about current attribution systems and a brief discussion about the reason of their survival and their point of failure as well. We also populated a list of requirements that must be fulfilled by any system aiming to acquire Internet attribution. Finally, we proposed a potential framework for a system that has that should fulfills the mentioned requirements and that should have the ability to achieve an acceptable level of Internet attribution. Pros, and Vulnerabilities of the proposed framework are also discussed.
References
<references/>
[1] http://arstechnica.com/tech-policy/news/2011/02/river-of-ipv4-addresses-officially-runs-dry.ars
[2] Wikipedia Website