A link to the paper: Difference between revisions

From Soma-notes
Freetonik (talk | contribs)
 
=Title=
Proposed titles:
* Requirements for Attribution on the Internet
* Internet Attribution: Between Privacy and Cruciality
 
=Abstract=
Present and past situations show a need for improved attribution systems, and arguably the scientific basis for a properly functioning attribution system has not yet been defined. Much research has focused on attributing documents to authors in order to secure authorship rights and rapidly identify plagiarism. Much of that work revolves around using machine learning to link articles to humans; other work proposes text classification and feature selection as a means of detecting the author of a document. Unfortunately, little research addresses the lack of a robust attribution system on the internet. Authentication, as a means of attribution, has proved effective, but it is not practical to authenticate every single packet hopping across intermediate systems. This paper presents limits and advances in the attribution of actions to agents on the internet. It reviews current attribution technologies and their limits. It also identifies the requirements of a proper attribution system and proposes a distributed (yet cooperative) approach to performing attribution on the internet.
 
=Introduction=
Internet users prefer partial anonymity while surfing the internet. Unfortunately, some users exploit that anonymity to commit different types of <i>electronic crimes</i>, including fraud, theft, forgery, impersonation, the distribution of malware (and hence the creation of botnets), traffic tampering, DoS, bandwidth hogging, etc. Consequently, internet attribution is a highly sensitive field and a cornerstone of internet security. Current solutions neither guarantee efficient attribution nor apply in most situations, so the current system lacks a reasonably robust attribution mechanism. In this context, we need better methodologies for reaching an acceptable success rate in attributing actions to persons.
 
In principle, attribution can be defined as the mechanism of binding a system-defined act to an agent. An agent is an entity that is able to commit such an act; within our focus, an agent can be either a person or a machine. Attribution can also be defined as "determining the identity or location of an attacker or an attacker’s intermediary"<ref>Institute for Defense Analyses, 2003.</ref>. Problems such as IP address spoofing, the lack of interoperability among intermediate systems, the dynamic nature of IP addresses, users' unawareness of the many <i>unknown</i> packets reaching their machines, and the limited effectiveness of firewalls and IDSs make this determination considerably difficult. In addition, some types of attack are carried out specifically to conceal the real agent behind an act. For instance, malware distribution (and hence the creation of botnets) and stepping stones aim to obscure the actual <i>human</i> source of an act.
 
In this paper, we focus on defining what it takes to achieve an acceptably working attribution mechanism on the internet. To do that, we review past research on attribution, discuss its common limitations and flaws, and consider what can be done to improve such schemes. We also argue that the lack of a globally deployed registration system, one that registers users and grants them licensed access, weakens attribution and encourages illegitimate intrusions and irregular behavior. We show that employing such a system would reduce the incentive for irregular behavior and remove the temptation of anonymity, putting attackers at risk of being easily caught. We also discuss how privacy, as a counterforce to attribution, plays a major role on the internet and among its users, and we propose a framework that achieves a relatively robust attribution mechanism while retaining users' privacy.
 
Much of the research in the literature focuses on attribution for the purpose of tracking authorship, i.e., attributing text to authors. In this paper, we do not question the importance of attribution in that field; rather, we address the broader problem of attributing all possible actions to agents, which current research has regrettably neglected.
 
This paper starts with a brief discussion of the attribution dilemma, that is, the tension between attribution and privacy. Section 3 then argues why implementing proper attribution systems is essential. Section 4 presents a fundamental set of requirements for achieving an acceptable level of attribution on the internet and proposes an abstract framework for achieving it. Section 5 reviews currently implemented attribution systems, along with the flaws and points of failure of the surveyed papers. Section 6 discusses why a proper attribution system is difficult to achieve. Finally, section 7 presents a conclusion.
 
==What is Attribution==
''The act of attributing, especially the act of establishing a particular person as the creator of a work of art.''<ref> The American Heritage® Dictionary of the English Language, Fourth Edition copyright ©2000 by Houghton Mifflin Company. Updated in 2009. Published by Houghton Mifflin Company.</ref>
 
We are concerned with one particular type of attribution: binding an act to a person. This may include intermediate attributions, for example, attributing an act to an agent (software, device, etc.) and then attributing that agent to a person. Narrowing the problem further, we are only concerned with attribution in large, dynamic networks such as the internet. For the sake of simplicity, in this paper we refer to "binding an act to a person on the internet" as "attribution"; other types of attribution will be defined separately.
 
=Background=
 
==Cookies==
 
==IP Addressing==
 
==Authentication Systems==
 
 
 
=The attribution dilemma=
 
Designing an attribution system is not a trivial task because, regardless of the technologies and infrastructure available, one must consider the controversial question of balancing strong attribution against privacy. The hypothetical line between attribution and privacy is not straight, and it depends crucially on the application. For instance, large financial institutions and their clients are interested in a strong attribution system, which would solve many authorization and authentication problems and would guarantee (to some degree) that the agents of transactions are who they claim to be. On the other hand, political dissidents and whistle-blowers exist primarily because there is no 100% effective attribution system in place, so they can distribute information (regardless of its actual usefulness or merit) and keep their identity secret. Clearly, a single universal set of rules cannot satisfy both cases. It is also clear that, in a rather abstract sense, privacy is inversely proportional to attribution. When designing an attribution system, one needs not only to fix this ratio for a particular case, but to allow it to change dynamically depending on the case.
 
Assuming this ratio is found, another question is when to use private information to track or punish a person, that is, when to intrude directly on their privacy. One might think this question is somewhat outside the scope of our paper. That is true; however, this and many less obviously related questions should be answered before design begins. In a matter as important as protection and privacy, the design of a solution should not make too many assumptions, and it should offer guarantees not only to the operators of the system but to its users as well. In other words, even though the system should be dynamic and adaptable to all potential use cases, it should remain universal to some extent and uphold certain legal and moral principles.
 
(here go other questions. will show connection to requirements)
 
* When designing an attribution system, one needs to balance attribution against privacy.
** Sometimes non-attribution is crucial, for example to protect political dissidents and whistle-blowers.
* When should a person be tracked, and when not (so as not to intrude on privacy)?
* How can we make sure attribution is properly achieved?
* Who should attribute whom or what, and why?
* How far can we trust IP traceback, stepping-stone authentication, link identification and packet filtering to tie packets to agents?
* How much can the cooperation of intermediate systems contribute to achieving attribution?
* Should there be consequences when an action is attributed to an agent? What are they (punishment, reward, etc.)?
* How do we deal with misleading data sources that hide behind botnets and conceal identities via stepping stones?
 
==Why do we need Attribution==
 
* For identification purposes
** Web Banking
** eCommerce
** Web advertisements
 
* For better protection against cyber attacks:
** DoS and DDoS
** Forgery and theft
** Sniffing private traffic
** Distributing illegal content/malware
** Sending spam
** Illegal/undesired intrusion
 
* For marketing purposes (privacy?)
** Custom (client-based) content generation
 
==Why is it difficult to achieve attribution?==
 
The main problem I see is that the way the internet is designed makes it possible, and relatively easy, to act without revealing one's identity. Moreover, most current solutions are based on the same structure and work within the same scope, and thus can only reduce the number of potentially destructive acts or deal with their consequences. Of course, no system can prevent 100% of destructive attempts, but a good attribution system should make such attempts highly undesirable and "costly" for an attacker.
 
*The lack of attribution on the web mostly becomes an issue when security is compromised. When you are bombarded with spam, or when a system is under a DoS attack, attribution becomes a more appealing notion. Striking a balance between security and privacy is tricky, because once attacks are tracked, so is all other traffic.
*Depending on the type of sender and receiver, different attribution policies will be required.
 
In an ideal world, every action on the internet could be bound to a machine and thus to a person. This would be done by examining the source IP address carried by each packet, determining the geographical location of that address, consulting the ISP that covers the location, and identifying the person. If an act requires strict attribution (like checking and sending email), authentication is used. <b>Here is what goes wrong</b>:
* IP addresses can be <b>spoofed</b>, which makes the derived geographical location misleading.
* To avoid that problem, <b>IP traceback</b> can be performed, but it requires global cooperation among intermediate systems, which does not exist.
* IP addresses are <b>not permanently bound</b> to persons, so inferring the person from the address is unreliable.
* Network users are <b>not aware of all packets reaching</b> their machines, which allows malware distribution and hence the creation of botnets, which in turn misleads attribution.
* <b>Firewalls</b> and packet filters can be used to mitigate that problem, but they are not 100% effective.
* It is not practical to <b>authenticate</b> every single action on the internet.
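
The first and third failure modes above can be illustrated with a small sketch. It is not a real attribution tool: the <code>Packet</code> and <code>Lease</code> types, the subscriber names and the lease table are all hypothetical, and the addresses come from documentation ranges. It only shows that a lookup keyed on the claimed source address blames the wrong subscriber when the address is spoofed, and gives a different answer for the same address once the lease has changed hands.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    claimed_src: str  # source address written into the header by the sender
    true_src: str     # real origin; the receiver cannot observe this
    sent_at: int      # hypothetical timestamp in arbitrary units


@dataclass(frozen=True)
class Lease:
    ip: str
    subscriber: str
    start: int
    end: int  # exclusive


def naive_attribute(packet: Packet, leases: List[Lease]) -> Optional[str]:
    """Blame whoever held the claimed source address when the packet was sent."""
    for lease in leases:
        if lease.ip == packet.claimed_src and lease.start <= packet.sent_at < lease.end:
            return lease.subscriber
    return None


# Hypothetical ISP records: the same address is held by two subscribers in turn.
leases = [
    Lease("203.0.113.7", "alice", 0, 100),
    Lease("203.0.113.7", "bob", 100, 200),
    Lease("198.51.100.9", "mallory", 0, 200),
]

honest = Packet("198.51.100.9", "198.51.100.9", 50)
spoofed = Packet("203.0.113.7", "198.51.100.9", 50)   # mallory pretends to be alice
reassigned = Packet("203.0.113.7", "203.0.113.7", 150)

honest_result = naive_attribute(honest, leases)
spoofed_result = naive_attribute(spoofed, leases)
reassigned_result = naive_attribute(reassigned, leases)
```

In this sketch the honest packet is attributed to its real sender, the spoofed packet is attributed to an uninvolved subscriber, and the address alone is ambiguous without an accurate timestamp and lease history.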
 
===Attacks to prevent correct attribution of actions===
 
* Stepping stone attack: a common way of anonymizing an attack by routing it through multiple random public hosts (stepping stones) before it reaches the victim, in order to conceal the attacking source. <ref name="ref1">S. Staniford-Chen and L. T. Heberlein. Holding intruders accountable on the internet. In SP ’95: Proceedings of the 1995 IEEE Symposium on Security and Privacy, page 39, Washington, DC, USA, 1995. IEEE Computer Society.</ref>
* Forgery
** Identity theft (impersonation)
** Distribution of malware
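
Why stepping stones defeat attribution can be sketched as follows. This is an illustrative model only, not a method from the cited work: the host names, the <code>upstream</code> log table and the <code>trace_back</code> function are hypothetical. It assumes each host's logs reveal only the host that connected into it, so a trace advances one hop at a time and stops at the first host whose logs are unavailable.

```python
from typing import Dict, List, Optional, Set

# Hypothetical connection logs: each host knows only who connected into it.
upstream: Dict[str, Optional[str]] = {
    "victim": "stone-3",
    "stone-3": "stone-2",
    "stone-2": "stone-1",
    "stone-1": "attacker",
    "attacker": None,
}


def trace_back(start: str, cooperative: Set[str]) -> List[str]:
    """Follow connection logs hop by hop, stopping where logs are unavailable."""
    path = [start]
    current = start
    # A hop can be followed only if the current host shares its logs.
    while current in cooperative and upstream.get(current) is not None:
        current = upstream[current]
        path.append(current)
    return path


# Every host cooperates: the trace reaches the real source.
full_trace = trace_back("victim", set(upstream))

# Only the victim and the last stepping stone share logs: the trace stalls.
partial_trace = trace_back("victim", {"victim", "stone-3"})
```

With full cooperation the trace ends at the attacker; with partial cooperation it ends at an intermediate stepping stone, which is then the apparent source of the attack.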
 
=Requirements for internet attribution system=
(semi-structured semi-draft)
 
We have decided on basic requirements for a universal attribution system. The requirements are divided into three parts.
 
==General==
This part covers the most fundamental requirements, namely, that an attribution system should attribute, and do so within the law.

* Any potentially destructive act should be traceable to a person (and/or an organization, group, etc.)
* The system should not violate any current privacy-related laws or moral principles
 
 
==Deployment==
It is easy to simply design a system; it is much harder to design one whose deployment need not be instant and massive. A good attribution system should be designed in such a way that it allows the following:

* The attribution system should be incrementally deployable
* The attribution system should be adaptable to different sets of rules and principles (laws of countries, organizations' policies, etc.), yet remain universal
* The cost of setting up and maintaining the system for a particular body (person, organization, network) should be considerably less than that body's average losses under the current lack of attribution (e.g. DoS, identity theft, etc.)
 
 
==Practice==
This part covers the operation of the system itself.

* The attribution mapping should not be a bijection; in other words, actions should map to persons, but not vice versa
* Traceability information should be distributed
* It should be impossible to collect all traceability data in one place
* Personal data should be stored by trusted authorities (e.g. governments)
* Traceability information and personal data should be kept separate, with the connection between them revealed only when needed
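
One way to read the requirements above is sketched below. It is a toy model under stated assumptions, not the proposed system: the classes <code>IdentityAuthority</code> and <code>TraceNode</code>, the <code>warrant</code> flag and the hash-based placement in <code>node_for</code> are all hypothetical. Traceability records hold only a pseudonym and are spread over several stores, the pseudonym-to-person link lives with a separate authority, and lookups go from an act to a person with no index from a person to their acts. The sketch does not enforce the requirement that the data can never be gathered in one place.

```python
import hashlib
import secrets
from typing import Dict, List, Optional


class IdentityAuthority:
    """Hypothetical trusted authority that holds personal data only."""

    def __init__(self) -> None:
        self._people: Dict[str, str] = {}  # pseudonym -> person

    def register(self, person: str) -> str:
        pseudonym = secrets.token_hex(8)  # opaque token, reveals nothing by itself
        self._people[pseudonym] = person
        return pseudonym

    def reveal(self, pseudonym: str, warrant: bool) -> str:
        # The link to a person is disclosed only when explicitly justified.
        if not warrant:
            raise PermissionError("identity is revealed only when needed")
        return self._people[pseudonym]


class TraceNode:
    """Hypothetical store that holds a share of the traceability records."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}  # act id -> pseudonym

    def record(self, act_id: str, pseudonym: str) -> None:
        self._records[act_id] = pseudonym

    def lookup(self, act_id: str) -> Optional[str]:
        return self._records.get(act_id)


def node_for(act_id: str, nodes: List[TraceNode]) -> TraceNode:
    """Pick the node responsible for an act from a hash of its identifier."""
    digest = hashlib.sha256(act_id.encode()).digest()
    return nodes[digest[0] % len(nodes)]


authority = IdentityAuthority()
nodes = [TraceNode() for _ in range(3)]

alias = authority.register("alice")
for act in ("act-1", "act-2", "act-3"):
    node_for(act, nodes).record(act, alias)

# Attribution runs from an act to a person, and only with justification.
found = node_for("act-2", nodes).lookup("act-2")
person = authority.reveal(found, warrant=True)
```

Because several acts share one pseudonym, the mapping from acts to persons is many-to-one rather than a bijection, and neither a trace node nor the authority can attribute an act without the other.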
 
=System Proposals=
 
 
 
=Conclusion=
 
=References=
<references/>

Latest revision as of 19:56, 11 April 2011