DistOS-2011W Reputation

From Soma-notes

Members

  • Waheed Ahmed
  • Trevor Gelowsky
    • MSN: Gelowt@gmail.com
    • E-Mail: tgelowsk@sce.carleton.ca
  • Michael Du Plessis
  • Nicolas Lessard (nick.lessard @t gmail.com / nlessard @t carleton.connect.ca)

Our presentation

Our current presentation can be viewed at the following link: https://docs.google.com/present/edit?id=0ASS7kj9hfc1aZGRiMjMzOHJfNGhnNzhuamRr&hl=en&authkey=CMHi3KAD

Our Paper

Our final paper can be found here: Distributed OS: Winter 2011 Reputation Systems Paper

The problem

  • Emerge vs. Impose reputation on the system?
    • Probably both, how do we account for both systems?
  • Where do you store the data?
  • Where is the data queried from?
  • What defines good/bad reputation?
  • Who provides the good/bad reputation?
  • Who do we trust for this information?
  • Should reputation be mutable? Can we be pardoned, or can reputations be reversed?
  • What entities are able to contribute to reputations?
  • How do we access reputation about entities?
  • Who is authorized to access particular reputations? How much to reveal? (Information flow)

What technologies currently exist?

  • Digital signatures
    • Certificates signed by trusted organizations
  • Black hole - email, spam
  • Google - search reputation
  • Credit bureaus
  • Yellow pages
  • Better business bureau
  • CRC - criminal records

What technologies don't currently exist?

Guaranteeing Authenticity/Public Key Infrastructure

In our paper we must explain why PKI/Authentication fits into reputation. Why must it be handled by both Attribution and Reputation systems?

Problem Domain

This portion of a reputation system answers the core question of how reputation information being exchanged is guaranteed to be authentic.

  • How do we ensure the information exchanged between peers is authentic, and not tampered with?
  • How do we attribute information exchanged?
  • How do we do this in a highly decentralized, distributed system?
  • How can we make sure the information is timely?
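
As a rough illustration of the authenticity and attribution questions above, here is a minimal sketch of signing a reputation report so that a peer can check who issued it and that it was not altered. It uses a Lamport one-time signature only because that scheme needs nothing beyond a hash function from the standard library; it is not what a real PKI deploys (certificates normally carry RSA or ECDSA keys), and each key pair here must sign only one message. The function names and the report format are assumptions made for the example.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Return (private, public): 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    """Reveal one secret per digest bit. A key pair must be used only once."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check every revealed secret against the published hash for that bit."""
    if len(signature) != 256:
        return False
    return all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


# Hypothetical report format: who rated whom, how, and when.
private_key, public_key = generate_keypair()
report = b"node-42 rates node-7 GOOD at 2011-03-10T12:00Z"
signature = sign(private_key, report)
print(verify(public_key, report, signature))
print(verify(public_key, report.replace(b"GOOD", b"BAD"), signature))
```

Including a timestamp in the signed report, as in the example string, is one simple way to let the receiver judge timeliness; binding the public key to an identity is the part a PKI would have to supply.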

Introduction

In the past few years, the Internet has become a platform for a global marketplace, and both business and private users realize that the communication opportunities it provides open the way to a wide spectrum of business and private applications. Today, online users face many problems: vulnerability to viruses and worms, exposure to sniffers and spoofing of their private sessions, and invasion of privacy through the many kinds of spyware that monitor how they behave. Activities on the Internet range from access to information to entertainment, financial services, product services and even socializing. The frequent use of the Internet as an important business tool has led to a major increase in deliberate abuse and criminal activity. Every organization that operates and trades electronically exposes its own information and IT systems to a wide range of security threats. The most common protocols, such as IP, TCP and UDP, are the main targets of potential attackers, largely because IP has no proper mechanism for authenticating incoming data.

To build a secure chain of trust for Internet-based communication, a Public-Key Infrastructure (PKI) is used. It consists of several components: a security policy, a certificate authority, a registration authority, a certificate distribution system and PKI-enabled applications.

Uses and Need of PKI

Modern e-commerce businesses, which have minimal face-to-face interaction with customers, demand more security and integrity. Online stores that handle a huge number of transactions need to assure customers that their information is confidential and processed through a secure channel. This is where PKI steps in: it provides mechanisms to ensure that trusted relationships are established and maintained. The specific security functions for which a PKI can provide a foundation are confidentiality, integrity, non-repudiation and authentication.

Issues & Solutions

There are many different implementations of PKI, and each focuses on its own issues and solutions. For example, the PKI used in the DoD has the following issues:

  • Lack of PKI-enabled eCommerce applications and lack of interoperability among PKI applications
  • DoD is developing a single high assurance PKI
  • Very High Cost Impact to the EC/EB community.
  • The PKI community lacks metrics for mapping trust models between the DoD "high assurance" C2 and EC/EB domains.
  • Education of everyone (policy maker through user) to a common level of understanding is a huge challenge.
  • While the purpose of using PKI in EC/EB is to provide additional trust to allow the Internet to serve as a vehicle for legally binding transactions, problems still exist with the methodologies associated with establishing a long-term burden of proof. Specifically, there are no widely adopted industry standards for maintenance of electronic signatures or for authenticated timestamps for record maintenance that have stood the test of time. These processes are untried and the case law has not yet been established to convince users that there are no issues with enforcement of these new processes. An additional barrier to EC/EB within this space is the current DoD Certificate policy in which DoD accepts


Common Issues With PKI Implementation

  • Commercial Off-The-Shelf (COTS) versus customised applications: The choice between COTS and customised products is usually one of cost versus usability. For usability, the main thing to focus on is error messages. If PKI is built into applications (transparent to users), that is fine; if not, users will need some understanding of keys, certificates, Certificate Revocation Lists (CRLs) and directories/certificate repositories so that they can make informed decisions.
  • Token logistics (smart card): The point where keys and certificates are linked to their owner is a very critical point in a PKI. If a fraudulent certificate is issued by a registration officer and the certificate holder uses the certificate to commit a crime or prank, trust in the whole PKI hierarchy may be lost. The physical security requirements are high, and the registration officer, whether a person or a smartcard bureau, must be subject to strict security policies and practices. This was a problem for the DoD, as mentioned in the section above.
  • Network issues - Traffic: There is no doubt that the implementation of PKI will add to the network load, although just how much depends on the system architecture. Potential additional traffic that should be considered includes certificate issuance, email usage, CRLs and directory replication.
  • Network issues - Encryption: Many organisations implement anti-virus software and content inspection on servers at the perimeter of their networks. Some have security policies that reject or quarantine encrypted traffic. To provide user-to-user confidentiality, messages will traverse networks with their payload hidden from inspection by virus and content checking.
  • Email address in certificate: In order to use certificates for S/MIME signed/encrypted email, the user's email address must be in the certificate. Most people change their email addresses more frequently than their certificates. Unless a solution is built which allows users to keep the same email address over a long period, certificates would have to be re-issued every time a user changes email address. S/MIME v3 stipulates that the receiving application must check the From: or Sender: field in the mail header and compare it to an email address in the sender's certificate. If they do not match, the mail application should perform another explicit check to ensure that the person who signed the message is indeed the person who sent it. As usual, the 'devil is in the detail' when it comes to implementation.
  • Certificate validity checking: CRLs have been the conventional method of providing certificate validity checking. CRLs do not scale very well, but are usually kept for backward compatibility, archiving/historical verification and for use in off-line mode. The other issue with CRLs is that they are generally issued at fixed intervals of 6, 12 or 24 hours, causing a time lag between the moment a certificate is revoked and the moment it appears on the published CRL. This may present a security risk, as a certificate may verify correctly after it has been reported as compromised and revoked (although some would argue that the time from actual compromise until its discovery and reporting is in most cases a more significant lag). The Online Certificate Status Protocol (OCSP) (RFC 2560) allows a client to query an OCSP responder for the current status of a certificate. This saves searching through a large CRL and can save bandwidth if the CRL would normally be downloaded, although it may increase network traffic. Most OCSP responders are based on CRLs and thus do not solve the time-lag problem outlined above.
  • Availability and storage of reliable user information: For an identity certificate scheme, names in certificates need to be unique, meaningful and correct. Few large user communities have all their member details in a central and accurate database or directory, and consolidating, checking and updating all user data can turn into a massive and expensive exercise.
  • Archiving/historic verification: Digital signatures need to be verifiable even after the keys used to sign have expired. Likewise, we need to be able to verify that the certificate was valid at the time the data was signed. This means we would need to archive: the signed file, the public key certificate of the signer, the CRL that was valid at the time of signing, a reliable timestamp to prove the accuracy of the time of signing, and the hardware environment that can run the software that was used at the time.
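
The CRL time lag described in the list above can be made concrete with a little arithmetic. The sketch below assumes a CRL is published at a fixed interval starting from some first issue time, and computes how long a revoked certificate would still appear valid to a client that relies only on the published CRL. The function names and the example dates are made up for illustration.

```python
from datetime import datetime, timedelta


def next_crl_publication(first_issue: datetime, interval: timedelta,
                         event: datetime) -> datetime:
    """Return the first CRL publication time at or after `event`."""
    if event <= first_issue:
        return first_issue
    elapsed = event - first_issue
    periods = -(-elapsed // interval)  # ceiling division on timedeltas
    return first_issue + periods * interval


def exposure_window(first_issue: datetime, interval: timedelta,
                    revoked_at: datetime) -> timedelta:
    """How long a revoked certificate still verifies against the stale CRL."""
    return next_crl_publication(first_issue, interval, revoked_at) - revoked_at


# Assumed schedule: a CRL every 12 hours, starting at midnight.
first = datetime(2011, 3, 1, 0, 0)
revoked = datetime(2011, 3, 1, 13, 30)
print(exposure_window(first, timedelta(hours=12), revoked))
```

With a 12-hour interval, a revocation reported at 13:30 is only visible at the next publication at midnight, so the window here is ten and a half hours; in the worst case it approaches the full interval.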

What are the solutions to these problems?

  • Identity-Based Encryption (IBE): enables senders to encrypt messages for recipients without requiring that a recipient's key first be established, certified and published.
  • Certificate-based encryption: incorporates IBE methods, but uses double encryption so that its CA cannot decrypt on behalf of the user.
  • Certificateless Public Key Cryptography: incorporates IBE methods, using partial private keys so that the Private Key Generator (PKG) cannot decrypt on behalf of the user.
  • Distributed computation: There exist methods that distribute cryptographic operations so that the cooperative contribution of a number of entities is required in order to perform an operation such as a signature or decryption. This allows tighter protection at servers than at clients, but implies that users must fully trust servers to apply keys appropriately.
  • Alternative validation strategies: A hash tree offers a compact, protected representation of the status of a large number of certificates. It is highly valued when a PKI is operated at large scale, where it is more beneficial than a Certificate Revocation List (CRL). CRLs reflect status information only at fixed intervals.
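
To show why a hash tree gives a compact representation of many certificate statuses, here is a minimal Merkle-tree sketch. A trusted party would sign only the root; a client can then check one certificate's status with a proof whose length grows with the logarithm of the number of certificates. The leaf format (`serial=...;status=...`) and the function names are assumptions for the example, and padding odd levels by duplicating the last node is a simplification that a production design would treat more carefully.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    level = [_h(b"leaf:" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last node with itself
            level = level + [level[-1]]
        level = [_h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):  # node was paired with itself
            sibling = index
        proof.append((level[sibling], index % 2 == 0))  # True: sibling on the right
        index //= 2
    return proof


def verify(root, leaf, proof):
    """Recompute the path from the leaf and compare it with the trusted root."""
    node = _h(b"leaf:" + leaf)
    for sibling, sibling_on_right in proof:
        pair = node + sibling if sibling_on_right else sibling + node
        node = _h(b"node:" + pair)
    return node == root


statuses = [b"serial=1001;status=good", b"serial=1002;status=revoked",
            b"serial=1003;status=good"]
levels = build_levels(statuses)
root = levels[-1][0]
print(verify(root, statuses[1], prove(levels, 1)))
```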

Dissemination

The Problem Domain

Random Ramblings on Reputation Management and Distribution

Publish/Subscribe?

This system has unique distribution requirements as compared to most distributed systems in general. In this system, we cannot assume that there will be a universally agreed-upon definition of good, or bad. Similarly, the system must be self-policing. It would be up to each and every group of autonomous systems to decide which updates to accept and reject. Updates themselves also should not cause the network to DDoS itself. Lastly, it would be impossible for every system to know what the reputation for a given system is. Therefore the system must disseminate information in some way that is query-able and localizes reputation information where required.

To this end, we need a way of spreading information that, while reliable, does not depend on one universally agreed-upon set of reputations.

For example, on an internet-scale operating system it would be entirely reasonable for one group of systems to not want to accept updates, or want to avoid communication with a given series of systems.

Any solution would assume that the problems of attribution are solved.

Current Examples of Reputation Dissemination

The first protocol that immediately comes to mind in this situation is a gossip-based protocol. These protocols are designed to operate in highly decentralized, large-scale systems.
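
A minimal simulation of push-style gossip, under the assumptions stated in this section: attribution is solved (each update carries a trustworthy source), every node applies its own acceptance policy, and a fanout limit keeps a round from flooding the network. The class, the fields and the blocked-source policy are all hypothetical choices for the sketch, not a description of any particular gossip protocol.

```python
import random


class Node:
    def __init__(self, name, blocked_sources=()):
        self.name = name
        self.peers = []
        self.blocked_sources = set(blocked_sources)  # local policy, not global
        self.store = {}  # subject -> (version, score, source)

    def accept(self, subject, version, score, source):
        """Apply local policy; return True if the update was new and stored."""
        if source in self.blocked_sources:
            return False
        current = self.store.get(subject)
        if current is not None and current[0] >= version:
            return False  # already have this version or a newer one
        self.store[subject] = (version, score, source)
        return True


def gossip_round(nodes, fanout, rng):
    """Each node pushes what it knows to at most `fanout` random peers.

    Returns the number of messages sent, which is bounded by
    len(nodes) * fanout, so a round cannot flood the network.
    """
    sent = 0
    for node in nodes:
        snapshot = list(node.store.items())
        if not snapshot or not node.peers:
            continue
        for peer in rng.sample(node.peers, min(fanout, len(node.peers))):
            sent += 1
            for subject, (version, score, source) in snapshot:
                peer.accept(subject, version, score, source)
    return sent


# Four fully connected nodes; "d" refuses anything that originates from "a".
a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d", blocked_sources={"a"})
nodes = [a, b, c, d]
for n in nodes:
    n.peers = [p for p in nodes if p is not n]
a.accept("node-x", 1, -1.0, "a")  # "a" records a bad experience with node-x
gossip_round(nodes, fanout=3, rng=random.Random(0))
print(sorted(n.name for n in nodes if "node-x" in n.store))
```

With a smaller fanout the update spreads over several rounds instead of one; the trade-off between rounds and messages per round is the usual tuning knob in such designs.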

Here's a nice overview:

Examples are as follows:

Another possibility is using "Reputation chains"

Maintaining History

Problem domain

  • Emerge vs. Impose reputation on the system?
    • Probably both, how do we account for both systems?
      • Do we maintain records based on a fixed set of imposed rules? Or do we build rules as the system emerges and reputations are formed?
  • Where do you store the data?
    • Distributed storage systems. Reputation in real life is stored in the interactions that an entity has with others. Reputation is not stored centrally. Reputation is most often a shared view of an entity by the masses, but sometimes an entity's reputation can be disjoint among the masses: many different entities having differing views of reputation for the same entity.
  • Where is the data queried from?
    • (should I mention this?)
  • What defines good/bad reputation?
    • (should I mention this?)
  • Who provides the good/bad reputation?
    • Impose/Emerge problem: reputation for an interaction can be calculated immediately or can be a function of time.
  • Who do we trust for this information?
    • Trusting the masses is generally a good way of ensuring trustworthiness. Imposed rules will not always fit every situation well - could potentially set bad reputation to a "good" entity.
  • Should reputation be mutable? Can we be pardoned, or can reputations be reversed?
    • Do we maintain an ever-growing set of history items for interactions between entities? Do we focus on the bad reputations?
  • What entities are able to contribute to reputations?
  • How do we access reputation about entities?
  • Who is authorized to access particular reputations? How much to reveal? (Information flow)
  • What assumptions will we make?
  • Privacy issues? What will we reveal? Will centralized systems have a know-all mentality?
    • Fine grained information will never be revealed (privacy concerns and user rights)
  • Which history should I maintain? What to take as important, what to disregard?
  • Immutable data structure. Who could add data? Who could remove data? Authority

Reputation systems

  • record, aggregate, distribute information about an entity's behaviour in distributed applications
  • reputation might be based on the entity's past ability to adhere to a license agreement (mutual contract between issuer and licensee)

History-based access control systems

  • make decision based on an entity's past security-sensitive actions

Examples of reputation systems (trust-informing technologies)

  • eBay - Feedback forum (positive, neutral, negative)

Do reputation systems have some validity?

Resnick et al. argue that reputation systems foster an incentive for principals to behave well because of “the expectation of reciprocity or retaliation in future interactions”.

Abstractions are used to model the aggregated information of each entity. These abstractions may not encompass the full details of transactions and provide context to specific issues relating to feedback. In turn we can end up with ambiguous values.

So we need a system that provides sufficient information in order to verify the precise properties of a past behaviour.

  • Krukow, K. A Logical Framework for Reputation Systems and History-based Access Control. School of Electronics and Computer Science University of Southampton, UK. (March 3, 2011) [1]

Abstract

Reputation systems are meta systems that record, aggregate and distribute information about principals’ behaviour in distributed applications. Similarly, history-based access control systems make decisions based on programs’ past security-sensitive actions. While the applications are distinct, the two types of systems are fundamentally making decisions based on information about the past behaviour of an entity. A logical policy-centric framework for such behaviour-based decisionmaking is presented. In the framework, principals specify policies which state precise requirements on the past behaviour of other principals that must be fulfilled in order for interaction to take place. The framework consists of a formal model of behaviour, based on event structures; a declarative logical language for specifying properties of past behaviour; and efficient dynamic algorithms for checking whether a particular behaviour satisfies a property from the language. It is shown how the framework can be extended in several ways, most notably to encompass parameterized events and quantification over parameters. In an extended application, it is illustrated how the framework can be applied for dynamic history-based access control for safe execution of unknown and untrusted programs.

  • Khosrow-Pour, M. Emerging trends and challenges in information technology management (March 7, 2011) [2]

Abstract

  • Bolton, G. et al. How Effective are Electronic Reputation Mechanisms? (March 10, 2011) [3]

Abstract

Electronic reputation or “feedback” mechanisms aim to mitigate the moral hazard problems associated with exchange among strangers by providing the type of information available in more traditional close-knit groups, where members are frequently involved in one another’s dealings. In this paper, we compare trading in a market with electronic feedback (as implemented by many Internet markets) to a market without, as well as to a market in which the same people interact with one another repeatedly (partners market). We find that, while the feedback mechanism induces quite a substantial improvement in transaction efficiency, it also exhibits a kind of public goods problem in that, unlike the partners market, the benefits of trust and trustworthy behavior go to the whole community and are not completely internalized. We discuss the implications of this perspective for improving these systems.

This portion of a reputation system answers the core question of how reputation is generated from the information exchanged between systems and how/where it is stored.

Problem domain:

  • Emerge vs. Impose reputation on the system?
    • Probably both, how do we account for both systems?
      • Do we maintain records based on a fixed set of imposed rules? Or do we build rules as the system emerges and reputations are formed?
  • Where do you store the data?
    • Distributed storage systems. Reputation in real life is stored in the interactions that an entity has with others. Reputation is not stored centrally. Reputation is most often a shared view of an entity by the masses, but sometimes an entity's reputation can be disjoint among the masses: many different entities having differing views of reputation for the same entity.
  • Who do we trust for this information?
    • Trusting the masses is generally a good way of ensuring trustworthiness. Imposed rules will not always fit every situation well - could potentially set bad reputation to a "good" entity.
  • Should reputation be mutable? Can we be pardoned, or can reputations be reversed?
    • Do we maintain an ever-growing set of history items for interactions between entities? Do we focus on the bad reputations?


Existing systems

  • Peer-based systems (emerge)
    • eBay - positive/negative rating system
    • Youtube - like/dislike/spam comment system
  • Policy-based systems (impose)
    • Java - policy based security
    • Android - policy based security

These two systems are on opposite ends of the Emerge-Impose spectrum.

EigenTrust system

The EigenTrust system utilizes a numerical scale for reputation storage.

Advantages:

  • Numerical values are easy to compare.
  • Little required storage space.

Disadvantages:

  • Information is lost in the abstraction process.
    • No concrete data
  • Ambiguity

Storing concrete data

Hence, with the given information from the reputation system, we cannot generate an accurate profile of the entity. We need a system that represents reputation in a concrete form:

“If principal p gains access to resource r at time t, then the past behavior of p up until time t satisfies requirement ψr.”

Advantages:

  • A sufficient amount of information is available to come up with a profile of an entity

Disadvantages:

  • Storage space

Shmatikov and Talcott

Histories are sets of time-stamped events. Reputation is based on the ability to adhere to licenses. Licenses might permit certain actions or require certain actions to be performed.

Advantages:

  • Store data in concrete form

Disadvantages:

  • No notion of sessions (logically connected set of events)

Representation of reputation

If we consider reputation information to encompass the events and actions of an entity, then we can model reputation as a set of events. An interesting new problem is how to re-evaluate policies efficiently when new information becomes available. This problem is known as “dynamic model-checking”.

Dynamic model-checking

We want a way of summarizing past reputations. A solution here is to use the approach of Havelund and Rosu, which is based on the technique of dynamic programming and is used for runtime verification.

  • Given some information on an entity, how do we convert/abstract this to reputation?
  • Is all the information necessary to maintain?
  • Now that we have the reputation information, what can we do with it?
    • What can we compare it with?

Implementation

Desired functionality of the system:

  • new() - Append new reputation information
  • update() - Update and summarize past behavior
    • This is a reduce function
  • check() - Analyze whether the given reputation satisfies the criteria of the policy
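
The new(), update() and check() operations named in this section could be sketched roughly as follows. This is only an illustration under simple assumptions: an event is a timestamped action name, the summary is a per-action count, and a policy is any predicate over the events up to a given time. The `Event` and `Reputation` types and the example policy are hypothetical, and check() here re-evaluates the policy over the whole history rather than using the incremental dynamic-programming technique mentioned above.

```python
from dataclasses import dataclass, field
from functools import reduce


@dataclass(frozen=True)
class Event:
    time: int    # logical timestamp
    action: str  # e.g. "paid", "defaulted"


@dataclass
class Reputation:
    history: list = field(default_factory=list)  # concrete events, append-only
    summary: dict = field(default_factory=dict)  # action -> count, derived data


def new(rep: Reputation, event: Event) -> None:
    """Append new reputation information."""
    rep.history.append(event)


def update(rep: Reputation) -> dict:
    """Summarize past behaviour as a reduce over the event history."""
    def step(counts, event):
        counts[event.action] = counts.get(event.action, 0) + 1
        return counts
    rep.summary = reduce(step, rep.history, {})
    return rep.summary


def check(rep: Reputation, policy, now: int) -> bool:
    """Does the behaviour up to time `now` satisfy the policy predicate?"""
    past = [e for e in rep.history if e.time <= now]
    return policy(past)


def never_defaulted(past) -> bool:
    """Example policy: no 'defaulted' event anywhere in the past."""
    return all(e.action != "defaulted" for e in past)


rep = Reputation()
for t, action in [(1, "paid"), (2, "defaulted"), (3, "paid")]:
    new(rep, Event(t, action))
print(update(rep))
print(check(rep, never_defaulted, now=1), check(rep, never_defaulted, now=3))
```

The summary alone ("two paid, one defaulted") shows the information loss of numerical abstraction, while the concrete history lets a policy ask precise questions such as when the default happened.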

Querying Reputation

Problems

  • Emerge vs. Impose reputation on the system?
    • Probably both, how do we account for both systems?
      • If you want to know someone's reputation, you either need to start asking around for it, imposing yourself. Or you need the data to be sent around, so you already have access to it; emergent.
  • Where do you store the data?
      • You need to know who has the data to ask them for it, or to go get it yourself.
  • Where is the data queried from?
      • First you need to know who is storing it. Then you need to know whether you are allowed to ask that node directly, or whether you must ask an intermediary keeper of data. Will you even need to query, that is, do you already have all you need to know on hand? You need not get the latest updates on a node if every other node that has ever talked to it got DDoSed. (Or do you?)
  • What defines good/bad reputation?
      • Should I make my own definition of bad reputation, and query whether someone engaged in activities I consider bad, or should there be a globally agreed-upon reputation?
  • Who provides the good/bad reputation?
      • Who should I ask for information from?
  • Who do we trust for this information?
      • Whoever you trust; presumably their opinion on a given node is more important than that of a node you trust less.
  • Should reputation be mutable? Can we be pardoned, or can reputations be reversed?
      • Typically, would you bother asking for 10-year-old reputation data on a node if it has been a model citizen for the last 9?
  • What entities are able to contribute to reputations?
      • Should I ask everyone I trust for an opinion on a given node, or just certain keepers of trust data?
  • How do we access reputation about entities?
      • You query someone in the know who you trust and are allowed to query.
      • You could, say, ask everyone you know and trust, and ask them to ask people they know and trust (and so on, if they're willing) until you find a node with the information you need.
      • In a more centralized system you need to ask some kind of keeper of information for the information you want, and that keeper may or may not provide you with the reputation info you want.
  • Who is authorized to access particular reputations? How much to reveal? (Information flow)
      • The ability to control this would depend on how centralized a system you have. In a truly distributed system where every node has an opinion on any other node they've talked to you'll be able to find somebody who can tell you about the CIA node, but in a more centralized system the keepers of information might be less...willing to give Joe 6 cores information on who Iran is DDOSing.
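
The "ask everyone you trust, who ask everyone they trust" idea from the list above amounts to a bounded breadth-first search over trust links. The sketch below is one way to write that down; the data layout (a `trusts` map of who trusts whom and an `opinions` map of what each node knows), the hop limit and the node names are assumptions for illustration, and it ignores the question of whether each node is willing or authorized to answer.

```python
from collections import deque


def query_reputation(start, subject, trusts, opinions, max_hops=3):
    """Search outward through trust links for an opinion about `subject`.

    trusts:   node -> list of nodes it trusts
    opinions: node -> {subject: score}
    Returns (score, holder, hops) for the nearest holder, or None.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        score = opinions.get(node, {}).get(subject)
        if score is not None:
            return score, node, hops
        if hops == max_hops:
            continue  # do not forward the question any further
        for friend in trusts.get(node, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


trusts = {"me": ["alice", "bob"], "alice": ["carol"], "bob": [], "carol": []}
opinions = {"carol": {"cia-node": 0.9}}
print(query_reputation("me", "cia-node", trusts, opinions))
print(query_reputation("me", "cia-node", trusts, opinions, max_hops=1))
```

Because the search is breadth-first, the answer comes from the closest node in the trust graph, which fits the idea that opinions from nodes you trust more directly should count for more.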

Maybe References

http://www.kirkarts.com/wiki/images/1/13/Resnick_eBay.pdf - Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay’s Reputation System (maybe not too relevant)

http://portal.acm.org/citation.cfm?id=544741.544809 - An Evidential Model of Distributed Reputation Management

http://portal.acm.org/citation.cfm?id=775152.775242&type=series -- The EigenTrust Algorithm for Reputation Management in P2P Networks

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.2297&rep=rep1&type=pdf -- A Robust Reputation System for Mobile Ad-hoc Networks

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.8729&rep=rep1&type=pdf -- EigenRep: Reputation Management in P2P Networks

http://www.chennaisunday.com/ieee%202010/Reputation%20Estimation%20and%20Query%20in%20Peer-to-Peer%20Networks.pdf -- Reputation Estimation and Query in Peer-to-Peer Networks

Here is another paper that might be interesting for you. -- Lester http://dcg.ethz.ch/publications/netecon06.pdf

Possible implementations

Implementation Requirements

Conclusion

References