DistOS 2014W Lecture 15

From Soma-notes

Design Exercise

Can we build any kind of distributed system without crypto? Assume we can't trust crypto...

What are the main features we need to consider for such a system?

  • Limited Sharing
  • Integrity
  • Availability

Perhaps probabilistically...

We want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need some identification or authentication process.

Availability: "distribute the crap out of it", doesn't need crypto. No corruption of data.

Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.
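The "send each other the file and compare" idea can be sketched in a few lines. This is only an illustration of the exercise, not something worked out in class: `same_bytes` is the direct comparison, and `majority_copy` is an added assumption that combines it with heavy replication by accepting whichever copy a strict majority of replicas agree on. Both names are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def same_bytes(a: bytes, b: bytes) -> bool:
    """Compare two full copies directly; no hash function is involved."""
    return a == b


def majority_copy(copies: List[bytes]) -> Optional[bytes]:
    """Return the copy held by a strict majority of replicas, else None.

    Assumption (not from the notes): fewer than half of the replicas
    are corrupted, so the majority value is the original data.
    """
    if not copies:
        return None
    value, count = Counter(copies).most_common(1)[0]
    return value if count > len(copies) / 2 else None
```

The cost is bandwidth: every comparison moves the whole file, which is exactly what a trusted hash would let us avoid.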

Big Takeaway

Everything you do with crypto is a refinement of what you can already do in weaker forms with weaker assumptions.


Note on Project Proposal

  • The due date has been extended until next week. As the professor said, some of the proposals are not yet up to the mark.

Farsite

This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations. Farsite is intended to provide both the benefits of a central file server (a shared namespace, location-transparent access, and reliable data storage) and the benefits of local desktop file systems (low cost, privacy from nosy sysadmins, and resistance to geographically localized faults). Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol. It achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. It requires no central administration to maintain.

The goal in designing Farsite is to harness the collective resources of loosely coupled, insecure, and unreliable machines to provide a logically centralized, secure, and reliable file-storage service. The Farsite system protects and preserves file data and directory metadata primarily through the techniques of cryptography and replication.

Farsite is not a high-speed parallel I/O system. Farsite manages trust using public-key-cryptographic certificates.

An important workload assumption they mention is that files are rarely both read by many users and frequently updated by at least one user; Farsite handles that kind of workload poorly, which is a disadvantage.

Two technology trends are fundamental in rendering Farsite's design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security.

Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests, each member processes these requests deterministically, updates its replica, and sends replies to the client. When a client wishes to read a file, it sends a message to the directory group, which replies with the contents of the requested file. If the client updates the file, it sends the update to the directory group.
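The replicated directory group can be sketched as a set of deterministic replicas whose replies the client votes on. This is a simplified illustration, not Farsite's actual protocol: it omits the agreement step that a real Byzantine-fault-tolerant protocol uses to order requests, and it assumes the usual bound of at most f faulty members in a group of 3f + 1, with the client accepting a reply once f + 1 members agree on it. `Replica` and `group_request` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, Optional


class Replica:
    """One directory-group member holding a full replica of the file information."""

    def __init__(self, faulty: bool = False) -> None:
        self.files: Dict[str, bytes] = {}
        self.faulty = faulty

    def handle(self, op: str, path: str, data: bytes = b"") -> bytes:
        # Every correct member applies the same request deterministically.
        if op == "write":
            self.files[path] = data
            reply = b"ok"
        else:  # "read"
            reply = self.files.get(path, b"")
        # A faulty member may answer arbitrarily; here it simply lies.
        return b"garbage" if self.faulty else reply


def group_request(group: List[Replica], f: int, op: str,
                  path: str, data: bytes = b"") -> Optional[bytes]:
    """Send one request to all members; accept a reply backed by f + 1 of them."""
    replies = [member.handle(op, path, data) for member in group]
    reply, votes = Counter(replies).most_common(1)[0]
    return reply if votes >= f + 1 else None


# Usage: four members tolerate one faulty member (3f + 1 with f = 1).
group = [Replica(), Replica(), Replica(), Replica(faulty=True)]
group_request(group, 1, "write", "/notes.txt", b"hello")
contents = group_request(group, 1, "read", "/notes.txt")
```

With f + 1 matching replies, at least one of them comes from a correct member, so the lying member cannot change what the client accepts.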

The advantages of Farsite's design are: (1) It adds local caching of file content on the client to improve read performance. (2) It delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur. (3) It performs encryption at the block level, which enables a client to write an individual block without having to rewrite the entire file. It also enables the client to read individual blocks without having to wait for the download of an entire file from a file host.
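The benefit of caching plus delayed updates can be seen in a small sketch. It is a toy model under stated assumptions, not Farsite's implementation: `LazyWriteCache` and its `push` callback are hypothetical, leases are not modeled, and deletes are only handled for files that were never pushed.

```python
from typing import Callable, Dict, Optional


class LazyWriteCache:
    """Client-side cache that buffers updates instead of pushing each write."""

    def __init__(self, push: Callable[[str, bytes], None]) -> None:
        self.push = push                      # hypothetical: sends one update to the directory group
        self.cached: Dict[str, bytes] = {}    # local copies used to serve reads
        self.dirty: Dict[str, bytes] = {}     # updates not yet pushed

    def write(self, path: str, data: bytes) -> None:
        self.cached[path] = data
        self.dirty[path] = data               # replaces any earlier pending update

    def read(self, path: str) -> Optional[bytes]:
        return self.cached.get(path)          # served locally, no network round trip

    def delete(self, path: str) -> None:
        # Simplification: a file that was already pushed would also need
        # its deletion propagated; this sketch ignores that case.
        self.cached.pop(path, None)
        self.dirty.pop(path, None)

    def flush(self) -> int:
        """Push all pending updates and return how many were sent."""
        pending, self.dirty = self.dirty, {}
        for path, data in pending.items():
            self.push(path, data)
        return len(pending)
```

Three writes followed by one delete can end up as a single pushed update, which is the saving the delayed-update design is after.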

Farsite achieves reliability (long-term data persistence) and availability (immediate accessibility of file data when requested) mainly through replication.
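A back-of-the-envelope calculation shows why replication helps availability on unreliable desktops. The sketch assumes each machine is up independently with the same probability, which is a simplification made here and not a claim from the notes; the function name is hypothetical.

```python
def availability(p_up: float, replicas: int) -> float:
    """Probability that at least one of `replicas` machines is up.

    Assumes machines fail independently, each being up with probability p_up.
    """
    return 1.0 - (1.0 - p_up) ** replicas
```

Under this model, a file on a single machine that is up 90% of the time is unavailable 10% of the time, while three such replicas are all down together only about 0.1% of the time.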

The Farsite design uses two main mechanisms to keep a node's computation, communication, and storage from growing with the system size: hint-based pathname translation and delayed directory-change notification.
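The client-side half of hint-based pathname translation can be sketched as a longest-prefix lookup in a local hint cache. This is a minimal illustration under assumptions: the cache layout (pathname prefix mapped to a directory-group name) and the function name are hypothetical, and the step where a contacted group corrects a stale hint is not shown.

```python
from typing import Dict, Optional, Tuple


def longest_prefix_hint(hints: Dict[str, str],
                        path: str) -> Optional[Tuple[str, str]]:
    """Return (prefix, group) for the longest cached prefix of `path`, or None.

    Prefixes are matched one whole path component at a time, so
    "/home/alice" does not match "/home/alicex".
    """
    parts = [p for p in path.split("/") if p]
    for n in range(len(parts), -1, -1):
        prefix = "/" + "/".join(parts[:n])
        if prefix in hints:
            return prefix, hints[prefix]
    return None


# Usage: the client contacts the group named by the best hint it has.
hints = {"/": "groupA", "/home/alice": "groupB"}
best = longest_prefix_hint(hints, "/home/alice/notes.txt")
```

Because the lookup is local and hints are only corrected when they turn out to be wrong, a client's work does not depend on how many machines or directory groups exist in the system.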