OceanStore & GPFS
==Readings==
[http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-25/oceanstore-sigplan.pdf John Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage" (2000)]

[http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-25/fast2003-pond.pdf Sean Rhea et al., "Pond: the OceanStore Prototype" (2003)]

[http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-25/gpfs-fast02.pdf Frank Schmuck and Roger Haskin, "GPFS: A Shared-Disk File System for Large Computing Clusters" (2002)]

[http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-25/walker-xufs-worlds06.pdf Edward Walker, "A Distributed File System for a Wide-Area High Performance Computing Infrastructure" (2006)]
==Questions==
Is it worth it?
=OceanStore=
Pros
*The only machine you need to trust is your own
*Data is highly durable due to file versioning
*Information is divorced from location
**So long as you can reliably obtain the information, it doesn't matter where it is stored
*Applicable to many data storage situations, not built for one specific case
*Routing is decentralized
*If 2/3 of the network is up, all data is available
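The "information divorced from location" point can be illustrated with a toy content-addressed store. This is only a sketch and not OceanStore's actual naming or routing: the <code>Node</code> class, the <code>guid_for</code> and <code>fetch</code> helpers, and the choice of SHA-256 are all assumptions made for illustration. The idea it shows is that when a block is named by a hash of its contents, a client can accept a copy from any untrusted node and verify it locally.

```python
import hashlib


def guid_for(data: bytes) -> str:
    """Name a block by the hash of its contents (hypothetical naming scheme)."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """A hypothetical untrusted storage node that may hold a copy of any block."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}

    def put(self, data: bytes) -> str:
        guid = guid_for(data)
        self.blocks[guid] = data
        return guid

    def get(self, guid: str):
        return self.blocks.get(guid)


def fetch(guid: str, nodes):
    """Return the first copy whose hash matches the requested name.

    The caller never needs to know or trust which node answered:
    a copy that does not hash to the requested name is simply skipped.
    """
    for node in nodes:
        data = node.get(guid)
        if data is not None and guid_for(data) == guid:
            return data
    return None
```

Because verification happens on the client, a node that returns a corrupted or forged block cannot fool the reader; the lookup just moves on to the next copy.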
Cons
*Cryptography is very expensive to compute (slow generation of keys)
*Utility models don't make economic sense; people prefer not to pay for access to their own data
=GPFS=
Shared-disk parallel file system designed for clusters

Maximum file system size of 4096 TB (4 PB)

Pros
*Massively parallel: data is striped across many disks
*Therefore reads and writes are very fast
*Option of redundancy
*Locking mechanism
**Two options
***1. Distributed locking (byte-range tokens)
****Distributed
****First client to request access to a file receives the token
****Other clients must request access from the current owner of the token
*****The current owner grants partial access to the file (splits the token and hands over the requested portion)
***2. Centralized management (data shipping)
****Faster in a small-disk circumstance
*Extreme reliability
**You can literally pull a hot-swap disk and insert a blank one in its place, and the missing data is completely regenerated onto the blank disk
**Journalling to record token ownership, which helps recovery when the node holding a token dies
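The striping point in the list above can be sketched as a round-robin mapping of blocks to disks. This is a minimal illustration and not GPFS code: the function names <code>stripe</code> and <code>unstripe</code>, and the in-memory lists standing in for disks, are assumptions.

```python
def stripe(data: bytes, num_disks: int, block_size: int):
    """Split data into fixed-size blocks and place block k on disk k % num_disks."""
    if num_disks <= 0 or block_size <= 0:
        raise ValueError("num_disks and block_size must be positive")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks):
    """Reassemble the original data by reading one block from each disk in turn."""
    blocks = []
    longest = max((len(d) for d in disks), default=0)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)
```

Since consecutive blocks land on different disks, a large sequential read or write can keep every disk busy at once, which is where the speed comes from.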
Cons
*Every node must be trusted! Designed for clusters, not for use across a LAN/WAN
*Not appropriate for distributed networks
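The token handoff described under the locking mechanism above (first client gets the token, later clients get a portion split off from the current owner) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the GPFS protocol: the <code>FileTokens</code> class and its methods are hypothetical, a single object stands in for the negotiation between nodes, and byte ranges use an exclusive end.

```python
INF = float("inf")


class FileTokens:
    """Hypothetical tracker of who holds the token for each byte range of one file."""

    def __init__(self):
        self.ranges = []  # list of (start, end, holder); end is exclusive

    def acquire(self, client: str, start: int, end):
        if not self.ranges:
            # First client to ask receives the token for the whole file.
            self.ranges.append((0, INF, client))
            return
        updated = []
        for s, e, holder in self.ranges:
            if holder == client or e <= start or s >= end:
                # No conflict with this range: it is left untouched.
                updated.append((s, e, holder))
                continue
            # Conflict: the current owner keeps only the parts outside
            # the requested range and gives up the overlapping portion.
            if s < start:
                updated.append((s, start, holder))
            if e > end:
                updated.append((end, e, holder))
        updated.append((start, end, client))
        self.ranges = sorted(updated)

    def holder_of(self, offset: int):
        """Return the client holding the token that covers this byte offset."""
        for s, e, holder in self.ranges:
            if s <= offset < e:
                return holder
        return None
```

For example, if client A touches the file first and client B then asks for bytes 1000 to 2000, A ends up holding everything except that range and B holds the range it asked for, so the two can proceed without further coordination.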
=XUFS=
*User-space implementation
*Designed to be simple
*Very generic
Latest revision as of 20:52, 25 February 2008