DistOS 2014W Lecture 23


Latest revision as of 22:15, 23 April 2014

Presentations

Distributed Shared Memory Systems - Mojgan

  • Introduction to DSM systems
  • Advantages and Disadvantages
  • Classification of DSM systems
  • Design considerations
  • Examples of DSM systems
- OpenSSI
- Mermaid
- MOSIX
- DDM

Survey: Fault Tolerance in Distributed File System - Mohammed

  • Abstract
  • Introduction
    • About fault tolerance in any distributed system. Comparison between different file systems.
    • What's more suitable for mobile-based systems?
    • Why is achieving high fault tolerance one of the main issues for DFSs?
  • Replication and fault tolerance
    • What is the replica placement policy? What is synchronization? What is its benefit?
  - Synchronous Method
  - Asynchronous Method
  - Semi-Asynchronous Method
  • Cache consistency and fault tolerance
    • What is the cache? What is its benefit? Cache consistency?
  - Write Once Read Many (WORM)
  - Transactional locking - read and write locks
  - Leasing
  • Example DFS mentioned in the paper
    • Google File System
    • HDFS
    • MooseFS
    • iRODS
    • GlusterFS
    • Lustre
    • Ceph
    • PARADISE for mobile
  • Conclusion
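
As an illustration of the leasing item above, here is a minimal sketch of a server-side lease table. It is not taken from the surveyed paper: the class name `LeaseTable`, the method names, and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are all assumptions made for this example.

```python
class LeaseTable:
    """Hypothetical server-side record of who holds a lease on which file."""

    def __init__(self, duration: float):
        self.duration = duration  # lease length in seconds (assumed value)
        self._leases = {}         # file name -> (holder, expiry time)

    def acquire(self, name: str, client: str, now: float) -> bool:
        """Grant or renew a lease unless another client holds a live one."""
        lease = self._leases.get(name)
        if lease is not None and lease[0] != client and lease[1] > now:
            return False
        self._leases[name] = (client, now + self.duration)
        return True

    def is_valid(self, name: str, client: str, now: float) -> bool:
        """A cached copy may be used only while the client's lease is live."""
        lease = self._leases.get(name)
        return lease is not None and lease[0] == client and lease[1] > now
```

The fault-tolerance angle is that a crashed client never has to be contacted: once its lease expires, the server can safely grant the file to another client.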

Survey on Control Plane Frameworks for Software Defined Networking - Sijo

  • Introduction
    • Traditional Networks - Control Plane and Forwarding Plane
    • Software Defined Networking
- Proposes decoupling the control and forwarding planes into independent layers
- Network entities or nodes are specialized elements that do the forwarding
- Control applications work on the logical view of the network provided by the controller, without having to worry about
  managing state distribution, topology discovery, etc.
  • Theme, Argument Outline
- Need for using distributed systems design principles and tools in SDN controller design to achieve scalability and reliability
  • Controller Platforms
- Centralized and Distributed approaches
- Identify the need to use distributed systems techniques in controller platforms
- Centralized: started with NOX, then Maestro, Beacon, Floodlight, POX, OpenDaylight
- Distributed: ONIX, HyperFlow, YANC, ONOS
- Leverage parallel processing capabilities
  • In detail about two systems:
    • ONIX
    • ONOS
  • References

Metadata management in Distributed File System - Sandarbh

  • What is metadata?

- Defined by bare-minimum functions for MDS (Metadata Server)
- Monitor the performance of DFS so that it can be used further
- Structure of metadata in the paper

  • Why is Metadata management difficult?

- 50% of file operations are metadata operations
- Size of metadata
- Distribute the load evenly across all MDS
- Be able to handle thousands of clients
- Be able to handle file/directory permission changes
- Recover data if some MDS goes down
- Be POSIX compliant
- Be able to scale: adding a new MDS shouldn't cause ripples
- Contrasting goals: replication and consistency; average-case improvements vs guaranteed performance for each access

  • Static sub-tree partitioning

- Advantage: clients know which MDS to contact for the file; prefix caching
- Disadvantage: directory hot spot formation

  • Static hashing based partitioning

- Hash the filename or file identifier and assign it to an MDS
- Advantage: distributes load evenly; gets rid of hotspots
- Disadvantage

  • "Don't ask me where your server is" approach

- Examples: Ceph, GlusterFS, OceanStore, Hierarchical Bloom filters, Cassandra
- Responsibilities: replica management, consistency, access control, recovering metadata after a crash, talking to each other to handle the load dynamically

  • What's not in the slides

- Not focused on replication of metadata
- Semantic based search

  • Structure of the survey

- Conventional metadata systems
- No-metadata approach
- Metadata approach of the file systems designed for specific goals - GFS, Haystack, etc.
- Evolution history
- Comparison within category
- Cover reliability and consistency part
- Summarize learnings with expected trends
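
The two static partitioning schemes listed above can be contrasted with a small sketch. Everything in it is an assumption made for illustration and does not come from the presentation: the server names in `MDS_NODES`, the contents of `SUBTREE_TABLE`, and the choice of SHA-256 as the hash function.

```python
import hashlib

# Hypothetical cluster of metadata servers (names are illustrative only).
MDS_NODES = ["mds0", "mds1", "mds2"]

# Static sub-tree partitioning: each directory prefix is pinned to one MDS
# ahead of time (assumed assignments).
SUBTREE_TABLE = {
    "/home": "mds0",
    "/var": "mds1",
    "/": "mds2",  # fallback for everything else
}

def subtree_mds(path: str) -> str:
    """Return the MDS that owns the longest matching directory prefix."""
    best = "/"
    for prefix in SUBTREE_TABLE:
        if prefix != "/" and (path == prefix or path.startswith(prefix + "/")):
            if len(prefix) > len(best):
                best = prefix
    return SUBTREE_TABLE[best]

def hashed_mds(path: str) -> str:
    """Return the MDS chosen by hashing the full path name."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return MDS_NODES[int.from_bytes(digest[:8], "big") % len(MDS_NODES)]

# Every file in one busy directory lands on the same MDS under sub-tree
# partitioning (a hot spot), while hashing spreads the same files out.
busy = [f"/home/alice/file{i}" for i in range(100)]
subtree_targets = {subtree_mds(p) for p in busy}
hashed_targets = {hashed_mds(p) for p in busy}
```

In this sketch a client can resolve the owning MDS from the path prefix alone under sub-tree partitioning, which is what makes prefix caching possible. Hashing the full path gives up that directory locality in exchange for an even spread.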

Distributed Stream Processing - Ronak Chaudhari

  • About Stream processing

- Data streams
- DBMS vs stream processing

  • Applications

- Monitoring applications
- Military applications
- Financial analysis
- Tracking applications

  • Aurora

- Process incoming streams
- It has its own query algebra
- System model
- Query model
- Runtime architecture
- QoS criteria
- SQuAL (query algebra)
- Aurora GUI
- Challenges in distributed operation

  • Aurora vs Medusa
  • Medusa

- Architecture
- Addition to Aurora
- Lookup and Brain
- Failure detection
- Transfer of processing
- System API
- Load management
- High availability
- Benefits

  • References