DistOS 2014W Lecture 23
Latest revision as of 22:15, 23 April 2014
Presentations
- Introduction to DSM systems
- Advantages and Disadvantages
- Classification of DSM systems
- Design considerations
- Examples of DSM systems
- OpenSSI - Mermaid - MOSIX - DDM
Survey: Fault Tolerance in Distributed File System - Mohammed
- Abstract
- Introductions
- About fault tolerance in any distributed system. Comparison between different file systems.
- What is more suitable for mobile-based systems?
- Why is achieving high fault tolerance one of the main issues for DFSs?
- Replication and fault tolerance
- What are the replica and placement policies? What is synchronization? What is its benefit?
- Synchronous Method - Asynchronous Method - Semi-Asynchronous Method
- Cache consistency and fault tolerance
- What is the cache? What is its benefit? Cache consistency?
- Write Once Read Many (WORM) - Transactional locking - Read and write locks - Leasing
- Example DFS mentioned in the paper
- Google File System
- HDFS
- MooseFS
- iRODS
- GlusterFS
- Lustre
- Ceph
- PARADISE for mobile
- Conclusion
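
The synchronous and asynchronous replication methods listed above can be contrasted with a minimal sketch. This is not taken from the surveyed paper or from any of the listed file systems; `Replica`, `Primary`, `write_sync`, `write_async` and `flush` are hypothetical names, and replicas are plain in-memory dictionaries so the example stays self-contained.

```python
import queue


class Replica:
    """Hypothetical in-memory replica holding file contents by name."""

    def __init__(self):
        self.store = {}

    def apply(self, name, data):
        self.store[name] = data


class Primary:
    """Hypothetical primary copy that propagates writes to its replicas."""

    def __init__(self, replicas):
        self.local = Replica()
        self.replicas = replicas
        self.pending = queue.Queue()  # updates not yet pushed (async mode)

    def write_sync(self, name, data):
        # Synchronous: every replica applies the update before we acknowledge.
        self.local.apply(name, data)
        for replica in self.replicas:
            replica.apply(name, data)
        return "ack"

    def write_async(self, name, data):
        # Asynchronous: acknowledge after the local write; replicas catch up later.
        self.local.apply(name, data)
        self.pending.put((name, data))
        return "ack"

    def flush(self):
        # Stand-in for background propagation, called explicitly here
        # so the sketch stays deterministic.
        while not self.pending.empty():
            name, data = self.pending.get()
            for replica in self.replicas:
                replica.apply(name, data)


replicas = [Replica(), Replica()]
primary = Primary(replicas)
primary.write_sync("a.txt", b"v1")
primary.write_async("b.txt", b"v1")
stale = "b.txt" not in replicas[0].store  # replica has not seen b.txt yet
primary.flush()
```

The trade-off this is meant to show: the synchronous path never leaves a replica behind but pays for every replica on each write, while the asynchronous path acknowledges early and leaves a window in which replicas are stale.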
Survey on Control Plane Frameworks for Software Defined Networking - Sijo
- Introduction
- Traditional Networks - Control Plane and Forwarding Plane
- Software Defined Networking
  - Proposes decoupling the control and forwarding planes into independent layers
  - Network entities or nodes are specialized elements which do the forwarding
  - Control applications work on the logical view of the network provided by the controller, without having to worry about managing state distribution, topology discovery, etc.
- Theme, Argument Outline
- Need for using distributed systems design principles and tools in SDN controller design to achieve scalability and reliability
- Controller Platforms
  - Centralized and distributed approaches
  - Identify the need for these approaches in controller platforms
  - Centralized: started with NOX; also Maestro, Beacon, Floodlight, POX, OpenDayLight
  - Distributed: ONIX, Hyperflow, YANC, ONOS
  - Leverage parallel processing capabilities
- In detail about two systems:
- ONIX
- ONOS
- References
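
The split between control plane and forwarding plane outlined above can be illustrated with a small sketch. It is a generic illustration, not the API of NOX, ONIX, ONOS or any other controller named here; `Switch`, `Controller`, `packet_in` and the precomputed `topology` dictionary are assumptions made for the example.

```python
class Switch:
    """Forwarding plane: matches packets against a flow table, nothing more."""

    def __init__(self, name, controller):
        self.name = name
        self.controller = controller
        self.flow_table = {}  # destination -> output port

    def forward(self, dst):
        if dst not in self.flow_table:
            # Table miss: hand the decision to the control plane,
            # which may install a rule for this destination.
            self.controller.packet_in(self, dst)
        return self.flow_table.get(dst)  # None means the packet is dropped


class Controller:
    """Control plane: holds the network-wide view and programs switches."""

    def __init__(self, topology):
        # topology: {switch name: {destination: output port}}
        # (hypothetical and precomputed; a real controller would discover it)
        self.topology = topology

    def packet_in(self, switch, dst):
        port = self.topology.get(switch.name, {}).get(dst)
        if port is not None:
            switch.flow_table[dst] = port


controller = Controller({"s1": {"h2": 2}})
s1 = Switch("s1", controller)
first = s1.forward("h2")    # miss, rule installed, then forwarded
second = s1.forward("h2")   # hit in the flow table
unknown = s1.forward("h9")  # controller knows no route
```

The switch never computes a route itself; all state about where destinations live sits in the controller, which is the part the survey argues needs distributed systems techniques once there is more than one controller instance.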
Metadata management in Distributed File System - Sandarbh
- What is metadata?
  - Defined by bare-minimum functions for MDS (Metadata Server)
  - Monitor the performance of DFS so that it can be used further
  - Structure of metadata in the paper
- Why is Metadata management difficult?
  - 50% of file operations are metadata operations
  - Size of metadata
  - Distribute the load evenly across all MDS
  - Be able to handle thousands of clients
  - Be able to handle file/directory permission changes
  - Recover data if some MDS goes down
  - Be POSIX compliant
  - Be able to scale: adding a new MDS shouldn't cause ripples
  - Contrasting goals: replication and consistency; average-case improvements vs guaranteed performance for each access
- Static sub-tree partitioning
  - Advantage: clients know which MDS to contact for the file; prefix caching
  - Disadvantage: directory hot spot formation
- Static hashing based partitioning
  - Hash the filename or file identifier and assign it to an MDS
  - Advantage: distributes load evenly; gets rid of hot spots
  - Disadvantage
- "Don't ask me where your server is" approach
  - Ex: Ceph, GlusterFS, OceanStore, Hierarchical Bloom filters, Cassandra
  - Responsibilities: replica management, consistency, access control, recovering metadata in case of a crash, talking to each other to handle the load dynamically
- What's not in the slides
  - Not focused on replication of metadata
  - Semantic-based search
- Structure of the survey
  - Conventional metadata systems
  - No-metadata approach
  - Metadata approach of the file systems designed for specific goals (GFS, Haystack, etc.)
  - Evolution history
  - Comparison within category
  - Cover reliability and consistency part
  - Summarize learnings with expected trends
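
The two static partitioning schemes in this outline can be sketched side by side. This is a minimal illustration under stated assumptions, not the method of any system named above: `subtree_mds`, `hash_mds` and the `prefix_table` layout are hypothetical, MDS nodes are plain integer ids, and the hash scheme uses simple modulo placement.

```python
import hashlib


def subtree_mds(path, prefix_table):
    """Static sub-tree partitioning: the longest matching directory prefix wins.

    Assumes prefix_table contains the root "/" so every path matches something.
    """
    matches = (
        p for p in prefix_table
        if path == p or path.startswith(p.rstrip("/") + "/")
    )
    return prefix_table[max(matches, key=len)]


def hash_mds(name, num_mds):
    """Static hash partitioning: hash the file name, take it modulo the MDS count."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_mds


# Sub-tree partitioning: a client can resolve the MDS from the path alone.
prefix_table = {"/": 0, "/home": 1, "/home/alice": 2}
a = subtree_mds("/home/alice/notes.txt", prefix_table)
b = subtree_mds("/home/bob/todo.txt", prefix_table)
c = subtree_mds("/etc/passwd", prefix_table)

# Hash partitioning: measure how many names change MDS when a fifth
# server is added to a four-server cluster under modulo placement.
names = [f"/data/file{i}" for i in range(1000)]
moved = sum(hash_mds(n, 4) != hash_mds(n, 5) for n in names) / len(names)
```

With modulo placement, `moved` comes out well above one half, which is the kind of ripple the scaling goal earlier in this outline asks a design to avoid. That behaviour is a property of this particular modulo scheme and is offered only as an illustration; the notes themselves leave the disadvantage of hash partitioning unstated.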
Distributed Stream Processing - Ronak Chaudhari
- About Stream processing
- Data streams - DBMS vs Stream processing
- Applications
- Monitoring applications - Military applications - Financial analysis - Tracking applications
- Aurora
  - Processes incoming streams
  - Has its own query algebra
  - System model
  - Query model
  - Runtime architecture
  - QoS criteria
  - SQuAl query algebra
  - Aurora GUI
  - Challenges in distributed operation
- Aurora vs Medusa
- Medusa
  - Architecture
  - Addition to Aurora
  - Lookup and Brain
  - Failure detection
  - Transfer of processing
  - System API
  - Load management
  - High availability
  - Benefits
- References
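
The contrast between a DBMS and stream processing noted at the top of this outline can be made concrete with a small sketch. It is a generic illustration using Python generators, not Aurora's SQuAl operators or Medusa's API; `sliding_average`, `alerts`, the window size and the threshold are all assumptions made for the example.

```python
from collections import deque


def sliding_average(stream, window):
    """Continuous query: emit the mean of the last `window` readings per arrival."""
    recent = deque(maxlen=window)  # old readings fall out automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


def alerts(stream, threshold):
    """Downstream filter operator: pass only averages above the threshold."""
    for avg in stream:
        if avg > threshold:
            yield avg


# Operators are chained; each reading flows through the pipeline as it arrives,
# rather than being stored first and queried later as in a DBMS.
readings = iter([10, 20, 30, 100, 110])
out = list(alerts(sliding_average(readings, window=3), threshold=40))
# out == [50.0, 80.0]
```

The query stays fixed while the data moves past it, which is the reverse of the DBMS model where data is stored and queries arrive.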