DistOS 2014W Lecture 23

Latest revision as of 18:15, 23 April 2014

Presentations

Distributed Shared Memory Systems - Mojgan

  • Introduction to DSM systems
  • Advantages and Disadvantages
  • Classification of DSM systems
  • Design considerations
  • Examples of DSM systems
- OpenSSI
- Mermaid
- MOSIX
- DDM

Survey: Fault Tolerance in Distributed File System - Mohammed

  • Abstract
  • Introductions
    • About fault tolerance in any distributed system. Comparison between different file systems.
    • What is more suitable for mobile-based systems?
    • Why is achieving high fault tolerance one of the main issues for DFSs?
  • Replication and fault tolerance
    • What is the replica placement policy? What is synchronization? What are its benefits?
  - Synchronous Method
  - Asynchronous Method
  - Semi-Asynchronous Method
  • Cache consistency and fault tolerance
    • What is the cache? What is its benefit? Cache consistency?
 - Write Once Read Many (WORM)
 - Transactional Locking - Read and write locks
 - Leasing
  • Example DFS mentioned in the paper
    • Google File System (GFS)
    • HDFS
    • MooseFS
    • iRODS
    • GlusterFS
    • Lustre
    • Ceph
    • PARADISE for mobile
  • Conclusion
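
The leasing item under cache consistency can be illustrated with a small sketch. This is not taken from the surveyed paper or from any of the listed file systems. It assumes a simplified model in which the server grants one exclusive, time-limited lease per file, and a client may trust its cached copy only while its lease is unexpired. The names `LeaseTable`, `acquire` and `is_valid` are hypothetical.

```python
import time


class LeaseTable:
    """Server-side record of which client holds a lease on which file (hypothetical)."""

    def __init__(self, duration, clock=time.monotonic):
        self.duration = duration  # lease length in seconds (assumed value)
        self.clock = clock        # injectable clock so the sketch can be tested
        self.leases = {}          # path -> (holder, expiry time)

    def acquire(self, path, client):
        """Grant or renew a lease if the file is free, expired, or already held by client."""
        now = self.clock()
        holder, expiry = self.leases.get(path, (None, 0.0))
        if holder in (None, client) or expiry <= now:
            self.leases[path] = (client, now + self.duration)
            return True
        return False

    def is_valid(self, path, client):
        """A client may use its cached copy only while its own lease is unexpired."""
        holder, expiry = self.leases.get(path, (None, 0.0))
        return holder == client and self.clock() < expiry
```

The fault-tolerance benefit in this model is that a crashed or unreachable client cannot block others indefinitely: once its lease expires, `acquire` succeeds for the next client without any message from the failed one.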

Survey on Control Plane Frameworks for Software Defined Networking - Sijo

  • Introduction
    • Traditional Networks - Control Plane and Forwarding Plane
    • Software Defined Networking
- Proposes decoupling the control plane and the forwarding plane into independent layers
- Network entities or nodes are specialized elements that do the forwarding
- Control applications work on the logical view of the network provided by the controller, without having to worry about managing state distribution, topology discovery, etc.
  • Theme, Argument Outline
- The need to use distributed systems design principles and tools in SDN controller design to achieve scalability and reliability
  • Controller Platforms
- Centralized and Distributed approaches
- Identify the need to use in controller platforms
- Centralized: started with NOX, followed by Maestro, Beacon, Floodlight, POX, OpenDaylight
- Distributed: ONIX, HyperFlow, YANC, ONOS
- Leverage parallel processing capabilities
  • In detail about two systems:
    • ONIX
    • ONOS
  • References

Metadata management in Distributed File System - Sandarbh

  • What is metadata?

- Defined by bare-minimum functions for MDS (Metadata Server)
- Monitor the performance of DFS so that it can be used further
- Structure of metadata in Paper

  • Why is Metadata management difficult?

- 50% of file operations are metadata operations
- Size of metadata
- Distribute the load evenly across all MDS
- Be able to handle thousands of clients
- Be able to handle file/directory permission change
- Recover data if some MDS goes down
- Be POSIX compliant
- Be able to scale: addition of new MDS shouldn't cause ripples
- Contrasting goals: replication and consistency; average case improvements vs guaranteed performance for each access

  • Static sub-tree partitioning

- Advantage: Clients know which MDS to contact for the file; prefix caching
- Disadvantage: Directory hot spot formation

  • Static hashing based partitioning

- Hash the filename or file identifier and assign it to an MDS
- Advantage: Distributes load evenly; gets rid of hotspots
- Disadvantage

  • "Don't ask me where your server is" approach

- Ex: Ceph, GlusterFS, OceanStore, Hierarchical Bloom filters, Cassandra
- Responsibilities: Replica management, consistency, access control, recovering metadata in case of a crash, talking to each other to handle the load dynamically

  • What's not in the slides

- Not focused on replication of metadata
- Semantic based search

  • Structure of the survey

- Conventional metadata systems
- No-metadata approach
- Metadata approach of file systems designed for specific goals: GFS, Haystack, etc.
- Evolution history
- Comparison within category
- Cover reliability and consistency part
- Summarize learnings with expected trends
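
The two static partitioning schemes in this outline can be contrasted with a short sketch. It is illustrative only and not drawn from any surveyed system. The prefix table, the server count `MDS_COUNT`, and the functions `mds_by_subtree` and `mds_by_hash` are assumptions made for the example, and SHA-256 simply stands in for whatever hash a real system would use.

```python
import hashlib
from collections import Counter

MDS_COUNT = 4  # assumed number of metadata servers

# Static subtree partitioning: a fixed table maps directory prefixes to servers.
# The table contents are hypothetical.
SUBTREE_TABLE = {"/": 0, "/home": 1, "/var": 2, "/home/alice": 3}


def mds_by_subtree(path):
    """Return the MDS that owns the longest matching directory prefix."""
    best = "/"
    for prefix in SUBTREE_TABLE:
        if prefix != "/" and (path == prefix or path.startswith(prefix + "/")):
            if len(prefix) > len(best):
                best = prefix
    return SUBTREE_TABLE[best]


def mds_by_hash(path):
    """Return the MDS chosen by hashing the full path name."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % MDS_COUNT


# A burst of lookups that all fall in one directory.
hot_paths = [f"/home/alice/run/{i}.dat" for i in range(1000)]
subtree_load = Counter(mds_by_subtree(p) for p in hot_paths)
hash_load = Counter(mds_by_hash(p) for p in hot_paths)
```

In this sketch `subtree_load` puts every lookup on the single server that owns `/home/alice`, which is the directory hot spot listed as the disadvantage, while `hash_load` spreads the same lookups across the servers. The price is that a hashed placement no longer follows the directory tree, so a client cannot reuse a cached prefix to find neighbouring files.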

Distributed Stream Processing - Ronak Chaudhari

  • About Stream processing

- Data streams
- DBMS vs Stream processing

  • Applications

- Monitoring applications
- Military applications
- Financial analysis
- Tracking applications

  • Aurora

- Process incoming streams
- It has its own query algebra
- System Model - Query Model - Runtime Architecture
- QoS criteria
- SQuAL - Query algebra
- Aurora GUI
- Challenges in distributed operation

  • Aurora vs Medusa
  • Medusa

- Architecture
- Additions to Aurora: Lookup and Brain
- Failure detection
- Transfer of processing
- System API
- Load management
- High availability
- Benefits

  • References
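
As a small illustration of the "DBMS vs Stream processing" item in this outline: a DBMS typically stores data first and answers queries on demand, whereas a stream processor keeps a standing query running and updates its result as each tuple arrives. The sketch below is plain Python, not Aurora's query algebra, and the function name `sliding_average` and the window size are assumptions chosen for the example.

```python
from collections import deque


def sliding_average(stream, window):
    """Yield the mean of the last `window` readings as each reading arrives."""
    recent = deque(maxlen=window)
    total = 0.0
    for value in stream:
        if len(recent) == window:
            total -= recent[0]  # the oldest reading is about to be evicted
        recent.append(value)
        total += value
        yield total / len(recent)
```

Because the function is a generator, it emits one result per input reading and keeps only the last `window` values in memory, so it can run over an unbounded stream such as a sensor feed.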