DistOS 2014W Lecture 23

From Soma-notes
Nelaturuk (talk | contribs)
 
Latest revision as of 22:15, 23 April 2014

Presentations

Distributed Shared Memory Systems - Mojgan

  • Introduction to DSM systems
  • Advantages and Disadvantages
  • Classification of DSM systems
  • Design considerations
  • Examples of DSM systems
- OpenSSI
- Mermaid
- MOSIX
- DDM

Survey: Fault Tolerance in Distributed File System - Mohammed

  • Abstract
  • Introductions
    • About fault tolerance in any distributed system. Comparison between different file systems.
    • What is more suitable for mobile-based systems?
    • Why is achieving a high level of fault tolerance one of the main issues for DFSs?
  • Replication and fault tolerance
    • What are the replica and placement policies? What is synchronization, and what is its benefit?
  - Synchronous Method
  - Asynchronous Method
  - Semi-Asynchronous Method
  • Cache consistency and fault tolerance
    • What is a cache? What is its benefit? What is cache consistency?
 - Write Once Read Many (WORM)
 - Transactional Locking - Read and write locks
 - Leasing
  • Example DFS mentioned in the paper
    • Google File System (GFS)
    • HDFS
    • MooseFS
    • iRODS
    • GlusterFS
    • Lustre
    • Ceph
    • PARADISE for mobile
  • Conclusion
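
The leasing item under cache consistency can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the surveyed paper: a client may serve a cached value only while its lease is valid, and must go back to the server once the lease expires. The names `LeaseCache`, `fetch` and `lease_seconds` are hypothetical, and the server side (which in a real leasing protocol would delay writes until outstanding leases expire or are revoked) is left out.

```python
import time


class LeaseCache:
    """Client-side cache whose entries are valid only while a lease lasts."""

    def __init__(self, fetch, lease_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable: key -> value from the server
        self._lease = lease_seconds  # assumed fixed lease length, in seconds
        self._clock = clock          # injectable clock, so the sketch can be tested
        self._entries = {}           # key -> (value, lease expiry time)

    def read(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]          # lease still valid: serve from the cache
        value = self._fetch(key)     # no lease or lease expired: ask the server
        self._entries[key] = (value, now + self._lease)
        return value
```

A shorter lease bounds how stale a cached value can get, while a longer lease saves more round trips to the server.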

Survey on Control Plane Frameworks for Software Defined Networking - Sijo

  • Introduction
    • Traditional Networks - Control Plane and Forwarding Plane
    • Software Defined Networking
- Proposes decoupling the control plane and the forwarding plane into independent layers
- Network entities or nodes are specialized elements that do the forwarding
- Control applications work on the logical view of the network provided by the controller, without having to worry about managing state distribution, topology discovery, etc.
  • Theme, Argument Outline
- Need to use distributed systems design principles and tools in SDN controller design to achieve scalability and reliability
  • Controller Platforms
- Centralized and Distributed approaches
- Identify the need to use in controller platforms
- Centralized: started with NOX, followed by Maestro, Beacon, Floodlight, POX and OpenDaylight
- Distributed: ONIX, HyperFlow, YANC, ONOS
- Leverage parallel processing capabilities
  • In detail about two systems:
    • ONIX
    • ONOS
  • References

Metadata management in Distributed File System - Sandarbh

  • What is metadata?

- Defined by bare-minimum functions for MDS (Metadata Server)
- Monitor the performance of DFS so that it can be used further
- Structure of metadata in Paper

  • Why is Metadata management difficult?

- 50% of file operations are metadata operations
- Size of metadata
- Distribute the load evenly across all MDS
- Be able to handle thousands of clients
- Be able to handle file/directory permission change
- Recover data if some MDS goes down
- Be POSIX compliant
- Be able to scale: addition of a new MDS shouldn't cause ripples
- Contrasting goals: replication and consistency; average-case improvements vs guaranteed performance for each access

  • Static sub-tree partitioning

- Advantages: clients know which MDS to contact for the file; prefix caching
- Disadvantage: directory hot spot formation

  • Static hashing based partitioning

- Hash the filename or file identifier and assign it to an MDS
- Advantages: distributes load evenly; gets rid of hot spots
- Disadvantage

  • "Don't ask me where your server is" approach

- Examples: Ceph, GlusterFS, OceanStore, Hierarchical Bloom filters, Cassandra
- Responsibilities: replica management, consistency, access control, recovering metadata in case of a crash, talking to each other to handle the load dynamically

  • What's not in the slides

- Not focused on replication of metadata
- Semantic based search

  • Structure of the survey

- Conventional metadata systems
- No-metadata approach
- Metadata approach of the file systems designed for specific goals: GFS, Haystack, etc.
- Evolution history
- Comparison within category
- Cover reliability and consistency part
- Summarize learnings with expected trends
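
The two static partitioning schemes in this outline (sub-tree and hashing based) can be contrasted with a small sketch. It is only an illustration under assumptions and not the implementation of any of the surveyed file systems: the cluster size `MDS_COUNT`, the directory table `SUBTREE_TABLE` and both function names are hypothetical.

```python
import hashlib

MDS_COUNT = 4  # hypothetical number of metadata servers


def mds_by_hash(path: str, mds_count: int = MDS_COUNT) -> int:
    """Static hashing: hash the file name and map it to an MDS index."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % mds_count


# Hypothetical static sub-tree table: directory prefix -> MDS index.
# The root entry "/" must be present; it acts as the fallback.
SUBTREE_TABLE = {"/home": 0, "/var": 1, "/usr": 2, "/": 3}


def mds_by_subtree(path: str, table: dict = SUBTREE_TABLE) -> int:
    """Static sub-tree partitioning: the longest matching directory prefix wins."""
    best = "/"
    for prefix in table:
        if prefix == "/":
            continue
        # Match whole path components only, so "/homework" is not under "/home".
        if path == prefix or path.startswith(prefix + "/"):
            if len(prefix) > len(best):
                best = prefix
    return table[best]
```

With the sub-tree table, every file under `/home` resolves to the same server, which is how a popular directory becomes a hot spot. With hashing, files in the same directory are generally spread across servers, but a client can no longer infer the server from the directory prefix alone.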

Distributed Stream Processing - Ronak Chaudhari

  • About Stream processing

- Data streams
- DBMS vs Stream processing

  • Applications

- Monitoring applications
- Military applications
- Financial analysis
- Tracking applications

  • Aurora

- Process incoming streams
- It has its own query algebra
- System Model - Query Model - Runtime Architecture
- QoS criteria
- SQuAL - Query algebra
- Aurora GUI
- Challenges in distributed operation

  • Aurora vs Medusa
  • Medusa

- Architecture
- Addition to Aurora - Lookup and Brain
- Failure detection
- Transfer of processing
- System API
- Load management
- High availability
- Benefits

  • References