DistOS 2019F 2019-12-02
Readings
Borg, Omega, Kubernetes
- Schwarzkopf et al., "Omega: flexible, scalable schedulers for large compute clusters" (EuroSys 2013)
- Verma et al., "Large-scale cluster management at Google with Borg" (EuroSys 2015)
- Burns et al., "Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade" (ACM Queue, Jan/Feb 2016)
Discussion Questions
- How does the workload at Google differ from that of systems running HTCondor or YARN?
- What were the requirements for Omega?
- What do Figures 12 and 13 in the Omega paper show about Omega's performance?
- What do Figures 15 and 16 in the Omega paper show about MapReduce performance?
- Borg depends on Paxos more than the previous systems we have covered. What roles does Paxos play in Borg?
- Do you see any similarities between Borg and HTCondor?
- What are Borglets?
- What does compaction mean in the context of evaluating Borg?
- How are Borg jobs isolated inside Google? How are they isolated for Google Compute Engine (GCE)?
- How are cgroups important to Borg and Kubernetes? What problem do they address?
- Why is static linking important for containers?
- How are these systems "application oriented"? What does that mean?
- To what extent do Borg and Kubernetes look like distributed operating systems?