COMP 3000 Essay 2 2010 Question 9
Go to discussion for group members confirmation, general talk and paper discussions.
Paper
Authors:
- Muli Ben-Yehuda +
- Michael D. Day ++
- Zvi Dubitzky +
- Michael Factor +
- Nadav Har’El +
- Abel Gordon +
- Anthony Liguori ++
- Orit Wasserman +
- Ben-Ami Yassour +
Research labs:
+ IBM Research – Haifa
++ IBM Linux Technology Center
Website: http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf
Video presentation: http://www.usenix.org/multimedia/osdi10ben-yehuda [Note: username and password are required for entry]
Background Concepts
Before we delve into the details of our research paper, it is essential that we provide some background on the concepts discussed by the authors.
Virtualization
In essence, virtualization is creating an emulation of the underlying hardware for a guest operating system, program or process to operate on. [1] Usually referred to as a virtual machine, this emulation typically consists of a guest hypervisor and a virtualized environment, giving the guest operating system the illusion that it is running on the bare hardware. In reality, the virtual machine runs as an application on the host OS.
The term virtualization has become rather broad and is now associated with a number of areas where the technology is used, such as data virtualization, storage virtualization, mobile virtualization and network virtualization. For the purposes of our assigned paper, we shall focus on hardware virtualization within the context of operating systems.
Hypervisor
Also referred to as a VMM (virtual machine monitor), the hypervisor is a software module that exists one level above the supervisor and runs directly on the bare hardware to monitor the execution and behaviour of the guest virtual machines. The main task of the hypervisor is to provide an emulation of the underlying hardware (CPU, memory, I/O, drivers, etc.) to the guest virtual machines and to take care of the issues that may arise from the interaction of those guests with one another and with the host hardware and operating system. It also controls host resources.
Nested virtualization
The concept of recursively running one or more virtual machines inside one another. For instance, the host hypervisor (L0) runs a VM called L1. In turn, L1 runs another VM, L2; L2 then runs L3, and so on.
Para-virtualization
A virtualization model that requires the guest OS kernel to be modified in order to have some direct access to the host hardware. In contrast to the full virtualization discussed at the beginning of the article, para-virtualization does not simulate the entire hardware. Instead, it relies on a software interface that must be implemented in the guest so that it can obtain some privileged hardware access via special calls named hypercalls. The advantage is that there are fewer environment switches and less interaction between the guest and host hypervisors, and thus more efficiency. However, portability is an obvious issue, since a para-virtualized system is compatible with only one hypervisor. Another thing to note is that some operating systems, such as Windows, do not allow para-virtualization.
Models of virtualization
Multiple-level architecture
Each hypervisor handles the traps of the hypervisor running directly on top of it. For instance, suppose L0 (the host hypervisor) runs L1. If L1 attempts to run L2, the trap handling and the work needed to let L1 instantiate a new VM are handled by L0. More generally, if L2 attempts to create its own VM, L1 handles the resulting traps.
Single-level architecture
This model is tied to the concept of "trap and emulate", where every hypervisor emulates the underlying hardware (the VMX extensions in the paper's implementation) and presents a fake ground for the hypervisor running on top of it (the guest hypervisor) to operate on, letting it think that it is running on the actual hardware. When a guest hypervisor attempts an operation that needs hardware-level privileges, it triggers a fault or trap. The host hypervisor catches the trap and inspects it to see whether it is a legitimate request. If it is, the host carries out the operation on the guest's behalf, again leaving the guest with the impression that it is running on the bare-metal hardware. In this model, every trap goes to the host hypervisor first, and the host hypervisor then forwards it to the level responsible for it. For instance, suppose L0 runs L1 and L1 attempts to run L2. The command to run L2 traps down to L0, which launches L2 on L1's behalf. When L2 later traps, the trap arrives at L0, and L0 forwards it to L1. This is the model we are interested in because it is the one x86 machines follow. See figure 1 in the paper for an illustration.
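The difference between the two models can be sketched as a toy routing function. This is only an illustration under simplifying assumptions: levels are plain integers, the function name `trap_path` and the model labels `"multi"` and `"single"` are hypothetical, and nothing here reflects the paper's actual code.

```python
def trap_path(source_level: int, model: str) -> list[str]:
    """Return the hypervisors that see a trap raised by guest L<source_level>.

    Toy model only: levels are integers and "handling" is just a label.
    """
    if source_level < 1:
        raise ValueError("only guests at L1 or above trap to a hypervisor")
    parent = f"L{source_level - 1}"  # the hypervisor that launched this guest
    if model == "multi":
        # Multiple-level support: the trap lands directly on the parent.
        return [parent]
    if model == "single":
        # Single-level support: the trap is always delivered to L0, which
        # forwards it to the parent when the parent is not L0 itself.
        return ["L0"] if source_level == 1 else ["L0", parent]
    raise ValueError(f"unknown model {model!r}")


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, trap_path(level, "multi"), trap_path(level, "single"))
```

In the single-level case every path starts at L0, which is why the host hypervisor ends up involved in every trap regardless of nesting depth.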
Trap and emulate model
A virtualization model based on the idea that when a guest hypervisor attempts to execute a privileged instruction or access privileged hardware state, it triggers a trap or fault that is caught by the host hypervisor. The host hypervisor then determines whether the instruction should be allowed to execute and, based on that decision, provides the guest hypervisor with an emulation of the requested outcome. The x86 systems discussed in the Turtles Project research paper follow this model.
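A minimal sketch of the trap-and-emulate loop may make this concrete. Everything in it is an assumption made for illustration: the instruction names (`add`, `set_page_table`, `disable_interrupts`), the `Trap` exception and the `HostHypervisor` class are hypothetical and do not correspond to real x86 instructions or to the paper's implementation.

```python
class Trap(Exception):
    """Raised when deprivileged guest code issues a privileged instruction."""

    def __init__(self, op, arg):
        super().__init__(op)
        self.op, self.arg = op, arg


# Hypothetical privileged instructions for this toy machine.
PRIVILEGED_OPS = {"set_page_table", "disable_interrupts"}


def guest_execute(op, arg, regs):
    """Run one guest instruction; privileged ones trap instead of executing."""
    if op in PRIVILEGED_OPS:
        raise Trap(op, arg)
    if op == "add":
        regs["acc"] += arg  # unprivileged work runs without host involvement
    else:
        raise ValueError(f"unknown instruction {op!r}")


class HostHypervisor:
    def __init__(self):
        # State the guest believes is real hardware state.
        self.virtual_state = {"page_table": None, "interrupts_enabled": True}
        self.traps_handled = 0

    def emulate(self, trap):
        """Apply the effect of a trapped instruction to the virtual state."""
        self.traps_handled += 1
        if trap.op == "set_page_table":
            self.virtual_state["page_table"] = trap.arg
        elif trap.op == "disable_interrupts":
            self.virtual_state["interrupts_enabled"] = False


def run(program, host):
    """Execute a guest program, handing every trap to the host hypervisor."""
    regs = {"acc": 0}
    for op, arg in program:
        try:
            guest_execute(op, arg, regs)
        except Trap as trap:
            host.emulate(trap)
    return regs


if __name__ == "__main__":
    host = HostHypervisor()
    program = [("add", 2), ("set_page_table", 0x1000),
               ("add", 3), ("disable_interrupts", None)]
    print(run(program, host), host.virtual_state, host.traps_handled)
```

The guest never touches real privileged state: the host only updates its own record of what the guest asked for, which is the "emulation of the requested outcome" described above.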
The uses of nested virtualization
Compatibility
A system could provide the user with a compatibility mode for other operating systems or applications. An example of this is the Windows XP mode available in Windows 7, where Windows 7 runs Windows XP as a virtual machine.
Cloud computing
A cloud provider, more formally referred to as an Infrastructure-as-a-Service (IaaS) provider, could use nested virtualization to let customers host their own preferred, user-controlled hypervisors and run their virtual machines on the provider's hardware. Both sides benefit: the provider can attract customers, and the customer is free to implement its system on the host hardware without worrying about compatibility issues.
The most well-known example of an IaaS provider is Amazon Web Services (AWS). AWS presents a virtualized platform on which other services and web sites can host their APIs and databases on Amazon's hardware.
Security
We can also use nested virtualization for security purposes. One common example is virtual honeypots. A honeypot is basically a hollow program or network that appears to be functioning to outside users but, in reality, is only there as a security tool to watch or trap attackers. By using nested virtualization, we can create a honeypot of our system as virtual machines and see how the virtual system is attacked and which features are exploited. We can take advantage of the fact that these virtual honeypots can easily be controlled, manipulated, destroyed or even restored.
Migration/Transfer of VMs
Nested virtualization can also be used in the live migration or transfer of virtual machines for upgrades or disaster recovery. Consider a scenario where a number of virtual machines must be moved to a new hardware server for an upgrade. Instead of moving each VM separately, we can nest those virtual machines and their hypervisors to create one nested entity that is easier to deal with and more manageable. In the last couple of years, virtualization packages such as VMware and VirtualBox have adopted this notion of live migration and developed their own embedded migration/transfer agents.
Testing
Using virtual machines is convenient for testing, evaluation and benchmarking purposes. Since a virtual machine is essentially a file on the host operating system, it can easily be removed, recreated or even restored if it is corrupted or damaged, because we can create a snapshot of the running virtual machine.
Protection rings
Research problem
Rough version. Let me know of any comments/improvements that can be made on the talk page--Mbingham 19:51, 30 November 2010 (UTC)
Nested virtualization has been studied since the mid 1970s (see paper citations 21, 22 and 36). Early research in the area assumes that there is hardware support for nested virtualization. Actual implementations of nested virtualization, such as the z/VM hypervisor in the early 1990s, also required architectural support. Other solutions assume that the hypervisors and operating systems being virtualized have been modified to be compatible with nested virtualization. There have also recently been software-based solutions (see citation 12); however, these solutions suffer from significant performance problems.
The main barrier to having nested virtualization without architectural support is that, as the number of virtualization levels increases, the number of control switches between different levels of hypervisors increases. A trap in a highly nested virtual machine first goes to the bottom-level hypervisor, which can send it up to the second-level hypervisor, which can in turn send it up (or back down), until in the worst case it reaches the hypervisor that is one level below the virtual machine itself. The trap can be bounced between different levels of hypervisor, which results in one trap multiplying into many traps.
Generally, solutions that require architectural support and specialized software for the guest machines are not practically useful because this support does not always exist, for example on x86 processors. Solutions that do not require this support suffer from significant performance costs because of how the number of traps expands as nesting depth increases. This paper presents a technique to reconcile the lack of hardware support on available hardware with efficiency. It solves the problem of a single nested trap expanding into many more traps, which allows efficient virtualization without architectural support.
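A rough way to see why the trap count grows with depth is the following toy recurrence. It assumes, purely for illustration, that handling one trap from a guest at level n costs one trap into L0 plus `k` privileged operations by the hypervisor at level n-1, each of which traps in turn. The parameter `k` and the function `total_traps` are hypothetical and are not taken from the paper's measurements.

```python
def total_traps(level: int, k: int) -> int:
    """Count traps caused by one trap from a guest at the given level.

    Toy recurrence: T(1) = 1 and T(n) = 1 + k * T(n - 1), where k is the
    assumed number of trapping operations a hypervisor performs per handled trap.
    """
    if level < 1 or k < 0:
        raise ValueError("level must be >= 1 and k must be >= 0")
    traps = 1  # T(1): a trap from L1 is handled by L0 directly
    for _ in range(level - 1):
        traps = 1 + k * traps  # one trap to L0 plus k nested traps
    return traps


if __name__ == "__main__":
    for level in (1, 2, 3, 4):
        print(level, total_traps(level, k=10))
```

Under this assumption the count grows geometrically with nesting depth, which is the expansion that makes naive nesting on single-level hardware so expensive.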
Contribution
What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)
The non-stop evolution of computers encourages intricate designs that are virtualized and work well with cloud computing. The paper contributes to this trend by allowing consumers and users to run machines with their own choice of hypervisor/OS combination, which provides grounds for security and compatibility. The sophisticated abstractions presented in the paper, such as shadow paging and the isolation of a single OS's resources, enable programmers to develop further ideas on top of this infrastructure. For example, the paper Accountable Virtual Machines wraps programs in a VM with a particular state, which could be placed on a separate hypervisor for ideal isolation.
Critique
.. to be continued ..
The good
The bad
The style of paper
The paper presents an elaborate and very specific description of the concept of nested virtualization, and it does a good job of conveying the technical details. Depending on the reader's background knowledge, it can appear very complex; personally, it required quite some research before I could fully delve into the theory of the design. For instance, section 4.1.2, "Impact of Multi-dimensional Paging", attempts to illustrate the technique with an example that uses terms such as EPT and L1. All in all, the provided video presentation greatly deepened my understanding of nested hypervisors.
Conclusion
Bottom line, the research shown in the paper is the first to achieve efficient x86 nested virtualization without altering the hardware, relying on software-only techniques and mechanisms. The authors also won the Jay Lepreau Best Paper Award.
References
[1] Tanenbaum, Andrew (2007). Modern Operating Systems (3rd edition), page 569.