COMP 3000 Essay 2 2010 Question 11: Difference between revisions
No edit summary |
mNo edit summary |
||
Line 54: | Line 54: | ||
=References= | =References= | ||
1. | 1. Virtualize Everything But Time by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf | ||
2. Modern Operating System 3rd Edition, by Andrew S. Tanenbaum, published by Pearson. | 2. Timekeeping in Virtual Machines, Information Guide from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf | ||
3. Modern Operating System 3rd Edition, by Andrew S. Tanenbaum, published by Pearson. |
Revision as of 15:03, 1 December 2010
Virtualize Everything But Time
Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [1]
Background Concepts
To better understand this paper, it is very important to have a good understanding of the general concepts breached in it. For example, we all know what clocks are in our day-to-day life but what are they in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, the advantages and disadvantages of the different available counters, synchronization algorithms and explains what is a para-virtualized system.
Timekeeping
Since thousands of years, men have tried to find better ways to keep track of time. From sundials to atomic clocks, they were all made for the specific purpose of measuring the passage of time. This is not so different in computer operating systems. It is typically done in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up an hardware device, generally a CPU, to interrupt at a certain rate. So each time one of those interrupts are called(a tick), the operating system will keep track of it in a counter. That will tell the system how much time has passed. In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead starting its own counter when the system is booted. The OS just need to read the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts however, its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift and can cause inaccuracy. I will explains those drifts later. But both of these are just counters. They don’t know what is the actual real-time. To remedy that, either a computer gets its time from a battery-backed real-time clock or it queries a network time server(NTP) to get the current time. The computer can also use software in the form of a daemon that will run periodically to make adjustments to the time.
Clocks
Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:
Diagram1. Timer Abstraction
This diagram nicely represent how tick counting works[2]. The oscillator runs at a predetermine frequency. The operating system might have to mesure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. As I said earlier, not all hardware timer works exactly like that. Some actually counts up, doesn't use interrupts or doesn't have an initial counter value but they still follow the same principle.
Timers
- PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
- CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
- Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
- ACPI (I will fill these in later, Nov. 29th Update) --Spanke
- TSC
- HPET
Guest Timekeeping
Guest timekeeping is done using the same general methods as any computer timekeeping, using either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate with the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, or more simply called drifting.
Sources of Drift
When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when it is resumed from a suspended state, or restored from a snapshot – so it is easy to think that, since it starts off correctly the guest's clock will continue to be correct. That is, of course, incorrect. The first source of drift is simply due to the drift a host incurs in its own timekeeping. A clock is almost never entirely accurate, having a slight error due to the time used to communicate with the counter, even on the host system, and because the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.
Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that the it is treated like a process by the host. In and of itself this doesn't seem like a problem, but because of it it can be denied the CPU time required, or allocated less memory than needed. With restricted CPU time, it's easy for the requested ticks or requested read of a counter to pile up and create a backlog of requests, or simply receive the requested data late enough to throw its clock off. With memory, if the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.
Impact of Drift
The impact of drift essentially boils down to round-off errors and lost ticks. The practical impact of drift, however, is quite apparent in any automated system. For a relatable real-world example, though not in a virtual environment, in a factory's assembly line, the machinery is finely tuned to do its own specific part at certain intervals, and it generally does so with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt. In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.
Compensation Strategies
There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.
If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.
Research problem
Contribution
Critique
References
1. Virtualize Everything But Time by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf
2. Timekeeping in Virtual Machines, Information Guide from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
3. Modern Operating System 3rd Edition, by Andrew S. Tanenbaum, published by Pearson.