COMP 3000 Essay 2 2010 Question 11

From Soma-notes

Virtualize Everything But Time

Article written by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch. They are working for the Center for Ultra-Broadband Information Networks (CUBIN) Department of Electrical & Electronic Engineering at the University of Melbourne in Australia. Here is the link to the article: [1]


Background Concepts

To better understand this paper, it is very important to have a good understanding of the general concepts breached in it. For example, we all know what clocks are in our day-to-day life but what are they in the context of computing? In this section, we will describe concepts like timekeeping, hardware/software clocks, the advantages and disadvantages of the different available counters, synchronization algorithms and explains what is a para-virtualized system.

Timekeeping

Since thousands of years, men have tried to find better ways to keep track of time. From sundials to atomic clocks, they were all made for the specific purpose of measuring the passage of time. This is not so different in computer operating systems. It is typically done in one of two ways: tick counting and tickless timekeeping[2]. Tick counting is when the operating system sets up an hardware device, generally a CPU, to interrupt at a certain rate. So each time one of those interrupts are called(a tick), the operating system will keep track of it in a counter. That will tell the system how much time has passed. In tickless timekeeping, instead of the OS keeping track of time through interrupts, a hardware device is used instead starting its own counter when the system is booted. The OS just need to read the counter from it when needed. Tickless timekeeping seems to be the better way to keep track of time because it doesn’t hog the CPU with hardware interrupts however, its performance is very dependent on the type of hardware used. Another disadvantages is that they tend to drift and can cause inaccuracy. I will explains those drifts later. But both of these are just counters. They don’t know what is the actual real-time. To remedy that, either a computer gets its time from a battery-backed real-time clock or it queries a network time server(NTP) to get the current time. The computer can also use software in the form of a daemon that will run periodically to make adjustments to the time.

Clocks

Computer “clocks” or “timers” can be hardware based, software based or they can even be an hybrid. The most commonly found timer is the hardware timer. All of the hardware timers can be generally described by this diagram where some have either more or less features:

Diagram1. Timer Abstraction

This diagram nicely represent how tick counting works. The oscillator runs at a predetermine frequency. The operating system might have to mesure it when the system boots. The counter starts with a predetermined value which can be set by software. For every cycle of the oscillator, the counter counts down one unit. When it reaches zero, its generates an output signal that might interrupt the CPU. That same interrupt will then allow the counter’s initial value to be reloaded into the counter and the process begins again. As I said earlier, not all hardware timer works exactly like that. Some actually counts up, doesn't use interrupts or doesn't have an initial counter value but they still follow the same principle.

Timers

  1. PIT is useful for generating interrupts at regular intervals through its three channels. Channel 0 is bound to IRQ0 which interrupts the CPU at regular intervals. Channel 1 is specific to each system and Channel 2 is connected to the speaker system. As such, we only need to concern ourselves with Channel 0. [3]
  2. CMOS RTC, also known as a CMOS battery, allows the CMOS chip to remain powered to keep track of things like time even while the physical PC unit has no source of power. If there is no CMOS battery on the motherboard, the computer would reset to its default time each restart. The battery itself can die, as expected, if the computer is powered off and not used for a long period of time. This can cause issues with the main OS as well as the VM. [4]
  3. Local APIC handles all external interrupts for the processor in the system. It can also accept and generate inter-processor interrupts between Local APICs. [5]
  4. ACPI (I will fill these in later, Nov. 29th Update) --Spanke
  5. TSC
  6. HPET

Guest Timekeeping

Guest timekeeping is done using the same general methods as any computer timekeeping, using either tick counting or tickless systems. Where the two begin to differ, however, is that a host operating system is able to communicate directly with the physical hardware, while the guest operating system is unable to do so, having to communicate with the host system that it wants to communicate with the hardware. Having to do this is the greatest source of the guest operating system's clock losing accuracy, or more simply called drifting.

Sources of Drift

When a guest operating system is started, its clock simply synchronizes with the host's – some virtual machines such as VMware also do this when it is resumed from a suspended state, or restored from a snapshot – so it is easy to think that, since it starts off correctly the guest's clock will continue to be correct. That is, of course, incorrect. The first source of drift is simply due to the drift a host incurs in its own timekeeping. A clock is almost never entirely accurate, having a slight error due to the time used to communicate with the counter, even on the host system, and because the guest communicates with the host in order to keep track of its time, an error in the host's time is not only passed on to the guest, but because the host is trying to correct its own time the guest's request for a count is given slightly less priority, making it yet again lose accuracy. The larger the drift in the host, the larger the drift in the guest, as the host's drift simply compounds the issue.

Aside from the host's own drift, the other cause of drift in the virtual environment is the fact that the it is treated like a process by the host. In and of itself this doesn't seem like a problem, but because of it it can be denied the CPU time required, or allocated less memory than needed. With restricted CPU time, it's easy for the requested ticks or requested read of a counter to pile up and create a backlog of requests, or simply receive the requested data late enough to throw its clock off. With memory, if the virtual environment does not have enough allocated to it by the host it can run into the problem of swapping out pages that are needed soon. Swapping the pages back in will momentarily bring the entire virtual environment to a halt, so ticks are missed and the clock falls behind.

Impact of Drift

The impact of drift essentially boils down to round-off errors and lost ticks. The practical impact of drift, however, is quite apparent in any automated system. For a relatable real-world example, though not in a virtual environment, in a factory's assembly line, the machinery is finely tuned to do its own specific part at certain intervals, and it generally does so with impressive efficiency. If the clock in the system were to drift, however, a specific machine may move too soon or too late, bringing the line to a potentially catastrophic halt. In a virtual environment, drift is a bit more subtle, as one result of it could be skewed process scheduling – some schedulers give a certain amount of time to a process before moving on, but if the guest's time has drifted substantially, when it tries to correct its time it could give more or less time to the processes in the scheduler.

Compensation Strategies

There are a number of compensation strategies for dealing with drift, depending on the cause of it. If the problem is due to CPU management issues, then the host can give more CPU time to the virtual machine, or it can lower the timer interrupt rate – or simply use a tickless counter. If it is due to a memory management issue, allocating more memory to the virtual environment should prevent the system from needing to swap out page files so often.

If the issue is from neither of those, but simply due to the inevitable lag when the guest communicates with the hardware via the host, then there are other methods to correct the drift. Most systems natively have algorithms built in to correct the time if it gets too far ahead or behind real time, though they are not without their own faults; if the time is set ahead when catching up, the backlog of ticks it has built up may not be cleared, so it could potentially set itself ahead multiple times until the backlog is dealt with. Tools built into the virtual machine itself can also deal with drift to an extent, as VMware Tools does. This kind of tool checks to see if the clock's error is within a certain margin. If it exceeds the margin, then the backlog is set to zero – to prevent the issue mentioned with the native algorithms – and resynchronizes with the host clock before the guest goes back to keeping track of time as it normally would.

Research problem

Today, the use of the Network Time Protocol and of daemons like ntpd is the dominant solution for accurate timekeeping. In optimal conditions, the ntpd can be very good but these situations rarely happen. Network congestion, disconnections, lower quality networking hardware and unsuspected system events can create offsets errors in the order of 10‘s or even 100 milliseconds(ms). [6] For demanding applications, this is neither robust or reliable. One way to enhance the performance of ntpd would be to poll from the NTP server more often as this would reduced the offset error but unfortunately, this would increase the network traffic which could cause network congestions which would raise the offset error. So this won’t work.

Another problem with current system software clocks using NTP(like ntpd), is that they provide only an absolute clock.[7] So for applications that deals with network managements and measurements, this is unsuitable. Why? Because NTP focus on offset and not on hardware clock oscillator rate. For example, when calculating delay variations, the offset error doesn’t change anything to the calculations but the clocks’ oscillator rate variation does affect it. So having a more accurate timestamp would make those calculation more precise. Which mean we would need another system software clock.

In virtualization(in this case Xen), when migrating a running system from one system to another can cause issues and this is again caused by the ntpd daemon. By default, each guest OS runs its own instance of the ntpd daemon. So the synchronization algorithm keeps track of the reference wallclock time, rate-of-drift and current clock error, which are defined by the hardware clock on the system. So when migrating the virtualized OS to another system, the ntpd state is saved and when it is enabled again on the new system, thats where the problems starts. Because no two hardware clocks drifts the same way or have the exact same wallclock time, all the information traced by the daemon are all of a sudden inaccurate. This could prove disastrous to the system. This could go from a slowly recoverable error to one where ntpd might never recover, making the virtualized OS unstable.

Contribution

Critique

References

1. "Virtualize Everything But Time" by Timothy Broomhead, Laurence Cremean, Julien Ridoux and Darrel Veitch, 2010. http://www.usenix.org/events/osdi10/tech/full_papers/Broomhead.pdf

2. "Timekeeping in Virtual Machines, Information Guide" from VMWare. http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

3. "Bran's Kernel Development Tutorial" from Bona Fide OS Developer website. http://www.osdever.net/bkerndev/Docs/pit.htm

4.

7. "PC Based Precision Timing Without GPS" by Attila Pa ́sztor and Darryl Veitch, 2002. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/tscclock_final.pdf

8. "Robust Synchronization of Absolute and Difference Clocks over Networks" by Darryl Veitch, Julien Ridoux and Satish Babu Korada, 2009. http://www.cubinlab.ee.unimelb.edu.au/~darryl/Publications/synch_ToN.pdf