Talk:COMP 3000 Essay 2 2010 Question 9

From Soma-notes

Group members

  • Munther Hussain
  • Jonathon Slonosky
  • Michael Bingham
  • Chris Sullivan
  • Pawel Raubic

Group work

  • Background concepts: Munther Hussain
  • Research problem: Michael Bingham
  • Contribution:
  • Critique:


General discussion

Hey there, this is Munther. The prof said that we should be contacting each other to see who's still on board for the course. So please, if you read this, add your name to the list of members above. You can find my contact info on my profile page by clicking my signature. We shall talk about the details and how we will approach this in the next few days --Hesperus 16:41, 12 November 2010 (UTC)


Checked in -- JSlonosky


Pawel has already contacted us, so he is still in for the course; that makes 3 of us. The other three members, please drop in and add your name. We need to confirm the members today by 1:00 pm. --Hesperus 12:18, 15 November 2010 (UTC)


Checked in --Mbingham 15:08, 15 November 2010 (UTC)


Checked in --Smcilroy 17:03, 15 November 2010 (UTC)


To the person above me (Smcilroy): I can see that you're assigned to group 7 and not this one. So did the prof move you to this group or something ? We haven't confirmed or emailed the prof yet, I will wait until 1:00 pm. --Hesperus 17:22, 15 November 2010 (UTC)


Alright, so I just emailed the prof the list of members that have checked in so far (the names listed above plus Pawel Raubic), Smcilroy: I still don't know whether you're in this group or not, though I don't see your name listed in the group assignments on the course webpage. To the other members: if you're still interested in doing the course, please drop in here and add your name or even email me, you can find my contact info in my profile page(just click my signature).

Personally speaking, I find the topic of this article (The Turtle Project) to be quite interesting and approachable, in fact we've already been playing with VirtualBox and VMWare and such things, so we should be familiar with some of the concepts the article approaches like nested-virtualization, hypervisors, supervisors, etc, things that we even covered in class and we can in fact test on our machines. I've already started reading the article, hopefully tonight we'll start posting some basic ideas or concepts and talk about the article in general. I will be in tomorrow's tutorial session in the 4th floor in case some of you guys want to get to know one another. --Hesperus 18:43, 15 November 2010 (UTC)


Yeah, it looks pretty good to me. Unfortunately, I am attending Ozzy Osbourne on the 25th, so I'd like it if we could get ourselves organized early so I can get my part done and not letting it fall on you guys. Not that I would let that happen --JSlonosky 02:51, 16 November 2010 (UTC)


Why waste your money on that old man ? I'd love to see Halford though, I'm sure he'll do some classic Priest material, haven't checked the new record yet, but the cover looks awful, definitely the worst and most ridiculous cover of the year. Anyways, enough music talk. I think we should get it done at least on 24th, we should leave the last day to do the editing and stuff. I removed Smcilroy from the members list, I think he checked in here by mistake because I can see him in group 7. So far, we're 5, still missing one member. --Hesperus 05:36, 16 November 2010 (UTC)


Yeah that would be pretty sweet. I figured I might as well see him when I can; Since he is going to be dead soon. How is he not already? Alright well, the other member should show up soon, or I'd guess that we are a group of 5. --JSlonosky 16:37, 16 November 2010 (UTC)


Hey dudes. I think we need to get going here.. the paper is due in 4 days. I just did the paper intro section (provided the title, authors, research labs, links, etc.). I have read the paper twice so far and will be spending the whole day working on the background concepts and the research problem sections.

I'm still not sure on how we should divide the work and sections among the members, especially regarding the research contribution and critique, I mean those sections should not be based or written from the perspective of one person, we all need to work and discuss those paper concepts together.

If anyone wants to add something, then please add but don't edit or alter the already existing content. Lets try to get as many thoughts/ideas as possible and then we will edit and filter the redundancy later. And lets make sure that we add summary comments to our edits to make it easier to keep track of everything.

Also, we're still missing one member: Shawn Hansen. Its weird because on last Wednesday's lab, the prof told me that he attended the lab and signed his name, so he should still be in the course. --Hesperus 18:07, 21 November 2010 (UTC)


Yeah man. We really do need to get on this. Not going to Ozzy, so I've got free time now. I am reading it again to refresh my memory of it and will put up notes on what I think we can criticize about it and such. What kind of references do you think we will need? Similar papers, etc.? If you need to get a hold of me, the best way is through email: jslonosk@connect.Carleton.ca. And if someone is still in our group but doesn't participate, too bad for him --JSlonosky 14:42, 22 November 2010 (UTC)


The related work section has everything we need as far as other papers go. Also, I was able to find other research papers that are not mentioned in the paper. I will definitely be adding those papers by tonight. For the time being, I will handle the background concepts. I added a group work section to keep track of who's doing what. I should get the background concepts done hopefully by tonight. If anyone wants to help with the other sections, that would be great; please add your name to the section you want to handle.

I added a general paper summary below just to illustrate the general idea behind each section. If anybody wants to add anything, feel free to do so. --Hesperus 18:55, 22 November 2010 (UTC)


I remember the prof mentioned the most important part of the paper is the Critique so we gotta focus on that altogether not just one person for sure.--Praubic 19:22, 22 November 2010 (UTC)


Yeah absolutely, I agree. But first, let's pin down the crucial points, and then we can discuss them collectively. If anyone comes across something he thinks is good or bad, he can add it below to the good/bad points. Maybe the group work idea is bad, but I just thought that if each member focuses on a specific part in the beginning, we can have a better overall idea of what the paper is about. --Hesperus 19:42, 22 November 2010 (UTC)


Ok, another thing I figured is that the paper doesn't directly explain why nested virtualization is necessary. I posted a link in the references and I'll try to research more into the purpose of nested virtualization.--Praubic 19:45, 22 November 2010 (UTC)


Actually the paper does talk about that. Look at the first two paragraphs in the introduction section of the paper on page 1. But you're right, they don't really elaborate, I think its because its not the purpose or the aim of the paper in the first place. --Hesperus 20:31, 22 November 2010 (UTC)


The stuff that Michael provided are excellent. That was actually what I was planning on doing. I will start by defining virtualization, hypervisors, computer ring security, the need and uses of nested virtualization, the models, etc. --Hesperus 22:14, 22 November 2010 (UTC)


So here's my question: who is doing what in the group work, and where should I focus my attention to do my part? - Csulliva


I have posted few things regarding the background concepts on the main page. I will go back and edit it today and talk about other things like: nested virtualization, the need and advantages of NV, the models, the trap and emulate model of x86 machines, computer paging which is discussed in the paper, computer ring security which again they touch on at some point in the paper. I can easily move some of the things I wrote in the theory section to the main page, but I want to consult the prof first on some of those things.

One thing that I'm still unsure of is how far should we go here ? should we provide background on the hardware architecture used by the authors like the x86 family and the VMX chips, or maybe some of the concepts discussed later on in the testing such as optimization, emulation, para-virtualization ?

I will speak and consult the prof today after our lecture. If other members want to help, you guys can start with the related work and see how the content of the paper compares to previous or even current research papers. --Hesperus 08:08, 23 November 2010 (UTC)


In response to what Michael mentioned above in the background section: we should definitely talk about that, from what I understood, they apply the same model (the trap and emulate) but they provide optimizations and ways to increase the trap calls efficiency between the nested environments, so thats definitely a contribution, but its more of a performance optimization kind of contribution I guess, which is why I mentioned the optimizations in the contribution section below. --Hesperus 08:08, 23 November 2010 (UTC)


Ok, so for those who didn't attend today's lecture, the prof was nice enough to give us an extension for the paper; the due date is now Dec 2nd. And that's really good, given that some of those concepts require time to sort of formulate. I also asked the prof about the approach that we should follow in terms of presenting the material, and he mentioned that we need to provide enough information in each section for a fellow student to understand what the paper is about without having to actually read the paper or go through it in detail. He also mentioned the need to distill some of the details: if the paper spends a whole page explaining multi-dimensional paging, we should probably explain that in 2 small paragraphs or something.

Also, we should always cite resources. If the resource is a book, we should cite the page number as well. --Hesperus 15:16, 23 November 2010 (UTC)


Yeah I am really thankful he left us with another week to do it. I am sure we all have at least 3 projects due soon, other than this Essay. I'll type up the stuff that I had highlighted for Tuesday as a break tomorrow. I was going to do it yesterday but he gave us an extension, so I slacked off a bit. I also forgot :/ --JSlonosky 23:43, 24 November 2010 (UTC)


Hey dudes. I have posted the first part of the background concepts here in the discussion and on the main page as well. This is just a rough version, so I will be constantly expanding it and adding resources later on today. I have also created and added a diagram for illustration; as far as I know, we should be allowed to do this. If anyone has any suggestions about what I have posted or any counter arguments, please discuss. I will also be moving some of the stuff I wrote here (the theory section) to the main page as well.

Regarding the critique, I guess the excessive amount of exits can somehow be seen as a scalability constraint, maybe making the overall design somehow too complex or difficult to get a hold of, I'm not sure about this, but just guessing from a general programming point of view. I will email the prof today, maybe he can give us some hints for what can be considered a weakness or a bad spot if you will in the paper.

Also, we're still missing the sixth member of the group: Shawn Hansen. --Hesperus 06:57, 29 November 2010 (UTC)


Hey guys. I can start working on the research problem part of the essay. I'll put it up here when I have a rough version, then move it to the actual article. As for the critique section, how about we put a section on the talk page here where people can add what they thought worked/didn't work with some explanation/references, and then we can get someone/some people to combine it and put it in the essay? --Mbingham 18:13, 29 November 2010 (UTC)


Yea really, great work on the Background. It's looking slick. I added some initial edit in the Contribution and Critique but I agree lets open a thread here and All collaborate. --Praubic 18:24, 30 November 2010 (UTC)


Nice man. Sorry I haven't updated with anything that I have done yet, but I'll have it up later today or tomorrow. I got both an Essay and game dev project done for tomorrow, so after 1 I will be free to work on this until it is time for 3004--JSlonosky 13:41, 30 November 2010 (UTC)


I put up an initial version of the research problem section in the article. Let me know what you guys think. --Mbingham 19:53, 30 November 2010 (UTC)


Hey guys. Since I'm working on the backgrounds concepts and Michael is handling the research problem. The other members should handle the contribution part. I think everything we need for the contribution section is in section 3 of the article (3.1, 3.2, 3.3, 3.4, 3.5). You can also make use of the things we posted here. Just to be on the safe side, we need to get this done by tomorrow's night. I'm working on a couple of definitions as we speak and will hopefully be done by tomorrow's morning.

PS: We should leave the critique to the end, there should not be a lot of writing for that part and we must all contribute.

--Hesperus 01:45, 1 December 2010 (UTC)


Just posted other bits that were missing in the background concepts section, like the security uses, models of virtualization and para-virtualization. They're just a rough version, however. I will edit them in the next few hours. I just need to write something for protection rings and that would be it, I guess.

I can help with the other sections for the rest of the day, I will try to post some summaries for performance and implementation or even the related work. --Hesperus 07:26, 1 December 2010 (UTC)


Guys, we need to get moving here. The contribution section still needs a lot. We need to talk about their innovations and the things they did there: CPU virtualization, memory virtualization, I/O virtualization and the micro-optimizations.

I will be posting something regarding this in the next few hours. --Hesperus 22:53, 1 December 2010 (UTC)


I have looked over the paper again and I am wondering about some things. How are we to critique it? By their methods, or by the paper itself? I find that in the organization of the paper, they give you the links and extra information to look more in depth on such things like the VMC technology, but they almost use that as an excuse for not explaining things in the paper. The VMC(0 ->1) annotation that isn't explained. I understand what they mean, but it seems that they assume that you already know some things. --JSlonosky 03:03, 2 December 2010 (UTC)


I think most research papers follow that kind of approach, they vaguely talk about the sideline things and provide references. The VMC technology from what I understood is just a creation of an environment to link or switch between hypervisors. --Hesperus 03:26, 2 December 2010 (UTC)


Paper summary

Background Concepts and Other Stuff

Virtualization

In essence, virtualization is the creation of an emulation of the underlying hardware for a guest operating system, program or process to run on. [1] This emulation, usually referred to as a virtual machine, may include a guest hypervisor and a virtualized environment, and it gives the guest the illusion that it is running directly on the real hardware. In other words, we can view the virtual machine as an application running on the host OS.

The term virtualization has become rather broad, associated with a number of areas where this technology is used like data virtualization, storage virtualization, mobile virtualization and network virtualization. For the purposes and context of our assigned paper, we shall focus our attention on hardware virtualization within operating systems environments.

Hypervisor

A hypervisor, also referred to as a VMM (virtual machine monitor), is a software module that exists one privilege level above the supervisor and runs directly on the bare hardware to monitor the execution and behaviour of the guest virtual machines. The main task of the hypervisor is to provide an emulation of the underlying hardware (CPU, memory, I/O, drivers, etc.) to the guest virtual machines and to take care of the issues that may arise from the interaction of those guest virtual machines with one another and with the host hardware and operating system. It also controls the host's resources.

Nested virtualization

Nested virtualization is the concept of recursively running one or more virtual machines inside one another. For instance, the host hypervisor (L0) runs a guest hypervisor L1 as a virtual machine; in turn, L1 runs another VM, L2; L2 may then run L3, and so on.

Para-virtualization

[Coming....]


Trap and emulate model

A virtualization model based on the idea that when a guest hypervisor attempts to execute a privileged instruction or access privileged hardware state, it triggers a trap or a fault which is caught by the host hypervisor. The host hypervisor then determines whether this instruction should be allowed to execute or not. Based on that, the host hypervisor provides an emulation of the requested outcome to the guest hypervisor. The x86 systems discussed in the Turtles Project research paper follow this model.
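As a rough illustration of the trap and emulate idea described above, here is a minimal Python sketch. Everything in it (the class names, the instruction names such as `write_cr3`, and the `allowed` set) is made up for illustration and is not taken from the paper; it only models the control flow: the guest traps, the host decides, the host emulates.

```python
class PrivilegedTrap(Exception):
    """Raised when a guest attempts a privileged operation (hypothetical model)."""

    def __init__(self, instruction, operand):
        super().__init__(instruction)
        self.instruction = instruction
        self.operand = operand


class Guest:
    """A deprivileged guest: privileged instructions trap instead of executing."""

    # Illustrative instruction names only, not a real x86 list.
    PRIVILEGED = {"write_cr3", "halt", "out"}

    def execute(self, instruction, operand=None):
        if instruction in self.PRIVILEGED:
            raise PrivilegedTrap(instruction, operand)
        return f"guest ran {instruction} natively"


class HostHypervisor:
    """Catches traps, decides whether to allow them, and emulates the effect."""

    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.virtual_state = {}  # emulated hardware state as seen by the guest

    def run(self, guest, instruction, operand=None):
        try:
            return guest.execute(instruction, operand)
        except PrivilegedTrap as trap:
            if trap.instruction not in self.allowed:
                return f"denied {trap.instruction}"
            # Emulate: update the guest's virtual copy, not the real hardware.
            self.virtual_state[trap.instruction] = trap.operand
            return f"emulated {trap.instruction}"
```

Unprivileged instructions never reach the host in this model, which is the point of the approach: only the sensitive operations pay the cost of a trap.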

The uses of nested virtualization

Compatibility

A system could provide the user with a compatibility mode for other operating systems or applications. An example of this would be the Windows XP mode that's available in Windows 7, where Windows 7 runs Windows XP as a virtual machine.

Cloud computing

A cloud provider, more formally referred to as an Infrastructure-as-a-Service (IaaS) provider, could use nested virtualization to let customers host their own preferred, user-controlled hypervisors and run their virtual machines on the provider's hardware. This way both sides benefit: the provider can attract customers, and the customer is free to implement its system on the host hardware without worrying about compatibility issues.

The most well known example of an IaaS provider is Amazon Web Services (AWS). AWS presents a virtualized platform for other services and web sites such as Netflix to host their API and database on Amazon's hardware.

Security

[Coming...]

Migration/Transfer of VMs

Nested virtualization can also be used in live migration or transfer of virtual machines in cases of upgrade or disaster recovery. Consider a scenario where a number of virtual machines must be moved to a new hardware server for an upgrade. Instead of having to move each VM separately, we can nest those virtual machines and their hypervisors to create one nested entity that's easier to deal with and more manageable. In the last couple of years, virtualization packages such as VMWare and VirtualBox have adopted this notion of live migration and developed their own embedded migration/transfer agents.

Testing

Using virtual machines is convenient for testing, evaluation and benchmarking purposes. Since a virtual machine is essentially a file on the host operating system, if it is corrupted or damaged it can easily be removed, recreated or even restored, since we can create a snapshot of the running virtual machine.

Protection rings

[Coming....]


EDIT: Just noticed that someone has put their name down to do the background concept stuff, so Munther feel free to use this as a starting point if you like.

The above looks good. I thought I'd maybe start touching on some of the sections, so let me know what you guys think. Here's what I think would be useful to go over in the Background Concepts section:

  • Firstly, nested virtualization. Why we use nested virtualization (paper gives example of XP inside win 7). Maybe going over the trap and emulate model of nested virtualization.
  • Some of the terminology of nested virtualization. The difference between guest/host hypervisors (we're already familiar with guest/host OSs), the terminology of L0, ..., Ln with L0 being the bottom hypervisor, etc
  • x86 nested virtualization limitations. Single level architecture, guest/host mode, VMX instructions and how to emulate them. Some of this is in section 3.2 of the paper.

Again, anything else you guys think we should add would be great.

Commenting some more on the above summary, under the "main contributions" part, do you think we should count the nested VMX virtualization part as a contribution? If we have multiplexing memory and multiplexing I/O as a main contribution, it would seem to make sense to have multiplexing the CPU as well, especially within the limitations of the x86 architecture. Unless they are using someone else's technique for virtualizing these instructions.--Mbingham 21:16, 22 November 2010 (UTC)

Research problem

The paper provides a solution for nested virtualization on x86-based computers. Their approach is software-based, meaning that they do not alter the underlying architecture. This is the most interesting thing about the paper: x86 hardware does not support nested virtualization directly, yet the authors were able to achieve it in software.


The goal of nested virtualization and multiple host hypervisors comes down to efficiency. Example: Virtualization on servers has been rapidly gaining popularity. The next evolution step is to extend a single level of memory management virtualization support to handle nested virtualization, which is critical for high performance. [1]

How does the concept apply to the quickly developing cloud computing?

A cloud user manages his own virtual machines directly through a hypervisor of his choice. In addition, nested virtualization provides increased security through hypervisor-level intrusion detection.

Related work

Comparisons with other related/similar research and work:

Refer to the following website and to the related work section in the paper regarding this section: http://www.spinics.net/lists/kvm/msg43940.html

[This is a forum post by one of the authors of our assigned paper where he talks about more recent research work on virtualization, particularly in his first paragraph, he refers to some more recent research by the VMWare technical support team. He also talks about some of the research papers referred to in our assigned paper.]


Theory (Section 3.1)

Apparently, there are 2 models for applying nested virtualization:

  • Multiple-level architecture support: every hypervisor handles the hypervisors running directly on top of it. For instance, say L0 (the host hypervisor) runs L1. If L1 attempts to run L2, then the trap handling and the work needed to allow L1 to instantiate a new VM are handled by L0. More generally, if L2 attempts to create its own VM, then L1 will do the trap handling and so on.
  • Single-level architecture support: this is the model supported by x86 machines. It is tied to the concept of "trap and emulate", where every hypervisor emulates the underlying hardware (the VMX extensions in the paper's implementation) and presents a fake ground for the hypervisor running on top of it (the guest hypervisor) to operate on, letting it think that it's running on the actual hardware. The idea here is that when a guest hypervisor tries to perform an operation that needs hardware-level privileges, it triggers a fault or a trap. This trap is caught by the main host hypervisor and inspected to see whether it's a legitimate or appropriate request. If it is, the host carries out the privileged operation on the guest's behalf, again having the guest think that it's actually running on the bare-metal hardware.

In this model, everything must go back to the main host hypervisor. The host hypervisor then forwards the trap and virtualization specification to the level above that is involved or responsible. For instance, say L0 runs L1, and then L1 attempts to run L2. The command to run L2 goes down to L0, and L0 then forwards this command to L1 again. This is the model we're interested in because this is what x86 machines basically follow. Look at figure 1 in the paper for a better understanding of this.
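The forwarding behaviour in the single-level model can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions of ours: the function name `handle_trap` and the `l0_owns_event` flag are hypothetical, and the sketch always treats the trapping guest's direct parent as the responsible hypervisor unless L0 claims the event for itself.

```python
def handle_trap(source_level, log, l0_owns_event=False):
    """Model the single-level architecture: every trap lands in L0 first.

    source_level: level (>= 1) of the guest that trapped.
    log: list that receives a human-readable trace of what happened.
    l0_owns_event: hypothetical flag meaning L0 handles this event itself.
    Returns the level of the hypervisor that ends up handling the trap.
    """
    if source_level < 1:
        raise ValueError("only guests (level >= 1) can trap")

    # The hardware knows a single hypervisor, so the trap always goes to L0.
    log.append(f"L{source_level} traps to L0")

    responsible = 0 if l0_owns_event else source_level - 1
    if responsible == 0:
        log.append("L0 handles the trap itself")
    else:
        # L0 reflects the trap up to the guest hypervisor that owns the guest.
        log.append(f"L0 forwards the trap to L{responsible}")
    return responsible
```

For example, `handle_trap(2, [])` returns 1: the L2 trap goes down to L0 and is then forwarded to L1, matching the L0/L1/L2 walk-through above.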

Main contribution

The paper proposes two newly developed techniques:

  • Multi-dimensional paging (for memory virtualization)
  • Multi-level device assignment (for I/O virtualization)

Other contributions:

  • Micro-optimizations to improve performance.

Implementation

The Turtles project has four components that are crucial to its implementation.

  • Nested VMX virtualization for nested CPU virtualization
  • Multi-dimensional paging for nested MMU virtualization
  • Multi-level device assignment for nested I/O virtualization
  • Micro-Optimizations to make it go faster

How nested VMX virtualization works: L0 (the lowest hypervisor) runs L1 with VMCS0->1 (VMCS stands for virtual machine control structure). The VMCS is the fundamental data structure that a hypervisor prepares to describe a virtual machine; it is passed to the CPU when the virtual machine is run. L1 (also a hypervisor) prepares VMCS1->2 to run its own virtual machine and executes vmlaunch. The vmlaunch traps, and L0 has to handle the trap, because L1 is running as a virtual machine while L0 is the one using the architectural mode for a hypervisor. To multiplex the hardware so that L2 appears to run as a virtual machine of L1, L0 merges the two VMCSs: VMCS0->1 is merged with VMCS1->2 to produce VMCS0->2, which enables L0 to run L2 directly. L0 then launches L2. When L2 later traps, L0 either handles the trap itself or forwards it to L1, depending on whether handling it is L1's responsibility. To handle a single L2 exit, L1 needs to read and write the VMCS, disable interrupts, and so on. This would not normally be a problem, but because L1 is running in guest mode as a virtual machine, all of these operations trap, so a single high-level L2 or L3 exit causes many exits (and more exits mean lower performance). The authors addressed this by making each single exit faster and by reducing the frequency of exits with multi-dimensional paging. In the end, L1 or L0, depending on the trap, finishes handling it and resumes L2. This process repeats continuously.
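To make the VMCS merge step a bit more concrete, here is a small Python sketch. It is a toy model and not the real VMCS layout: the three-part dictionary, the field names, and the merge rule (guest state taken from VMCS1->2, host state taken from VMCS0->1, exit conditions combined so that an event traps if either hypervisor asked for it) are simplifying assumptions on our part.

```python
def merge_vmcs(vmcs_0_1, vmcs_1_2):
    """Build a toy VMCS0->2 from VMCS0->1 and VMCS1->2.

    Each VMCS is modelled as a dict with three hypothetical parts:
      "guest_state":   what the guest should look like when it runs
      "host_state":    where the CPU goes when the guest exits
      "exit_controls": set of events that must cause an exit
    """
    return {
        # L2 should look the way L1 described it.
        "guest_state": dict(vmcs_1_2["guest_state"]),
        # On real hardware L0 is the host, so exits must land in L0.
        "host_state": dict(vmcs_0_1["host_state"]),
        # Trap if either L0 or L1 wants to see the event.
        "exit_controls": set(vmcs_0_1["exit_controls"])
        | set(vmcs_1_2["exit_controls"]),
    }


# Hypothetical example values, for illustration only.
vmcs_0_1 = {
    "guest_state": {"rip": "l1_entry"},
    "host_state": {"rip": "l0_exit_handler"},
    "exit_controls": {"external_interrupt"},
}
vmcs_1_2 = {
    "guest_state": {"rip": "l2_entry"},
    "host_state": {"rip": "l1_exit_handler"},
    "exit_controls": {"cpuid"},
}
vmcs_0_2 = merge_vmcs(vmcs_0_1, vmcs_1_2)
```

With the merged structure, L0 can run L2 directly while still being the one that receives every exit, which is what lets it decide whether to handle the exit itself or forward it to L1.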

How multi-dimensional paging works: With nested virtualization of depth n = 2 there are three logical translations: from L2 virtual to L2 physical addresses, from L2 physical to L1 physical, and from L1 physical to L0 physical. That is three levels of translation, but the hardware provides only two MMU page tables: the guest page table, which maps virtual to guest physical addresses, and the EPT, which maps guest physical to host physical addresses. The three translations are therefore compressed onto the two tables, going from beginning to end in two hops instead of three. One way to do this is with shadow page tables for the virtual machine, as in shadow-on-EPT, which compresses the three logical translations into two page tables. The EPT tables rarely change, whereas the guest page tables change frequently. With multi-dimensional paging, L0 emulates EPT for L1 and uses EPT0->1 and EPT1->2 to construct EPT0->2. This results in fewer exits.
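The construction of EPT0->2 from EPT0->1 and EPT1->2 is essentially a composition of two mappings. The Python sketch below models each table as a plain dictionary from page number to page number; the function names and page numbers are hypothetical, and real EPTs are multi-level hardware structures that are filled in lazily on faults, which this sketch ignores.

```python
def compose_ept(ept_0_1, ept_1_2):
    """Build EPT0->2, mapping L2 physical pages straight to L0 physical pages.

    ept_1_2: L2 physical page -> L1 physical page (maintained by L1)
    ept_0_1: L1 physical page -> L0 physical page (maintained by L0)
    Pages that L0 has not mapped for L1 are left out of the result.
    """
    ept_0_2 = {}
    for l2_page, l1_page in ept_1_2.items():
        if l1_page in ept_0_1:
            ept_0_2[l2_page] = ept_0_1[l1_page]
    return ept_0_2


def translate(l2_virtual_page, l2_page_table, ept_0_2):
    """Two hops instead of three: L2 virtual -> L2 physical -> L0 physical."""
    l2_physical_page = l2_page_table[l2_virtual_page]
    return ept_0_2[l2_physical_page]


# Hypothetical page numbers, for illustration only.
ept_1_2 = {0: 10, 1: 11, 2: 12}
ept_0_1 = {10: 100, 11: 101}
ept_0_2 = compose_ept(ept_0_1, ept_1_2)
```

Because the L2 page table is used as-is by the hardware, L2 can change its own page tables without causing exits; only changes to the EPT mappings, which are rare, need L0's attention.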

How I/O virtualization works: There are three fundamental ways for a virtual machine to access I/O: device emulation (Sugerman01); para-virtualized drivers, which know they are running on a hypervisor (Barham03, Russell08); and direct device assignment (Levasseur04, Yassour08), which gives the best performance. To get the best performance, the authors used an IOMMU for safe DMA bypass. With nesting there are 3x3 possible combinations for I/O virtualization; of these, they used multi-level device assignment, which gives the L2 guest direct access to L0's devices, bypassing both L0 and L1. To do this they had to handle memory-mapped I/O, programmed I/O, DMA, and interrupts. The idea with DMA is that each of the hypervisors L0 and L1 needs to use an IOMMU to let its virtual machine access the device safely. There is only one hardware IOMMU, so L0 needs to emulate an IOMMU for L1. L0 then compresses the multiple IOMMU translations into the single hardware IOMMU page table, so that L2 programs the device directly and the device's DMA goes directly into L2's memory space.


How the micro-optimizations make it go faster: The two main places where a guest of a nested hypervisor is slower than the same guest running on a bare-metal hypervisor are the transitions between L1 and L2 and the exit-handling code running in the L1 hypervisor. Since L1 and L2 are assumed to be unmodified, the required changes were made in L0 only. First, they optimized the transitions between L1 and L2, each of which involves an exit to L0 and then an entry. In L0, most of the time is spent merging VMCSs, so they optimize this by copying data between VMCSs only if it has been modified, carefully balancing full copying against partial copying and tracking. VMCS merging is optimized further by copying multiple VMCS fields at once. According to Intel's specification, reads and writes must normally be performed with the vmread and vmwrite instructions, which operate on a single field. However, VMCS data can be accessed without ill side effects by bypassing vmread and vmwrite and copying multiple fields at once with large memory copies (this might not work on processors other than the ones they tested). Second, the main cause of slow exit handling is the additional exits caused by privileged instructions in the exit-handling code. vmread and vmwrite are used by the hypervisor to change the guest and host specifications, which causes L1 to exit multiple times while it handles a single L2 exit. On AMD SVM, by contrast, the guest and host specifications can be read or written directly using ordinary memory loads and stores, so L0 does not need to intervene while L1 modifies L2's specifications.
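A back-of-the-envelope way to see why trapping vmread and vmwrite hurts is to count the exits to L0 that result from a single L2 exit. The Python sketch below is a counting model only: the function name is ours, the number of VMCS accesses is a made-up input, and other trapping operations in L1's handler (such as resuming L2) are deliberately left out.

```python
def exits_for_one_l2_exit(vmcs_accesses, accesses_trap):
    """Count exits to L0 triggered by one L2 exit that L1 must handle.

    vmcs_accesses: how many vmread/vmwrite operations L1's exit handler
                   performs (hypothetical input, not a measured value).
    accesses_trap: True if each access traps to L0 (the unoptimized case),
                   False if L1 can touch the data with plain memory
                   loads and stores.
    """
    original_exit = 1  # the L2 exit itself, which lands in L0
    extra_exits = vmcs_accesses if accesses_trap else 0
    return original_exit + extra_exits
```

For instance, with a hypothetical handler that performs 10 VMCS accesses, the trapping case costs 11 exits for one L2 exit, while the non-trapping case costs 1. The real numbers depend on the hypervisor and the workload.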

Performance

Two benchmarks were used: kernbench, which compiles the Linux kernel multiple times, and SPECjbb, which is designed to measure server-side performance for Java run-time environments.

Overhead for nested virtualization is 10.3% with kernbench and 6.3% with SPECjbb. There are two sources of overhead evident in nested virtualization. First, the transitions between L1 and L2 are slower than the transitions in the lower level of the nested design (between L0 and L1). Second, the code handling exits that runs in the guest hypervisor (L1) is much slower than the same code running in L0.

The paper outlines optimization steps to achieve the minimal overhead.

1. Bypassing the vmread and vmwrite instructions and directly accessing the data under certain conditions, which removes the need to trap and emulate.

2. Optimizing the exit-handling code. (The main cause of the slowdown is the additional exits in the exit-handling code.)

Critique

The good:

  • From what I have read so far, the research presented in the paper is probably the first to achieve efficient x86 nested virtualization without altering the hardware, relying on software-only techniques and mechanisms. They also won the Jay Lepreau best paper award.
  • security - being able to run other hypervisors without being detected
  • testing, debugging - of hypervisors
  • Writing, organization wise: They provide links and resources that can help give explanations to the concepts that they briefly touch upon

The bad:

  • Lots of exits. To be continued. (Anyone who is interested, feel free to take this topic.)
  • Writing, Organization wise: Some concepts, such as the VMC's, are written such that you should already be familiar with how they work, or read the appropriate references for that section of the research project

References

[1] http://www.haifux.org/lectures/225/ - Nested x86 Virtualization - Muli Ben-Yehuda