COMP 3000 Lab 2 2011

From Soma-notes

Revision as of 21:32, 3 October 2011

Please answer using text files. Do not use doc, docx, pdf etc.

Part A (in Tutorial)

  1. Copy the ISO images from the class DVD to your computer (or get them from the COMP 3000 support folder, from here, or just download them from the Internet). Boot at least two of the ISOs in VirtualBox or VMware Player by creating a new virtual machine and using the ISO for the virtual CD drive. Where possible, start the "live CD", e.g., "Try Ubuntu". Note that some of the distributions may not work in some VMs (e.g., Minix may crash on both VirtualBox and VMware Player). For at least two distributions:
    1. Roughly how long did the VM take to boot completely? (2 points - 1 mark for each .iso tested)
    2. How similar is the environment to that available on the Lambda SCS hosts? (Say in a few sentences.) (2 points - 1 mark for each .iso tested)
  2. Create an account on the class wiki. What username did you choose? (1 point)
  3. Look at /proc/cpuinfo in a Linux virtual machine (any distribution). Is the "guest" CPU the same as that reported by the Windows "host"? (You can find out system information in Windows by running "msinfo32".) (1 point)
  4. Examine the PCI devices as reported by the command line program "lspci" and identify the video card. Is this "virtual" video card the same as the "real" one that Windows uses? (1 point)
  5. Compile the program hello.c (below) with gcc -O hello.c -o hello-dyn and then run it using the command ltrace ./hello-dyn . What dynamic functions does the program call? (1 point)
  6. Compile the same program with gcc -O -static hello.c -o hello-static . Run this second binary with ltrace as before. What dynamic functions does the program now call? (1 point)
  7. Run strace on the static and dynamically compiled versions of hello. How many system calls do they each produce? (2 marks - 1 for static list, 1 for dynamic list)
  8. How can you make the output of hello.c go to the file "hello-output" by changing how it is invoked at the command line? (1 point)
  9. How can you make the output of hello.c go to the file "hello-output" by changing its code? (1 point)
/* hello.c */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
       printf("Hello, world!\n");
       return 0;
}

Part B (on your own)

  1. Typically, what does a guest OS's hard disk look like to the host OS? (1 point)
  2. Do guest OSs need the same device drivers as host OSs? Why? (1 point)
  3. Why do we "open" files? Specifically, to what extent is it possible to do file I/O without open and close operations - assuming you could change the UNIX API? (2 points)
  4. What is the relationship between dynamic library calls and system calls? (1 point)
  5. What type of programs should run at near-native efficiency within a virtual machine? Why? (1 point)
  6. What type of programs should be significantly slower within a virtual machine? Why? (1 point)
  7. What are two strategies for accelerating programs that are slow within VMs? (2 points)
  8. What are "guest additions" for VMs? What are they for? (1 point)
  9. Compile and run the code (below) to compare its run times on the host and in the virtual machines. Both Linux and Windows code is provided - use whichever is appropriate. How much slower is a VM compared to the host machine? What if you run both simultaneously? (3 points) Report both absolute times and percentages, and mention what your host/guest OSes are and which virtualization software was used.
  10. Bonus Question - Write code (or modify the code below) to perform disk I/O (write to a file and read from it) and report its performance on the host and virtual machines. Show your code too. (5 points)
/* perf.c for linux */
#include <stdio.h>
#include <time.h>

#define TIMEVARS() clock_t start, stop; double sec
#define TIMESTART(msg) printf(msg "..."); fflush(stdout); start = clock()
#define TIMESTOP()   stop = clock(); \
sec = ((double)(stop-start))/CLOCKS_PER_SEC; \
printf("done (%.3f seconds)\n", sec)  

int main()
{
        int n1 = 12345,n2 = 188765;
        int i= 0 ,j=0,composite = 0;
        TIMEVARS();
        TIMESTART("Starting work");
        for(i = n1; i < n2; i++) {
                for(j = 2; j < i; j++) {
                        if (i%j == 0) {
                                composite++;
                                break;
                        }
                }
        }
        TIMESTOP();
        printf("%d primes found between %d and %d\n",n2-n1 - composite,n1,n2);
        return 0;

}
/*
 * perf.c for windows - compile with the Visual Studio C++ 2010 compiler.
 * If this doesn't compile as is, feel free to modify it - the idea is to do
 * enough arithmetical and logical work (the for loops) to consume some
 * measurable time.
 */

#include "stdafx.h"
#include <cstdio>
#include <ctime>

#define TIMEVARS() clock_t start, stop; double sec
#define TIMESTART(msg) printf(msg "..."); fflush(stdout); start = clock()
#define TIMESTOP()   stop = clock(); \
sec = ((double)(stop-start))/CLOCKS_PER_SEC; \
printf("done (%.3f seconds)\n", sec)

int _tmain(int argc, _TCHAR* argv[])
{
        int n1 = 12345,n2 = 188765;
        int i= 0 ,j=0,composite = 0;
        TIMEVARS();
        TIMESTART("Starting work");
        for(i = n1; i < n2; i++) {
                for(j = 2; j < i; j++) {
                        if (i%j == 0) {
                                composite++;
                                break;
                        }
                }
        }
        TIMESTOP();
        printf("%d primes found between %d and %d\n",n2-n1 - composite,n1,n2);
        return 0;

}

Answers

Part A

    1. (2 points - 1 mark for each .iso tested) Loading a virtual machine of Ubuntu 11.04 within VirtualBox or VMware should take approximately 2 minutes and 27 seconds.
    2. (2 points - 1 mark for each .iso tested) This answer depends on which distributions you chose. For Arch and FreeBSD: you are given a barebones distribution without a GUI. For Fedora: a similar GUI environment. For Knoppix: a text-based interface with audio readout.
  2. (1 mark) - provide your username
  3. (1 mark) - the CPU name should only vary slightly; provide proof of this. (yes)
  4. (1 mark) - the video card is dependent on the virtual machine software and should be identified as such. (no)
  5. (1 mark) - __libc_start_main(), puts()
  6. (1 mark) - the statically compiled binary will not call any dynamic functions
  7. (2 marks - 1 for static list, 1 for dynamic list) - static: execve, uname, brk, brk, set_thread_area, brk, brk, fstat64, mmap2, write, exit_group (8 distinct calls) - dynamic: execve, brk, access, open, fstat64, mmap2, close, access, open, read, fstat64, mmap2, mprotect, mmap2, mmap2, close, mmap2, set_thread_area, mprotect, mprotect, mprotect, munmap, fstat64, mmap2, write, exit_group (13 distinct calls)
  8. (1 mark) - ./hello-dyn > hello-output or ./hello-static > hello-output
  9. (1 mark) - the program should include something similar to this:
/* hellotofile.c */
#include <stdio.h>
#include <unistd.h>
FILE *out;
char filename[] = "hello-output"; 

int main(int argc, char *argv[])
{ 
	out = fopen(filename,"w"); 
	if (out == NULL)
	{
		fprintf(stderr, "Couldn't open output file %s!\n", filename);
		return 1;	
	}
	else
	{
		fprintf(out, "Hello, world!\n");
		return 0; 
 	}
} 


Part B

  1. A guest OS's hard disk is typically just a file on the host operating system's file system. For example, VHD (Virtual Hard Disk) and VMDK (Virtual Machine Disk) are popular file formats.
  2. Guest OSes do not require the same device drivers as the host OS, as they do not have access to real hardware directly. Virtualization software uses virtual device drivers to emulate hardware for guest OSes.
  3. open() takes as input a file path and returns a file descriptor which can later be used to perform read() or write(). This is a stateful method, because the kernel has to keep track of every file descriptor's underlying I/O stream for every process. An alternative would be to have stateless (like HTTP) system calls for read() and write() which directly take file paths and data offsets as input instead of file descriptors, obviating the need for the kernel to store metadata for file accesses. The former method places the burden of maintaining state on the kernel; the latter places the burden of maintaining state, if needed, on the user.
  4. Dynamic library calls involve the execution of code within the user's process in unprivileged user mode. System calls involve user/kernel mode switching and are executed in the privileged ring 0. However, note that standard dynamic libraries provide wrappers for system calls, and it is generally a good idea to use those instead of making system calls directly.
  5. Programs that involve just computation, i.e., memory and the CPU, run at near-native efficiency. Unless there is a software emulation of the whole processor, a guest OS is just like a process on the host machine with preallocated memory and CPU. Shadow page tables are used to make memory accesses fast.
  6. Programs on the guest OS that directly use Ethernet cards, accelerated graphics, or hard disks run slowly because these devices are emulated by the host. A guest OS does not have access to real hardware. Recently developed techniques like IOMMU virtualization seek to overcome this problem.
  7. This question asks for two strategies employed by virtual machine software (a hypervisor) to improve a guest's performance. One is to modify the guest OS drivers to do I/O more efficiently (for example, avoiding the IDE/SATA interfaces). A second is to modify the CPU to reduce the cost of switching in and out of the hypervisor, i.e., hardware virtualization. If you answered with methods for application performance optimization, you'll be graded based on that, owing to the lack of clarity of the question.
  8. Guest additions for VMs comprise device drivers and system applications installed on the guest OS to optimize its performance (e.g., better video support) and usability (e.g., a shared clipboard).
  9. I used Ubuntu 11.04 as the host, and Arch as the guest on Qemu/KVM. The hardware is a Core i7 with 4 cores and 6 GB RAM. I chose n1 = 12345, n2 = 258765. It took 10.660 seconds to run on the host and 10.670 seconds on the guest, which is approximately 0.1% slower - insignificant (your answers may vary). Run simultaneously, the times were similar because the machine has multiple cores.
  10. With the same host/guest on the same hypervisor, writing to the same partition of the hard disk, it took 16.080 seconds to write an 84 MB file on the host machine and 22.560 seconds on the guest machine - a slowdown of about 40%. Your answers may vary, but you should see a slowdown. A few things to note here:
    1. We don't want the CPU to spend its time computing; we want to keep the drive busy at all times. You should disable the modulo operation if you're using the code above.
    2. fprintf() writes to a buffer, not to the underlying stream/media directly. fflush() has to be used to flush contents from userspace. Further, the kernel may keep its own buffer; use fsync() on the underlying file descriptor (e.g., fsync(fileno(f))) to ensure the data gets written to the disk. Without flushing the stream completely, both host and guest times were similar at approximately 1.2 seconds, which just goes to show how much buffering takes place.


The objective of the bonus question is to write a program that is noticeably less efficient on a guest. You will receive full marks if your code shows that you've understood both of the points mentioned above.


I used the code shown below:

/* io_perf.c for linux */
#include <stdio.h>
#include <time.h>
#include <unistd.h>   /* for fsync() */

#define TIMEVARS() clock_t start, stop; double sec
#define TIMESTART(msg) printf(msg "..."); fflush(stdout); start = clock()
#define TIMESTOP()   stop = clock(); \
sec = ((double)(stop-start))/CLOCKS_PER_SEC; \
printf("done (%.3f seconds)\n", sec)   

int main()
{
        int n1 = 12345,n2 = 10987654;
        int i= 0;
        FILE *a;
        a=fopen("tmp0001","w");
        TIMEVARS();
        TIMESTART("Starting work");
        for(i = n1; i < n2; i++) {
            fprintf(a,"%d ",i);
            fflush(a);
            fsync(fileno(a));   /* fsync() takes a file descriptor, not a FILE* */
        }
        TIMESTOP();
        fclose(a);
        return 0;
}