COMP 3000 Essay 1 2010 Question 2

From Soma-notes

Question

How do the available system calls in modern versions of the Linux Kernel (2.6.30+) compare with the system calls available in the earliest versions of UNIX? How has the system call interface been expanded, and why? Focus on major changes or extensions in functionality.

Answer

A system call is a means by which programs in user space can access kernel services. System calls vary from operating system to operating system, although the underlying concepts tend to be the same. In general, a process is not supposed to be able to access the kernel directly: it can't access kernel memory and it can't call kernel functions. The CPU enforces this restriction through what is commonly known as protected mode. System calls are the exception to this rule. For example, older x86 processors used an interrupt mechanism to go from user space to kernel space, but newer processors (Pentium II+) provide the sysenter and sysexit instructions, which optimize this transition (Hayward, Mike. Intel P6 vs P7 system call performance. [1], December 9, 2002). In Linux, the routines that implement system calls are kernel functions written mostly in C.

UNIX and Linux system calls are roughly grouped into six major categories: file management, device management, information maintenance, process control, communications and miscellaneous calls. The miscellaneous calls are the ones that don't really fit in the other categories, like system calls dealing with errors. Today, UNIX and Linux operating systems contain hundreds of system calls, but in general they all descend from the 35 system calls that came with one of the original UNIX systems in the early 1970s. In the next paragraphs, we describe the system calls in each of the categories mentioned above, their evolution through history (major changes in functionality) and how they compare with the earliest versions of UNIX.


File Management Calls

The system calls in this group deal with every type of operation that is required to run a file system in the operating system. Creating, deleting, opening and closing a file are just a few examples, and most of these calls have hardly changed over the years.


chmod, chown and chdir have been available since the original UNIX (1971) and they are still used in today's Linux kernels. The chmod and chown calls allow users to change file attributes and so implement security in the file system. The chdir system call allows a process to change its current working directory. In the 4th distribution of UNIX from Berkeley (4BSD), new system calls were added to give applications more control over the file system. The chroot call allows a process to change its root directory to the one specified in a path argument. fchmod and fchdir are the same as chmod and chdir except that they take file descriptors as arguments. As of Linux kernel 2.1.81, the chown system call follows symbolic links, so a new system call, lchown, was introduced that does not follow them.
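The difference between the path-based and descriptor-based variants can be sketched in Python, whose os module exposes functions named after the system calls they wrap. This is only an illustration: the function name demo_permissions and the file name notes.txt are made up for the example.

```python
import os
import stat
import tempfile


def demo_permissions():
    """Change a file's mode once by path (chmod) and once by descriptor (fchmod)."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "notes.txt")  # hypothetical file name
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        try:
            os.chmod(path, 0o644)   # chmod: the file is named by its path
            by_path = stat.S_IMODE(os.stat(path).st_mode)
            os.fchmod(fd, 0o400)    # fchmod: the file is named by an open descriptor
            by_fd = stat.S_IMODE(os.fstat(fd).st_mode)
        finally:
            os.close(fd)
        return by_path, by_fd


if __name__ == "__main__":
    print([oct(mode) for mode in demo_permissions()])
```

The descriptor form keeps working even if the file is renamed while it is open, which is one reason the f-variants are useful.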


Another four of the original UNIX system calls are open, creat, mkdir and close. The open and creat calls allow processes to open and possibly create a file or device. Flag arguments range from access modes, like O_RDONLY (read-only), to status flags, like O_APPEND (append mode). The only modifications made to these system calls were the addition of status flags, some of which are Linux-specific. The close call closes a file descriptor so that it no longer refers to any file and can be reused. No changes were made to it. mkdir creates a directory. In the earliest version of UNIX, to delete a directory, users needed to make a series of link and unlink system calls. With 4.2BSD, rmdir was added and solved this problem. The rename call was also added in 4.2BSD, allowing processes to change the name or the location of a file. As file systems became more complex, these new system calls helped users gain better control over them.
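A minimal sketch of these calls working together, written with Python's os module (a thin layer over the system calls of the same names). The helper name demo_open_flags and the file names are assumptions made for the example.

```python
import os
import tempfile


def demo_open_flags():
    """Create, append to, rename and remove a file inside a scratch directory."""
    with tempfile.TemporaryDirectory() as scratch:
        subdir = os.path.join(scratch, "logs")       # hypothetical names
        os.mkdir(subdir)
        path = os.path.join(subdir, "a.log")

        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)  # create and open
        os.write(fd, b"first\n")
        os.close(fd)

        fd = os.open(path, os.O_WRONLY | os.O_APPEND)  # every write goes to the end
        os.write(fd, b"second\n")
        os.close(fd)

        renamed = os.path.join(subdir, "b.log")
        os.rename(path, renamed)                      # rename, as added in 4.2BSD

        fd = os.open(renamed, os.O_RDONLY)
        data = os.read(fd, 100)
        os.close(fd)

        os.unlink(renamed)
        os.rmdir(subdir)                              # only works on an empty directory
        return data, os.path.exists(subdir)


if __name__ == "__main__":
    print(demo_open_flags())
```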


There are also system calls used to find, access and modify files: read, write, seek and stat. These were also part of the first UNIX. The read and write system calls read from and write to a file (identified by a file descriptor). The only change came in UNIX System V Release 4 (SVR4), where a write call could be interrupted at any time. The seek system call is used to go to a specified position in a file. It used a 16-bit offset, and it was replaced by lseek, which is present by SVR4 and uses 32-bit offsets, giving users more flexibility when reading or writing files, especially large ones. lseek is still used in modern Linux and UNIX systems. For files that exceed the range of a 32-bit offset, Linux provides the _llseek system call and the lseek64 library function, which use 64-bit offsets. The stat system call allows processes to get the status of a file. With SVR4, two other versions of that system call were created: fstat and lstat. They both do the same thing as stat, except that lstat gives the status of a symbolic link itself and fstat gives the status of a file specified by a file descriptor. Different operating systems return different fields to represent the state of a file. Since kernel 2.5.48, stat returns a nanoseconds field in the file's timestamps. With the release of 4.4BSD, two new system calls called statvfs and fstatvfs were introduced to provide information about a mounted file system. They both do the same thing, except that fstatvfs takes a file descriptor as an argument. Linux provides the statfs and fstatfs system calls for the same purpose.
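The following sketch shows lseek and the stat family in use through Python's os module. The function name demo_seek_and_stat and the file name are hypothetical; the st_mtime_ns field is Python's way of exposing the nanosecond timestamp.

```python
import os
import tempfile


def demo_seek_and_stat():
    """Reposition a file offset with lseek and inspect the file with fstat/stat."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "data.bin")      # hypothetical file name
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            os.write(fd, b"0123456789")
            os.lseek(fd, 4, os.SEEK_SET)              # absolute position 4
            middle = os.read(fd, 3)
            end = os.lseek(fd, 0, os.SEEK_END)        # returns the new offset
            size = os.fstat(fd).st_size               # status by descriptor
            mtime_ns = os.stat(path).st_mtime_ns      # status by path, in nanoseconds
        finally:
            os.close(fd)
        return middle, end, size, mtime_ns


if __name__ == "__main__":
    print(demo_seek_and_stat())
```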


The last two original UNIX system calls in this category that are still used today are link and unlink. link creates a hard link to an existing file, and unlink deletes a name from the file system and possibly the file it refers to. If the name refers to a symbolic link, only the link is removed. No major changes were made to the unlink system call, but new system calls were derived from link. The symlink system call was added in 4.2BSD to allow the creation of symbolic links in the file system.
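A short sketch of how hard links and symbolic links behave differently when the original name is removed, using Python's os wrappers. The names demo_links, original.txt, hard.txt and soft.txt are assumptions for the example.

```python
import os
import stat
import tempfile


def demo_links():
    """Compare a hard link and a symbolic link after the original name is unlinked."""
    with tempfile.TemporaryDirectory() as scratch:
        original = os.path.join(scratch, "original.txt")  # hypothetical names
        hard = os.path.join(scratch, "hard.txt")
        soft = os.path.join(scratch, "soft.txt")
        with open(original, "w") as handle:
            handle.write("hello")

        os.link(original, hard)       # second name for the same file
        os.symlink(original, soft)    # separate file that stores a path
        link_count = os.stat(original).st_nlink
        soft_is_link = stat.S_ISLNK(os.lstat(soft).st_mode)

        os.unlink(original)           # removes one name, not the data
        with open(hard) as handle:
            survived = handle.read()
        dangling = not os.path.exists(soft)   # exists() follows the symbolic link
        return link_count, soft_is_link, survived, dangling


if __name__ == "__main__":
    print(demo_links())
```

The file's contents remain reachable through the hard link, while the symbolic link is left pointing at a name that no longer exists.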


In Linux 2.6.16, multiple system calls were added that resolve relative pathnames against a directory given as a file descriptor rather than against the current working directory. They can easily be spotted because their names all end with 'at'. Here is a sample list of the added system calls: openat, mkdirat, fchmodat, fchownat, fstatat, linkat, unlinkat and renameat.
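Python exposes this family through the dir_fd keyword of several os functions. The sketch below assumes a platform where dir_fd is supported (such as Linux); the function name demo_dir_fd and the path names are made up for illustration.

```python
import os
import tempfile


def demo_dir_fd():
    """Resolve relative paths against a directory descriptor instead of the cwd."""
    with tempfile.TemporaryDirectory() as scratch:
        dirfd = os.open(scratch, os.O_RDONLY)        # descriptor for the base directory
        try:
            os.mkdir("sub", dir_fd=dirfd)            # in the spirit of mkdirat
            fd = os.open("sub/file.txt", os.O_CREAT | os.O_WRONLY, 0o600,
                         dir_fd=dirfd)               # in the spirit of openat
            os.write(fd, b"x")
            os.close(fd)
            size = os.stat("sub/file.txt", dir_fd=dirfd).st_size  # like fstatat
            os.unlink("sub/file.txt", dir_fd=dirfd)  # like unlinkat
            os.rmdir("sub", dir_fd=dirfd)
            remaining = os.listdir(scratch)
        finally:
            os.close(dirfd)
        return size, remaining


if __name__ == "__main__":
    print(demo_dir_fd())
```

None of the calls depends on the process's current working directory, so another thread changing directory cannot affect them.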

Device Management Calls

The device management system calls are linked to hardware. They are mainly used to request and release devices, to logically attach or detach a device, to get and modify device attributes, and to read from and write to devices.


Two of the most important system calls for the UNIX and Linux operating systems are mount and umount. These were among the few system calls available in the first version of UNIX in 1971. The mount call attaches a file system held on a storage device to the directory tree. A few changes were made to the mount system call, mostly the creation of new mount flags. For example, since Linux 2.5.19, the MS_DIRSYNC flag makes directory changes on a file system synchronous. Another Linux improvement, added in the 2.4.19 kernel, was per-process mount namespaces. If a process is created using the clone system call with the CLONE_NEWNS flag, it gets a new namespace initialized as a copy of the namespace of the process that was cloned. The umount system call detaches the file system again. The only noteworthy change to umount was the creation of umount2 in Linux 2.1.116. It is the same as umount except that it accepts flags to control the operation.


The open, read and write calls can also be used to access devices. As discussed in the previous section, flag arguments are used to control the operation. Devices are used as if they were files, with the appropriate flags.


With SVR4 came the system call mmap, which maps files or devices into memory (its counterpart munmap removes the mapping). Once a device is mapped, the system call returns a pointer to the mapped area, allowing the process to access that device through memory. This system call is still used in UNIX environments, but since Linux 2.4, 32-bit Linux systems use the mmap2 system call in its place. It is basically the same as mmap except that its final argument specifies the offset into the file in 4096-byte units rather than in bytes. This enables the mapping of large files.
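As an illustration of mapping a file, here is a minimal sketch using Python's mmap module, which is built on the mmap call. It maps an ordinary file rather than a device so that it can run without special hardware; the function name demo_mmap and the file name are hypothetical.

```python
import mmap
import os
import tempfile


def demo_mmap():
    """Map a file into memory and modify it through the mapping."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "mapped.txt")   # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"hello world")

        fd = os.open(path, os.O_RDWR)
        try:
            # Length 0 maps the whole file; the Unix default is a shared,
            # readable and writable mapping.
            with mmap.mmap(fd, 0) as view:
                before = bytes(view[:5])
                view[:5] = b"HELLO"   # an ordinary memory write, no write() call
                view.flush()          # push the change back to the file
        finally:
            os.close(fd)

        with open(path, "rb") as handle:
            after = handle.read()
        return before, after


if __name__ == "__main__":
    print(demo_mmap())
```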


Since Version 7 of UNIX, the ioctl system call has been used for device-specific operations that can't be done using the standard system calls. This helps to deal with a multitude of devices. Each device driver provides a set of ioctl request codes that allow various operations on its device. The request codes are hardware dependent, so there is no standard set of requests for this system call.
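A small sketch of what an ioctl request looks like, using Python's fcntl.ioctl. It assumes a Linux system, where the FIONREAD request is accepted on pipes and reports how many bytes are waiting to be read; the helper names bytes_waiting and demo_ioctl are made up for the example.

```python
import fcntl
import os
import struct
import termios


def bytes_waiting(fd):
    """Ask the driver behind fd how many bytes are ready (FIONREAD request)."""
    # The third argument is a buffer the kernel fills in; here it holds one C int.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def demo_ioctl():
    """Write five bytes into a pipe and query the read end with ioctl."""
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"hello")
        return bytes_waiting(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(demo_ioctl())
```

The same call with a different request code would mean something entirely different to a different driver, which is why ioctl has no portable set of requests.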

Information Maintenance Calls

Information maintenance calls are system calls that return information about the system to the user or change it. These calls can be split into three groups: get/set time or date, get/set system data, and get/set process, file, or device attributes. To fully understand the difference between Linux and UNIX in regard to system calls, one must explore these three sub-types of information maintenance calls and see how they have changed over time.

The first sub-type is get/set of time and/or date. In Linux, this can be done with a few different system calls: 'gettimeofday' gets the time, 'settimeofday' sets it, 'time' returns the time in seconds, and there are a few others like 'ftime'. The earliest versions of UNIX used the system call 'stime', which sets the system's idea of the time and date, expressed in seconds. 'stime' is still available in Linux. 'settimeofday' was created to set a timezone as well as the time, but the tz_dsttime field of its timezone structure is obsolete: according to the Linux manual, each occurrence of this field in the kernel source (apart from its declaration) is a bug.

The second sub-type is get/set system data. UNIX does this using the following system calls: 'open', 'read', 'close', and 'write'. 'open' opens a file so it can be written to or read from. 'read' retrieves data from the file, and 'write' modifies data in the file. 'close' is used to indicate that the file is no longer in use. Linux uses the same set of calls for the same purposes. In addition, Linux has system calls of its own: 'olduname' gets the name of and information about the current kernel, as does 'uname' (which is found in newer versions of UNIX but not the older ones), 'iopl' changes the I/O privilege level, and 'sysfs' gets file system type information.

The third sub-type is get/set process, file, or device attributes. UNIX has several system calls for process, file and device attributes, and some of them are common to both UNIX and Linux: 'stat' gets file status, 'fork' spawns a new process, and 'stty' sets the mode of a typewriter. The 'wait' system call is used in both as well. In the Linux version, wait stores status information in an integer that the caller supplies through a pointer, and that integer is then inspected with macros that take the integer itself as an argument, not a pointer to it. Linux has many more system calls of this type. Here are a few of them: 'capget' gets the capabilities of a process, 'capset' sets the capabilities of a process, and 'getppid' gets the identification of the parent process. 'capget' and 'capset' are the raw kernel interface for getting and setting thread capabilities. These two system calls are specific to Linux, and their use (in particular the format of the cap_user_*_t types) may change as the kernel is updated. 'getppid' returns the process ID of the parent of the calling process and never fails.

Process Control Calls

Process control calls are system calls that handle the start, termination and other tasks that might be required for a process to run correctly.

In UNIX, the following 11 calls make up the process control calls: fork(), wait(), execl(), execlp(), execle(), execvp(), execv(), execve(), exit(), signal() and kill().


fork(): It takes a process and creates an identical process, which makes one the parent process and the other the child process. When fork() succeeds, it returns 0 to the child process and returns the PID of the child process to the parent process. When it fails, fork() returns -1 to the parent process.

wait(): This call makes a parent process wait for a child process to end. It returns the PID of the child process that has finished. wait() fails if the process has no child process to wait for or if its status argument points to an invalid address.
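The fork and wait behaviour described above can be sketched with Python's os module. The example uses waitpid, a variant of wait that names the child to wait for, so that it does not accidentally collect some other child; the function name demo_fork_wait and the exit code 7 are arbitrary choices for illustration. Note that Python reports a failed fork by raising OSError rather than returning -1.

```python
import os


def demo_fork_wait(code=7):
    """Fork a child that exits with `code`, then collect its status in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. Exit immediately without running cleanup
        # handlers inherited from the parent.
        os._exit(code)

    # Parent: fork returned the child's PID.
    finished, status = os.waitpid(pid, 0)
    return finished == pid, os.WIFEXITED(status), os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(demo_fork_wait())
```

The status word is decoded with helper functions that take the integer itself, mirroring the C macros WIFEXITED and WEXITSTATUS.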


execl(), execlp(), execle(), execvp(), execv() and execve() are calls based on the same principle: each takes a binary file as an argument and turns it into a process. When the call works properly it does not return. Instead, it gives control to the new program, which replaces the process that made the call. The variants differ in the arguments they take.

The following are the definitions of these calls as described in [2].

execl(): Takes the path name of an executable program (binary file) as its first argument. The rest of the arguments are a list of command line arguments to the new program (argv[]). The list is terminated with a null pointer.

execle(): Same as execl(), except that the end of the argument list is followed by a pointer to a null-terminated list of character pointers that is passed as the environment of the new program.

execv(): Takes the path name of an executable program (binary file) as its first argument. The second argument is a pointer to a list of character pointers (like argv[]) that is passed as command line arguments to the new program.

execve(): Same as execv(), except that a third argument is given as a pointer to a list of character pointers (like argv[]) that is passed as the environment of the new program.

execlp(): Same as execl(), except that the program name doesn't have to be a full path name, and it can be a shell program instead of an executable module.

execvp(): Same as execv(), except that the program name doesn't have to be a full path name, and it can be a shell program instead of an executable module.
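A minimal sketch of the usual fork-then-exec pattern, using os.execve from Python. To stay self-contained it re-executes the running Python interpreter (sys.executable) rather than some external program; the function name demo_execve and the GREETING environment variable are invented for the example.

```python
import os
import sys


def demo_execve():
    """Replace a forked child with a new program and report its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: build argv and the environment for the new program.
        env = dict(os.environ, GREETING="hi")   # hypothetical variable
        check = ("import os, sys; "
                 "sys.exit(0 if os.environ.get('GREETING') == 'hi' else 1)")
        argv = [sys.executable, "-c", check]
        try:
            os.execve(sys.executable, argv, env)  # does not return on success
        finally:
            os._exit(127)                         # reached only if execve failed

    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(demo_execve())
```

An exit status of 0 means the new program started and saw the environment passed as the third argument.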


signal(): This system call sets how a process responds to a signal, which is sent to the process when the proper conditions are met. A process can treat a signal in three different ways. The first is to ignore it completely: no matter how many times the signal is sent, the process will not do anything because of it. The only signals that can't be ignored or caught are SIGKILL and SIGSTOP. The second is to leave the signal in its default state, which for most signals means that the process ends when it receives the signal. The last option is to catch the signal: when it arrives, the system gives control to a handler function that the process has supplied.


kill(): This call sends a signal to the process identified by a PID. It fails if the signal given is not a valid signal or if there is no process with a PID that matches the argument value.

exit(): This call ends the process that calls it and returns the exit status value to its parent.
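The interaction between signal dispositions and kill can be sketched as follows in Python. A child process ignores SIGTERM, survives it, and is then ended with SIGKILL, which cannot be ignored. Two pipes are used only to keep parent and child in step; the function name demo_signals is an assumption for the example.

```python
import os
import signal


def demo_signals():
    """Show that an ignored SIGTERM has no effect while SIGKILL always ends a process."""
    ready_r, ready_w = os.pipe()   # child -> parent notifications
    go_r, go_w = os.pipe()         # parent -> child notifications
    pid = os.fork()
    if pid == 0:
        try:
            signal.signal(signal.SIGTERM, signal.SIG_IGN)  # disposition: ignore
            os.write(ready_w, b"r")      # tell the parent the handler is set
            os.read(go_r, 1)             # wait until SIGTERM has been sent
            os.write(ready_w, b"s")      # still alive: report survival
            signal.pause()               # sleep until SIGKILL arrives
        finally:
            os._exit(1)

    os.close(ready_w)
    os.close(go_r)
    try:
        os.read(ready_r, 1)
        os.kill(pid, signal.SIGTERM)     # ignored by the child
        os.write(go_w, b"g")
        survived = os.read(ready_r, 1) == b"s"
        os.kill(pid, signal.SIGKILL)     # cannot be ignored or caught
        _, status = os.waitpid(pid, 0)
    finally:
        os.close(ready_r)
        os.close(go_w)
    killed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
    return survived, killed


if __name__ == "__main__":
    print(demo_signals())
```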

In Linux, all of these UNIX system calls have counterparts, except that of the exec group only execve exists as a system call; the others are library functions built on top of it. These system calls also behave the same way in Linux. However, signal() is not recommended because its behaviour differs across versions of Linux and UNIX. It is better to use sigaction(), which changes the action a process takes when it receives any valid signal except SIGKILL and SIGSTOP. As newer versions of Linux are released, these system calls are unlikely to see major modifications, but other system calls based on them may be created for specific cases where they would make programs easier to write.

Communications Calls

The communication calls relate to the ability of processes to communicate with one another. Similar to how humans use a telephone to communicate with each other, processes use mechanisms such as "pipes" as their gateway.


In UNIX there are four subgroups of system calls related to communications: pipelines, messages, semaphores, and shared memory. The following are the system calls that belong to each of the subgroups.


Pipelines: pipe() The pipe() call is declared as int pipe(int file_descriptors[2]). file_descriptors is an array of two descriptors: one for reading data and the other for writing it. Data is read from the pipe in the same order in which it was written. Small writes are not split: a write of up to PIPE_BUF bytes is placed in the pipe as a whole, without being interleaved with data from other writers. A named pipe is called a FIFO, which stands for First In, First Out. It is accessed through a name in the file system but otherwise behaves like a pipe.
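A minimal sketch of a pipe connecting a child process to its parent, using os.pipe and os.fork from Python. The function name demo_pipe and the message contents are arbitrary.

```python
import os


def demo_pipe():
    """Send two messages from a child to its parent through a pipe."""
    read_end, write_end = os.pipe()   # the two descriptors returned by pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_end)        # the child only writes
            os.write(write_end, b"one ")
            os.write(write_end, b"two")
        finally:
            os._exit(0)

    os.close(write_end)               # the parent only reads
    chunks = []
    while True:
        chunk = os.read(read_end, 1024)
        if not chunk:                 # empty read: every writer has closed
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(demo_pipe())
```

Closing the unused ends matters: the parent only sees end-of-file once no process holds the write end open.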


Messages: These functions all deal with sending messages to and receiving messages from a queue, usually identified by an ID. msgget() returns the identifier of the message queue associated with a key, creating the queue if requested. Closely related, but not the same, msgrcv() is used to receive a message from the queue named by its msqid parameter, which is the ID of the queue to receive from. msgsnd() sends a message to a queue and can be thought of as the reverse of msgrcv(). Lastly, msgctl() performs control operations on a queue, such as querying its status or removing it.


Semaphores: A semaphore is a value that is either set or checked. Semaphores are used to control access to shared resources such as files, and file locking is a useful analogy for understanding them. Semaphores aren't usually handled singly, but rather in groups. This is done by creating a set that can contain several semaphores through the semget() call. semop() decides what we want the semaphore to do: a positive value is added to the semaphore, a value of 0 makes the caller wait until the semaphore reaches zero, and a negative value blocks the caller until the semaphore is large enough for the subtraction to proceed. Semaphores were first thought of by Dijkstra and were used in computers in the late 1960s.


Shared Memory: Functions involving shared memory allow the user to create, attach and detach shared memory regions. shmget() returns the ID of a shared memory region and can also create the region if it doesn't already exist. shmat() attaches the shared memory region to the address space of the calling process. shmdt() reverses shmat() and detaches the shared memory.


UNIX and Linux now use the same calls for the majority of these functions, except for a few which are slightly different.

Miscellaneous System Calls

This category contains the system calls that do not have enough similar calls to form their own group. To avoid having random calls floating around, we simply group them into this category.


Directories: These are special files that contain a number of filenames. There are different variations of directories, e.g. System V and Berkeley style directories.


Time: Intuitively, this call allows the user to access the time of day. Specifics on the time can be obtained through a structure with these fields: tm_sec, tm_min, tm_hour, tm_mday, tm_mon, tm_year, just to list a few.

Parsing Input: Parsing is often used when the user enters data and the program must split this data into appropriate divisions in order to obtain specific parts of it, e.g. separating words from each other or separating numbers from characters. There are several different ways a programmer can parse the data in order to extract the specific pieces that need to be analyzed.


Lastly, there are some system calls which overlap and can be considered either in a specific category or among the Miscellaneous System Calls. In the textbook "Modern Operating Systems, 3rd edition", the chmod call described above under File Management Calls is considered miscellaneous. Similarly, the kill() call is listed as a miscellaneous system call. Hence, it is difficult to decide whether a system call belongs in a specific category or simply in the "Other" bin.

Conclusion

System calls have been an essential component of the structure of the Linux kernel (2.6.30+) and UNIX operating systems for a long time. They are the gateway between user space and kernel services. More specifically, they allow user space to request kernel services that processes cannot otherwise access directly. Over the years of development of Linux and UNIX, the system calls have not changed drastically. Rather than radical changes, development has mostly added more specific system calls to solve new issues that arise within the OS. This is how the original 35 system calls grew to an astonishing quantity of hundreds of system calls. With hundreds of system calls at one's disposal, all can be categorized into six major groups: file management, device management, information maintenance, process control, communications and miscellaneous calls. Operating systems are colossal programs consisting of very intricate pieces all coming together to form what we know today as the Linux kernel (2.6.30+) or UNIX. System calls are simply a small building block, but nevertheless an essential piece, of the tower that is our operating system.

References

Salus, Peter H. A Quarter Century of Unix. Addison-Wesley Professional, June 10, 1994.

Unix Programming Manual, http://cm.bell-labs.com/cm/cs/who/dmr/1stEdman.html, November 3, 1971.

BSD System Calls Manual. http://www.unix.com/man-page/FreeBSD/2/, The Unix and Linux Forums.

Linux Programmer's Manual, Linux man-pages project. http://www.kernel.org/doc/man-pages/

Mendonça Rato, Luís Miguel, Professor, University of Evora. http://www.di.uevora.pt/~lmr/syscalls.html