DistOS-2011W Globus

From Soma-notes

Matthew Chou

Introduction

The system I attempted to set up was the Globus Toolkit(1). From what I have learned so far, it seems quite difficult to create software that can be used on a grid system. While searching for distributed systems, I came across several that are used in various fields of research, such as BOINC(2), Folding@home(3), and many other @home projects, as well as Condor(4) (a list can be found here(5)). Since many systems of this type are in use around the world, I thought it would be interesting to see what steps it takes to set one up and whether I could get something to run on it.


Background

The basis of grid computing has been around for quite some time, starting with supercomputers, which have many processors and vast amounts of memory working as one machine. The idea behind a supercomputer, combining the power of many processors so that one "single" machine has a large amount of computational power, is the same idea that grid computing implements: multiple machines spread across various geographic distances work together to give power to any one machine that requires it. Such a system seems like an obvious thing to build, but it raises implementation issues that current operating systems also have to deal with, such as heterogeneity, scalability, and adaptability(6). Heterogeneity refers to the need for standards of use and data that the grid can follow, so that when other domains join the grid there are no problems with integration and usage. Scalability is necessary because, as the grid grows, so do the number of applications using it and the effort of organizing and managing the jobs distributed across it. Adaptability is necessary because at any given point a node on the grid may go down and the job it had been doing must be done by another node; a larger grid can draw on multiple nodes to compensate for dropped nodes as well as for malicious nodes that might return incorrect results.

The job of the Globus Toolkit is to be the middleware that provides the services that allow grid computing to work successfully. Other known middleware implementations are gLite(7) and UNICORE(8).


This paper goes over the steps I took to install the Globus Toolkit, including details of the certification process with SimpleCA and the implementation of a "Hello, world" application, and discusses my thoughts on the ease of installation but the difficulty of implementing applications on a grid computing system.

Installation

Environment Setup

I decided to install the latest version available, which is Globus Toolkit 5.0.3.

Figure: Pre-requisite check

This grid implementation must be installed on a UNIX-based OS; I chose Ubuntu 10.10 running in a virtual machine under Oracle VM VirtualBox(9). The documentation offers a quickstart(10) installation tutorial and an Admin Guide tutorial(11). I read both, because some of the instructions are explained more simply in the quickstart than in the Admin Guide. To set up one's environment, I would recommend following the Admin Guide and using the quickstart as a reference. The quickstart guide gives a list of required software, which can be checked with a few simple terminal commands that report the version or existence of openssl, libssl, zlib, gcc, g++, tar, sed, and make. There are specific instructions for each platform, so check them before installing the toolkit to see whether any additional steps or compatibility issues apply to you. Some additional packages need to be installed for the hello world application, along with some optional software such as a relational database; for this I installed PostgreSQL and psqlODBC, which is the ODBC driver for PostgreSQL. Setting up the environment was easy enough that anyone should be able to follow the tutorial up to this point.
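As a rough sketch of the kind of check the quickstart describes, the following script reports which of the command-line prerequisites are missing. The function name missing_tools is my own and not part of the Globus documentation, and the libraries (libssl, zlib) are deliberately left out because they are not commands.

```shell
#!/bin/sh
# Print the name of every tool given as an argument that is not on PATH.
# (missing_tools is a hypothetical helper, not a Globus command.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command-line prerequisites named in the quickstart guide.
# libssl and zlib are libraries rather than commands, so they are not
# covered here and would need a package-manager query instead.
missing=$(missing_tools openssl gcc g++ tar sed make)

if [ -z "$missing" ]; then
    echo "all command-line prerequisites found"
else
    echo "missing tools:"
    printf '%s\n' "$missing"
fi
```

Anything the script lists can then be installed through the package manager before starting the toolkit build.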

Globus Toolkit Installation

Before installing the toolkit itself, it is good to understand the model that the Globus Toolkit follows. There is a host, node A, which has a client, node B, as well as a Certificate Authority, node C (certificates are explained in the next section). Clients connect to the host and ask for jobs to be completed; the host manages the jobs it is given, almost like a thread scheduler in an operating system. The difference is that the host also has to manage the resources being shared, and it provides services such as control over computing/processing power (GRAM), data management (GridFTP, DAI, RLS), monitoring/discovery (MDS), and authorization/security (CAS). A grid system is meant to span multiple machines, but for the sake of testing and simulation it can be set up on a single computer and later extended to other nodes.

A couple of recurring steps are necessary for the given installation steps to work. The first is the creation of a non-privileged user named "globus", who performs the administrative tasks and deploys services, so the "globus" user acts as the host node. I then made a directory in which to install the toolkit and gave the "globus" user read/write permissions on that folder. Next, I signed in as the globus user and set the GLOBUS_LOCATION variable to the directory I had made by using the

globus@globus:export GLOBUS_LOCATION=/usr/local/globus-5.0.3

command in the terminal. The directory should contain the contents of the Globus Toolkit installation tar, which can be downloaded from the Globus Toolkit website(1). The GLOBUS_LOCATION variable should always be set, since the tutorial uses it very often. Then I ran

globus@globus:./configure --prefix=$GLOBUS_LOCATION 
globus@globus:make

which took approximately 30 minutes on my computer, and then I ran

globus@globus:make install

If you have followed all of the preceding steps correctly, there should not be any errors. If there are, they will most likely be of the "missing library" kind, or something else that can be easily installed. At the end I was relieved that the installation was complete and thought that it had not been so hard, so at this point it seemed that using the Toolkit was not so difficult after all.
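Because GLOBUS_LOCATION has to be set in every new session, one option is to store the export line in a shell profile. The sketch below is only an assumption about how one might do this; the function name persist_globus_location and the choice of profile file are mine and do not come from the tutorial.

```shell
#!/bin/sh
# Append "export GLOBUS_LOCATION=<dir>" to a profile file, unless that
# exact line is already present, so repeated runs do not duplicate it.
# (persist_globus_location is a hypothetical helper.)
persist_globus_location() {
    profile=$1
    location=$2
    line="export GLOBUS_LOCATION=$location"
    if ! grep -qxF -e "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Example call (the profile file is an assumed choice):
# persist_globus_location "$HOME/.profile" /usr/local/globus-5.0.3
```

The check before appending keeps the profile clean if the function is run more than once.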

Certification

The certification process is used in a Globus grid because security and authentication are necessary for a scalable system to operate. If any random node in the world could participate in the grid and use it, or even abuse it, without being accountable, it would not only risk harming the work being done on the grid but would also require the grid to be closely managed for these random inconsistencies in service and bad data. The system itself already tries to maintain the purity of nodes by running multiple instances of the same operations over many nodes to obtain consistent results, but marking the nodes that may be malicious would be a nightmare if there were no certification process. The certification process allows the host to know that a user is indeed trusted and that some organization is held responsible for vouching for this user. In real-world implementations the Certificate Authority (CA), the one who signs the certificate, would normally be a trusted organization operating the way VeriSign(12) does, but for my testing I simply let the globus user be the CA. I used SimpleCA(13), a tool supported by the toolkit that provides a wrapper around the OpenSSL CA functionality and is sufficient for simple grid services.

Figure: SimpleCA setup

Figure: User Certificate Request

SimpleCA

SimpleCA allows a user to become the CA for a grid and permits the CA to sign the certificate requests it receives for the grid host and its users. Following the SimpleCA instructions was easy, since they provide a script that asks for certain fields to be completed. An important thing to note is the email address you set for your CA, because any time a new user wants to be part of the grid, they need to email a certificate request to that address so that it can be signed and returned. The next thing to note is the expiration date of the CA certificate, since it is good practice for the CA to renew its certificate after some time. The default is 5 years (1825 days), so I left it as is. The most important thing to remember is the CA PEM pass phrase, because this password is used by the CA to sign each certificate request it is given; a CA without the password is useless and would have to be recreated, which means running the script again. One issue I came across that was not mentioned in the tutorial was the need to install the globus-gsi-cert-utils-progs package, which is necessary for the signing process. After creating a CA profile it is necessary, as the tutorial notes, to execute the command

$GLOBUS_LOCATION/setup/globus_simple_ca_CA_Hash_setup/setup-gsi -default 

which completes the CA setup process. A CA has little point without a host/client relationship to sign certificates for, so creating a host certificate was the next step I took.

On my host node (root account) I requested a certificate by calling

grid-cert-request -host 'hostname'

This created 3 files:

  • /etc/grid-security/hostkey.pem
  • /etc/grid-security/hostcert_request.pem
  • /etc/grid-security/hostcert.pem (which is empty)

One of the complications of creating a grid is organizing and keeping track of which account or node is responsible for which service. I imagine that if you were to implement this system as a team it would be much simpler and easier to manage. The hostcert.pem file is empty because it has yet to be signed by the CA. Switching to the CA account (globus), I used the

grid-ca-sign -in /etc/grid-security/hostcert_request.pem -out hostsigned.pem 

command, together with the CA password to verify the CA's identity, to create a signed certificate. This hostsigned.pem should then be copied into the /etc/grid-security/ directory and renamed hostcert.pem to replace the empty one. Once the host certificate has been signed, users may request certificates with the grid-cert-request command. They have to email the certificate request manually to the CA so that the CA can sign it just as it did for the host; the CA then returns the signed .pem file, which is renamed usercert.pem to replace the empty one created during the certificate request. The tutorial has further instructions on configuring certification across multiple machines by copying over a package generated by SimpleCA. I did not use multiple machines, so I will not go into that detail, but from a quick read it does not seem too difficult to implement.
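Since an unsigned hostcert.pem or usercert.pem is simply an empty file, a quick check can tell whether the signed certificate has been put in place yet. The following is a minimal sketch under that assumption; cert_is_signed is a name I made up, and it only looks for the standard PEM certificate header rather than validating the certificate.

```shell
#!/bin/sh
# Succeed only if the file is non-empty and contains a PEM certificate
# header. A request file has a "CERTIFICATE REQUEST" header instead and
# is therefore not mistaken for a signed certificate.
# (cert_is_signed is a hypothetical helper, not a Globus command.)
cert_is_signed() {
    [ -s "$1" ] && grep -qF -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Example call, using the host certificate path from the text:
# cert_is_signed /etc/grid-security/hostcert.pem && echo "host certificate is signed"
```

For a closer look at a signed certificate, openssl x509 -in hostcert.pem -noout -subject -dates prints its subject and validity period.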

Evaluation of Installation

Installing the toolkit and setting up the certification process brings the tutorial to a point where the remaining material covers extra services and options, including firewall security, data management, execution management, and usage statistics. Setting up the environment was not hard but tedious. Installing the toolkit itself was not hard either, but it took the most time because of the unpacking and creation of files. Setting up SimpleCA was a little trickier because of the constant switching between accounts, but once again it would probably be easier to manage had I been working with a group. I flipped through the rest of the installation tutorial and saw that using the other services, such as a firewall, GridFTP, and GRAM5 (used for job execution management), would take a lot more time and perhaps a little more knowledge of networking and capturing services to fully understand and use. There are definitely some areas in the installation tutorial that could be reworded, such as section 5.1, Set environment variables. They stated:

Set environment variables

In order for the system to know the location of the Globus Toolkit commands you just installed, you must set an environment variable and source the globus-user-env.sh script.

1. As globus, set GLOBUS_LOCATION to where you installed the Globus Toolkit. This will be one of the following:

Using Bourne shells:

globus$ export GLOBUS_LOCATION=/path/to/install

Using csh:

globus$ setenv GLOBUS_LOCATION /path/to/install

2. Source $GLOBUS_LOCATION/etc/globus-user-env.{sh,csh} depending on your shell.

Use .sh for Bourne shell:

globus$ . $GLOBUS_LOCATION/etc/globus-user-env.sh

Use .csh for C shell.

globus$ source $GLOBUS_LOCATION/etc/globus-user-env.csh

The first instruction is crystal clear as to what to type, whereas the second falls short: it does not point out that the leading "." is itself the Bourne shell command for sourcing a file, which was an issue I had when setting these variables. The tutorial should either say so explicitly or, for shells such as bash that also accept it, show globus$ source $GLOBUS_LOCATION/etc/globus-user-env.sh in place of globus$ . $GLOBUS_LOCATION/etc/globus-user-env.sh, and instruction 2 should be written in the same style as instruction 1.
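To illustrate why the script has to be sourced rather than simply run, here is a small self-contained demonstration. It uses a throwaway stand-in script and a made-up variable name (DEMO_GLOBUS_VAR) instead of the real globus-user-env.sh, so it can be tried without the toolkit installed.

```shell
#!/bin/sh
# A tiny stand-in for globus-user-env.sh: it only sets one variable.
env_script=$(mktemp)
printf 'DEMO_GLOBUS_VAR=set-by-script\n' > "$env_script"

unset DEMO_GLOBUS_VAR

# Executing the script runs it in a child process, so the variable
# never reaches the current shell.
sh "$env_script"
after_exec=${DEMO_GLOBUS_VAR:-unset}

# Sourcing with "." runs the script inside the current shell, so the
# variable is still set afterwards.
. "$env_script"
after_source=${DEMO_GLOBUS_VAR:-unset}

rm -f "$env_script"
echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
```

The first line of output reports "unset" and the second reports "set-by-script", which is the difference the "." in the tutorial's command is responsible for.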

Hello World Implementation

I could find only a few tutorials that teach how to implement a program on the grid you have created. One, from the Globus Consortium(14), implements an application in which each node in the grid edits part of a video file (by some process) and returns its edited piece so that the pieces can be reassembled into a complete edited video. That tutorial seemed quite complicated to me and would have required more time to implement. Another tutorial, linked from the Globus Toolkit homepage, explains in great detail how web services are an integral part of the grid network and of the infrastructure of the Globus Toolkit (Globus Tutorial(15)). At 191 pages it was too extensive, so I opted to implement a simple Java-based hello world(16) example I found. This tutorial requires Java, Ant, JUnit, PostgreSQL, and JDBC (the Java API for SQL databases) to be installed, all of which can be found in the Synaptic Package Manager on Ubuntu. After installing these, I started the PostgreSQL server and started the grid container, which enables jobs to be run on the grid (the PostgreSQL server was started from the root account, and the Globus container from the globus account).

Diagram of file structure used in tutorial. (self drawn)(16)

By following the tutorial I created the files:

  • build.properties
  • build.xml
  • interface-source/hello/HelloWorld.java
  • server-source/hello/HelloWorldImpl.java
  • wsdd-source/server-deploy.wsdd
  • client-source/helloc/HelloWorldClient.java

by copying and pasting code from the appendix, and put them into the directory /home/mateh/dev/HelloProject/. Since the tutorial had me write a build script, I could use it to generate the files I needed by simply calling ant step1 and ant step2 (which generates a gar file, a Grid Archive file similar in idea to a jar). The gar file created under the user mateh must then be copied to the globus account to be deployed. After copying the file into the Globus Toolkit installation directory, using the command

ant deploy -Dgar.name=/usr/local/globus/gars/hellogrid.gar

I deployed the gar file. It is important to restart the grid container at this point to ensure that the new grid service is deployed. With the server implementation complete, I switched back to the user account mateh, built the client with ant step3, and ran it with ant run.client.
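The project layout listed earlier can be prepared before pasting in any code. The sketch below only creates the empty files in the right places; the function name make_hello_skeleton is my own, and the file contents still have to come from the tutorial's appendix.

```shell
#!/bin/sh
# Create the empty files of the tutorial's project layout under the
# directory given as $1. Existing files are left untouched by touch.
# (make_hello_skeleton is a hypothetical helper.)
make_hello_skeleton() {
    root=$1
    for f in \
        build.properties \
        build.xml \
        interface-source/hello/HelloWorld.java \
        server-source/hello/HelloWorldImpl.java \
        wsdd-source/server-deploy.wsdd \
        client-source/helloc/HelloWorldClient.java
    do
        mkdir -p "$root/$(dirname "$f")" && touch "$root/$f"
    done
}

# Example call, using the project directory from the text:
# make_hello_skeleton /home/mateh/dev/HelloProject
```

Once the files are filled in, the ant targets described above can be run from the project directory.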

Implementation Overview

Implementing the Hello World web service was easy, since it required only a few extra packages and the code was already given and merely needed to be copied and pasted. This tutorial was far easier to follow than the others I had seen, and I would recommend it to people who want to try out a grid computing system. The difficulty in implementing such services, it seems to me, lies first in coming up with a job that can be split into multiple jobs without time-sensitive deadlines and, once the job has been decided upon, in determining how finely it can be divided. This Hello World example simply output text on different nodes, hardly a suitable challenge for a grid computing system; coming up with a task whose work must be divided and delegated must take a lot of planning. Once this planning has been done, further work is needed to decide how the grid system will handle invalid data so that it does not hinder the operation of the job. I therefore believe that implementing the Hello World tutorial was simple because of the given code and explanation, but that in a real-world situation, coming up with a grid-based application and making sure that it works properly with the grid would be a different matter completely.
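One simple way an application could guard against invalid data is to run the same piece of work on several nodes and keep the answer that most of them agree on. The sketch below illustrates only that idea; it is my own assumption about how such a check might look, not a feature of the Globus Toolkit, and majority_result is a made-up name.

```shell
#!/bin/sh
# Read one result per line on standard input and print the result that
# occurs most often. Ties are broken arbitrarily.
# (majority_result is a hypothetical helper.)
majority_result() {
    sort | uniq -c | sort -rn | head -n 1 | sed 's/^ *[0-9][0-9]* //'
}

# Example: three nodes report a result and one of them disagrees.
# printf '42\n41\n42\n' | majority_result
```

A real application would also need a rule for ties and for the case where no answer reaches a majority.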

Conclusion

After setting up my environment, installing the toolkit, and implementing the Hello World tutorial, I can say that the setup and installation of the Globus Toolkit was relatively smooth, and that implementing an application on the grid would have been a nightmare had I not found the tutorial that I did. Even though I said that the tutorial was easy, it does not change my view that implementing an application on the grid is difficult, because the Hello World example is not a fair measure of how complicated the applications that run on today's grids are. One thing I learned from following the Globus Toolkit tutorials on the Globus Alliance web page is that their instructions are very specific at times and very vague at others. The difficulty of finding resources on how to implement a grid computing system also depends heavily on the version you are attempting to install. I found many tutorials, such as the ones I mentioned, that were written for earlier versions of the Globus Toolkit, some of which lack features that later versions have. This causes much confusion as to what different commands are called and which services may or may not be available. I have looked into other grid computing systems and also found an alternative way of testing a grid system. The Virtual Data Kit(17) apparently contains a wide variety of distributed computing software that can be easily used, so it seems a worthwhile option to pursue, and it might also be worth looking into how already-built services such as Condor are deployed on the grid. Therefore, from my experimentation with Globus Toolkit 5.0.3, I conclude that it is simple enough to install the basic services and set up a Certificate Authority, but that it remains difficult to create and implement your own services on the grid without further knowledge of grid computing and networks in general.

References

Reference relations:

Installation: (1), (9), (10), (11), (13).

Implementation tutorials: (14) Video filtering program, (15) Webservice, (16) Helloworld.

Information: (1), (2), (3), (4), (5), (6), (7), (8), (12), (17).