DistOS-2011W Globus: Difference between revisions

Revision as of 23:25, 23 February 2011

Matthew Chou

Introduction/Background

Describe the system(s) that you examined or compared. Why did you choose them? Be sure to specify a thesis that you argue in the rest of the document. Since this is a report the thesis may be relatively weak; however, an appropriate thesis will help the reader understand why you did what you did and why you wrote what you wrote.

End with a paragraph outlining the rest of the document.

Be sure to change the titles of the following sections to match the structure of your paper. In particular, please try to make them less generic. What follows is just a suggestion; the document will be evaluated in part on the quality of writing, and good writing sometimes requires some flexibility.


The system I attempted to implement was the Globus Toolkit. From my current knowledge and understanding, it seems quite difficult to create software that can be used on a grid system. While searching for distributed systems, I came across several that are used in various fields of research, such as BOINC (http://boinc.berkeley.edu/), Folding@home (http://folding.stanford.edu/), and many other @home projects, as well as Condor (http://www.cs.wisc.edu/condor/). A list can be found at http://en.wikipedia.org/wiki/List_of_distributed_computing_projects. Since many systems of this type are in use around the world, I thought it would be interesting to see what steps it takes to implement such a system and whether I could get something to run.

Background

The ideas behind grid computing have been around for quite some time, starting with supercomputers, which combine many processors and vast amounts of memory in one machine. A supercomputer pools the power of many processors so that one "single" machine has a large amount of computational power, and grid computing applies the same idea across machines: multiple machines, spread over various geographic distances, work together to give power to any one machine that requires it. Such a system seems like an obvious thing to build, but it raises implementation issues that current operating systems also have to deal with, such as heterogeneity, scalability, and adaptability. Heterogeneity means the grid needs standards for usage and data that it can follow, so that when other domains join the grid there are no problems with integration and use. Scalability is necessary because, as the grid grows, both the applications running on it and the organization and management of the jobs distributed across it must keep working. Adaptability is necessary because at any given point a node on the grid may go down, and the job it was doing must then be done by another node. A larger grid can use multiple nodes to compensate for dropped nodes, as well as for malicious nodes that might return incorrect results.

Systems/Programs in the Space

Give an overview of the area you are examining. What systems/programs are out there?

Installation

Attempted to install on February 10th, 2011, but problems with the Ubuntu image occurred. Installation on February 15th, 12:00-12:53.

The build output consisted of many iterations of the form:

 gpt-build ====> CHECKING BUILD DEPENDENCIES FOR globus_ftp_client_test
 gpt-build ====> Changing to /usr/local/globus-5.0.3/gt5.0.3-all-source-installer/source-trees/gridftp/client/test
 gpt-build ====> BUILDING FLAVOR gcc32dbg
 gpt-build ====> Changing to /usr/local/globus-5.0.3/etc
 /usr/local/globus-5.0.3/sbin/gpt-build -srcdir=source-trees/gsi/gssapi/test gcc32dbg
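Because the build prints so many near-identical lines, it can help to capture the output in a file and summarize it. The following is only a sketch: the sample lines are copied from the output quoted above, the temporary log file stands in for a captured build log (how the log is captured is an assumption), and the variable names are made up for illustration.

```shell
# Stand-in for a captured build log; the lines are taken from the notes above.
log=$(mktemp)
cat > "$log" <<'EOF'
gpt-build ====> CHECKING BUILD DEPENDENCIES FOR globus_ftp_client_test
gpt-build ====> Changing to /usr/local/globus-5.0.3/gt5.0.3-all-source-installer/source-trees/gridftp/client/test
gpt-build ====> BUILDING FLAVOR gcc32dbg
EOF

# Names of the packages whose build dependencies were checked.
packages=$(sed -n 's/^gpt-build ====> CHECKING BUILD DEPENDENCIES FOR //p' "$log")

# Number of flavor builds that were started.
builds=$(grep -c 'BUILDING FLAVOR' "$log")

echo "packages checked: $packages"
echo "flavor builds started: $builds"

rm -f "$log"
```

On a real log, the same two commands give a quick picture of how far the build got before any failure.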

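In step 1, "Set environment variables", the guide is not clear about what to type. Where it says to write

globus$ . $GLOBUS_LOCATION/etc/globus-user-env.sh

the "globus$" is the shell prompt and the "." is the shell built-in that reads a script into the current shell. In bash the same command can be typed as

source $GLOBUS_LOCATION/etc/globus-user-env.sh

with "source" in lowercase.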
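The environment step can be sketched as a small script that checks for the file before loading it. This is a minimal sketch, not part of the guide: the default value for GLOBUS_LOCATION is an assumption taken from the install prefix that appears in the build output, and env_script and env_status are illustrative names.

```shell
# Assumed default: the install prefix seen in the build output; adjust as needed.
GLOBUS_LOCATION="${GLOBUS_LOCATION:-/usr/local/globus-5.0.3}"
export GLOBUS_LOCATION

env_script="$GLOBUS_LOCATION/etc/globus-user-env.sh"

if [ -r "$env_script" ]; then
    # "." is the portable spelling; bash also accepts "source" (lowercase).
    . "$env_script"
    env_status="loaded"
else
    env_status="missing"
    echo "cannot read $env_script; check GLOBUS_LOCATION" >&2
fi

echo "globus-user-env.sh: $env_status"
```

The script has to be read into the current shell (with "." or "source") rather than executed, because a script that is executed runs in a child process and its variables are gone when it exits.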

The package globus-gsi-cert-utils-progs had to be installed (apt-get install globus-gsi-cert-utils-progs) before certificates could be created. The guide does not mention this.
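A quick check for the certificate utilities package before starting the certificate steps might look like the following. It is a sketch that assumes a Debian or Ubuntu system with dpkg-query available; pkg and pkg_status are illustrative names, and on other systems the check simply reports the package as not installed.

```shell
pkg="globus-gsi-cert-utils-progs"

# dpkg-query prints the package status; "install ok installed" means fully installed.
if command -v dpkg-query >/dev/null 2>&1 &&
   dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
    pkg_status="installed"
else
    pkg_status="not installed"
    echo "to install it, run: sudo apt-get install $pkg"
fi

echo "$pkg: $pkg_status"
```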

Evaluated Systems/Programs

Describe the systems individually here - their key properties, etc. Use subsections to describe different implementations if you wish. Briefly explain why you made the selections you did.

Experiences/Comparison (multiple sections)

In multiple sections, describe what you learned.

Discussion

What was interesting? What was surprising? Here you can go out on tangents relating to your work

Conclusion

Summarize the report, point to future work.

References

Give references in proper form (not just URLs if possible, give dates of access).

Install guide:
http://www.globus.org/toolkit/docs/latest-stable/admin/install/#gtadmin
http://www.globus.org/toolkit/docs/5.0/5.0.3/admin/quickstart/

Installer source:
http://www.globus.org/toolkit/downloads/5.0.3/