---++ How to submit simple jobs to the Grid


This page gives a brief introduction to submitting simple jobs to the Grid. Further introductory material is available on the [[http://www.grid.kfki.hu][Grid homepage of RMKI]].


---+++ Log in to a User Interface (_UI_) machine

Once you are logged in to a UI machine, you can issue commands to the Grid.


---+++ Log in to the Grid (get authenticated on the Grid)

This means obtaining a so-called _user proxy_. The commands are:

=> grid-proxy-init=        You will be prompted for your Grid password. Alternatively:

=> grid-proxy-init  -valid 4:00=        This does the same, but the proxy expires after 4 hours (the default lifetime is 12 hours).

If you are a member of more than one virtual organisation (VO), you can choose between them by logging in with =voms-proxy-init= instead of =grid-proxy-init=. For example:

=> voms-proxy-init  -voms hungrid=        Or:

=> voms-proxy-init  -voms hungrid  -valid 4:00=

To get information on your user proxy, you can use the commands =grid-proxy-info= or =voms-proxy-info=.
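
Before starting longer work, it can help to check that the user proxy will stay valid for long enough. The following is a minimal sketch, assuming that =grid-proxy-info -timeleft= prints the remaining lifetime in seconds. The helper names =hm_to_seconds= and =proxy_has_time_left= are illustrative and are not part of the Grid tools.

```shell
# Convert a lifetime written as H:MM (the format taken by -valid) to seconds.
hm_to_seconds() {
    case "$1" in
        *:*) hours=${1%%:*}; minutes=${1#*:} ;;
        *)   hours=$1; minutes=0 ;;
    esac
    # Strip one leading zero so that "08" is not read as an octal number.
    hours=${hours#0}
    minutes=${minutes#0}
    echo $(( ${hours:-0} * 3600 + ${minutes:-0} * 60 ))
}

# proxy_has_time_left SECONDS_LEFT REQUIRED
# Succeeds if SECONDS_LEFT covers the REQUIRED lifetime (given as H:MM).
proxy_has_time_left() {
    [ "$1" -ge "$(hm_to_seconds "$2")" ]
}

# Possible use on a UI machine (assumes -timeleft prints seconds):
#   proxy_has_time_left "$(grid-proxy-info -timeleft)" 4:00 \
#       || grid-proxy-init -valid 4:00
```

The check itself is kept separate from the Grid command, so it can be tried out with plain numbers before being used on a UI machine.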


---+++ Get your jobs authenticated on the Grid

This means obtaining a so-called _job proxy_. The commands are:

=> myproxy-init=        You will be prompted for your Grid password, and asked to specify an additional password (for extra protection) for the _job proxy_ that is about to be created.

=> myproxy-init  -n=        This is the same, but you won't be asked to specify an additional password for protecting your job proxy.

Running =myproxy-init= is necessary when you run long-term jobs. A job proxy ensures that your jobs remain authenticated on the Grid even if your user proxy (used for interactive Grid operations) has expired in the meantime. You can get information on your job proxy with =myproxy-info=, and destroy it with =myproxy-destroy=. The default lifetime of a job proxy is 168 hours.

_Note_: Without a job proxy, you may not be able to retrieve the outputs of long-term jobs!
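
As a rule of thumb, the two default lifetimes quoted on this page (12 hours for a user proxy, 168 hours for a job proxy) suggest which proxy a job needs. The sketch below is only an illustration of that comparison. It assumes the default lifetimes are in use and that the expected duration includes the time the job spends waiting in the queue; =proxy_plan= is a hypothetical helper name.

```shell
USER_PROXY_HOURS=12     # default user proxy lifetime quoted on this page
JOB_PROXY_HOURS=168     # default job proxy lifetime quoted on this page

# proxy_plan EXPECTED_HOURS
# Prints which kind of proxy covers a job of the expected duration.
proxy_plan() {
    if [ "$1" -le "$USER_PROXY_HOURS" ]; then
        echo "user-proxy"
    elif [ "$1" -le "$JOB_PROXY_HOURS" ]; then
        echo "job-proxy"
    else
        # Longer than the default job proxy lifetime: the job proxy
        # would have to be created again before it expires.
        echo "job-proxy-renewed"
    fi
}
```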


---+++ Prepare and submit your job

A programme that you want to run on the Grid is called a _job_. A job consists of one or more executables and input files, which are submitted to the Grid system. The result is one or more output files, which can be retrieved after the job has finished.

The job is described to the Grid system in the so-called _Job Description Language_ (_JDL_). You should prepare a JDL file for each of your jobs. A typical simple JDL file may look like this:

<verbatim>
[
    JobType = "Normal";
    Executable = "testjob.sh";
    StdOutput = "testjob.stdout";
    StdError = "testjob.stdout";
    InputSandbox = {"testjob.sh", "inputfile.dat"};
    OutputSandbox = {"testjob.stdout", "outputfile.dat"};
    Requirements = (
        Member("AFS", other.GlueHostApplicationSoftwareRunTimeEnvironment) &&
        other.GlueCEUniqueID == "grid109.kfki.hu:2119/jobmanager-lcgpbs-hungrid"
    );
]
</verbatim>
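
For illustration, a =testjob.sh= that fits this JDL could look like the sketch below. Only the file names (=inputfile.dat=, =outputfile.dat=) come from the JDL example. The processing step, which merely counts the lines of the input, is a made-up placeholder, and =process_input= is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical testjob.sh: read the input file shipped in the InputSandbox
# and write a result file that the OutputSandbox brings back.

# process_input INPUT_FILE OUTPUT_FILE
process_input() {
    echo "Running on $(uname -n)"       # ends up in the StdOutput file
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2       # ends up in the StdError file
        return 1
    fi
    # Placeholder computation: store the number of input lines.
    wc -l < "$1" | tr -d ' ' > "$2"
}

# The sketch assumes the sandbox files are in the current directory.
if [ -f inputfile.dat ]; then
    process_input inputfile.dat outputfile.dat
fi
```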

The attributes in the JDL example have the following meanings:

   $ =JobType=: This optional attribute states whether the job is a normal job (="Normal"=) or an interactive one (="Interactive"=). For an interactive job, !StdIn, !StdOut and !StdError are connected to the terminal from which you submitted the job, so you can communicate with the job while it is running. If unspecified, it defaults to ="Normal"=.
   $ =Executable=: The executable file of your job.
   $ =StdOutput=: The file into which the !StdOut of your program is written.
   $ =StdError=: The file into which the !StdError of your program is written.
   $ =InputSandbox=: The list of files that are sent to the system as the components of your job, typically the executable of your program and some supplementary files. Files sent via the =InputSandbox= should be small (<10MB). Large files (such as large input data files) should reach the job by other means, e.g. via AFS, NFS, the internet (using e.g. =wget=), or the Grid Storage System.
   $ =OutputSandbox=: The list of files that are retrieved after the job has finished, typically the file containing the !StdOut / !StdError and some output files. Files retrieved via the =OutputSandbox= should be small (<100MB). Large files (such as large output data files) should be transferred by other means, e.g. via the Grid Storage System.
   $ =Requirements=: This optional attribute is a logical expression that specifies requirements on the site or node where the job is to be executed. =Member("Some_software", other.GlueHostApplicationSoftwareRunTimeEnvironment)= requires the software =Some_software= to be available on the target node. =other.GlueCEUniqueID=="grid109.kfki.hu:2119/jobmanager-lcgpbs-hungrid"= requires the job to be sent to the computing element (queue) =grid109.kfki.hu:2119/jobmanager-lcgpbs-hungrid=.
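
The size advice for the =InputSandbox= can be checked on the UI machine before submission. The following is a minimal sketch. It assumes that the 10MB figure applies to the combined size of the listed files (the text does not say whether it is meant per file or in total), and the helper names =total_bytes= and =sandbox_size_ok= are illustrative.

```shell
# total_bytes FILE...
# Prints the combined size of the given files in bytes.
total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1    # fails if a file is unreadable
        total=$(( total + size ))
    done
    echo "$total"
}

# sandbox_size_ok LIMIT_MB FILE...
# Succeeds if the files together are smaller than LIMIT_MB megabytes.
sandbox_size_ok() {
    limit_mb=$1
    shift
    bytes=$(total_bytes "$@") || return 1
    [ "$bytes" -lt $(( limit_mb * 1024 * 1024 )) ]
}

# Possible use with the files of the JDL example:
#   sandbox_size_ok 10 testjob.sh inputfile.dat \
#       || echo "InputSandbox is too large" >&2
```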

Once you have prepared the JDL file, you can list the available queues that are capable of running your job with the command:
<verbatim>
> edg-job-list-match  -vo your_vo  testjob.jdl
</verbatim>
This will return a list of Grid queues (computing elements).

The job can be submitted by the command:
<verbatim>
> edg-job-submit  -vo your_vo  testjob.jdl
</verbatim>
This will return a URL, which is the unique identifier of your job and is denoted by =jobID= below.
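
When jobs are submitted from a script, it is convenient to capture the identifier in a variable. The sketch below assumes that the identifier is printed as an =https://= URL somewhere in the output of the submission command. Both this assumption and the helper name =extract_job_id= are illustrative and should be checked against the actual output on your UI machine.

```shell
# extract_job_id
# Reads submission output on stdin and prints the first https:// URL found.
extract_job_id() {
    sed -n 's|.*\(https://[^[:space:]]*\).*|\1|p' | head -n 1
}

# Possible use on a UI machine:
#   jobID=$(edg-job-submit -vo your_vo testjob.jdl | extract_job_id)
#   echo "$jobID" >> submitted_jobs.txt
```

Keeping the identifiers in a file makes it easier to query or retrieve several jobs later.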

The status of the job can be viewed by:
<verbatim>
> edg-job-status  jobID
</verbatim>
This will return the current status of your job.
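
Instead of checking the status by hand, a script can poll until the job reaches a final state. The following is a sketch under two assumptions that should be verified on your system: the status output contains a line of the form =Current Status: State=, and the final states are =Done=, =Aborted=, =Cancelled= and =Cleared=. The function names are hypothetical, and the status command is passed in as an argument.

```shell
# current_state
# Reads status output on stdin and prints the state word
# (assumes a line of the form "Current Status:   <State> ...").
current_state() {
    sed -n 's/^[[:space:]]*Current Status:[[:space:]]*\([A-Za-z]*\).*/\1/p' | head -n 1
}

# is_final_state STATE
# Succeeds for states after which the job does not change any more
# (assumed list).
is_final_state() {
    case "$1" in
        Done|Aborted|Cancelled|Cleared) return 0 ;;
        *) return 1 ;;
    esac
}

# wait_until_finished INTERVAL_SECONDS COMMAND [ARG...]
# Runs COMMAND repeatedly until it reports a final state, then prints it.
# Note: loops forever if COMMAND never reports a final state.
wait_until_finished() {
    interval=$1
    shift
    while :; do
        state=$("$@" | current_state)
        if is_final_state "$state"; then
            echo "$state"
            return 0
        fi
        sleep "$interval"
    done
}

# Possible use, with the status command shown above:
#   wait_until_finished 60 <status command> "$jobID"
```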

If the Grid system has failed to run your job, the logging information can be retrieved by:
<verbatim>
> edg-job-get-logging-info  jobID
</verbatim>
This will return the logging info on your job. A convenient way to find out failure reasons is:
<verbatim>
> edg-job-get-logging-info  -v 2  jobID  > log
> grep  "reason"  log  | uniq
</verbatim>
The first command retrieves all the available logging info on your job (the =-v 2= switch) and writes it into the file =log=. The second command prints the lines of =log= that contain the string =reason=, with consecutive duplicate lines collapsed into one; these lines give the reasons for the various actions of the Grid system.
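
Since =uniq= only removes duplicates that are adjacent, a reason that recurs further down in the log is printed again. If each distinct line should appear only once, in order of first appearance, a variant such as the following can be used. The function name =unique_reasons= is illustrative.

```shell
# unique_reasons LOGFILE
# Prints every distinct line containing "reason" once,
# in the order of its first appearance.
unique_reasons() {
    grep "reason" "$1" | awk '!seen[$0]++'
}

# Possible use with the file written above:
#   unique_reasons log
```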

If your job has finished successfully, you can retrieve the outputs by the command:
<verbatim>
> edg-job-get-output  jobID
</verbatim>
This will retrieve the content of the =OutputSandbox= into the directory =/tmp/jobOutput/yourusername_jobID=.
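
After retrieval, it is worth confirming that every file listed in the =OutputSandbox= has actually arrived. The following is a small sketch with a hypothetical helper, =missing_outputs=, that lists the expected files that are absent from a directory.

```shell
# missing_outputs DIR FILE...
# Prints each expected file that is absent from DIR;
# fails if at least one file is missing.
missing_outputs() {
    dir=$1
    shift
    status=0
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}

# Possible use with the files of the JDL example:
#   missing_outputs /tmp/jobOutput/yourusername_jobID testjob.stdout outputfile.dat
```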

For further information, see the =man= pages of the above commands, and possibly those of the other =edg-= commands. For further reference on simple job submission, see [[https://edms.cern.ch/file/454439//LCG-2-UserGuide.html][https://edms.cern.ch/file/454439//LCG-2-UserGuide.html]].




-- Main.AndrasLaszlo - 17 Sep 2007