Grid Interface API
In the absence of any written documentation on how to run tasks on the Grid, I will write here how I think it ought to work. We can then discuss the implementation details, or someone can add links to the proper documentation.
Required
The goal is to be able to run a LIGO analysis task "on the Grid". That means at a bare minimum I need to know how to:
- Launch a task - how do I launch a task to run "on the grid"? How similar is this to fork() or exec(), or running a task in the background or via 'batch' or 'at' on Unix? Does the task inherit the environment of the shell which launches the task? If not, how can I set environment variables needed for my task to run?
When you launch a task, you should probably get back some kind of process or task identifier or handle, so that you can refer to the task later (to query or kill). In Unix when you fork() the parent gets the PID of the child, or when you run something in background from a shell (or shell script) the PID of the background task is put in the shell variable $!.
What is the analogous "handle" returned to allow me to reference the task?
Q: How do you specify inputs (files or parameters) for the task?
- Query the status of a task - has it finished? Did it run okay? Is it still running? Is it waiting to run? What are the current estimates of when it will run, or when it will finish if it's already running? I probably need to provide the "handle" to identify the task.
Q: How do I get the returned files from the analysis once it finishes?
- Kill a running task - without stopping to think about it, students will launch an analysis that takes a day or more to run, then be surprised and disappointed. We need a way for them to stop the task. We also need a way for someone else with the proper authority (like a teacher or admin) to cancel the task.
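For comparison, here is how the three required operations look for an ordinary local Unix process, using Python's standard subprocess module. This is not Grid code, only the Unix analogy from the questions above made concrete. The function names launch, status, and kill and the variable TASK_PARAM are made up for illustration; whatever the Grid actually provides may look quite different.

```python
import os
import subprocess
import sys


def launch(argv, extra_env=None):
    """Start a task in the background and return a handle to it.

    The handle is the Popen object; handle.pid is the Unix PID.
    """
    env = dict(os.environ)        # inherit the launcher's environment
    env.update(extra_env or {})   # add variables the task needs
    return subprocess.Popen(argv, env=env, stdout=subprocess.PIPE, text=True)


def status(handle):
    """Report the state of a launched task, given its handle."""
    code = handle.poll()          # None while the process is still running
    if code is None:
        return "running"
    return "finished ok" if code == 0 else "failed or killed"


def kill(handle):
    """Stop a running task and wait until it has exited."""
    handle.terminate()
    handle.wait()


if __name__ == "__main__":
    # Hypothetical task: print a parameter passed in through the environment.
    task = launch(
        [sys.executable, "-c", "import os; print(os.environ['TASK_PARAM'])"],
        {"TASK_PARAM": "42"},
    )
    output, _ = task.communicate()   # collect the task's output
    print(task.pid, status(task), output.strip())
```

What this local version cannot answer are the Grid-specific questions: a PID means nothing on another machine, there is no "waiting to run" state, and output comes back through a pipe rather than as returned files.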
Optional/Desirable
On top of that, some of the things I would like to be able to do, but which are not required:
- Preview task performance - ideally the grid software will determine the best place to run the task and will make the proper choice for me. But it would be interesting to be presented with a list of alternatives, with estimates of the execution time, and perhaps other resources used, for each one. It might just help show that the scheduling software is doing it right, or it might allow me to override the default decision.
- Alter a task - perhaps I would like to change something about the task without canceling it, such as where it runs (provided it has not started).
- List Tasks - list the status of all tasks which are either running or waiting to run. A student could only list their own tasks, but a teacher could list all the tasks in the class, or an admin process could list all tasks running or waiting to run.
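The visibility rule for listing tasks can be sketched in a few lines of Python. Everything here is hypothetical: the Task record, its fields, the role names, and the list_tasks function are assumptions made for illustration, not part of any existing Grid interface.

```python
from dataclasses import dataclass

# Only tasks in these states appear in a listing.
ACTIVE_STATES = {"waiting", "running"}


@dataclass
class Task:
    task_id: int
    owner: str        # user who launched the task
    classroom: str    # class the owner belongs to
    state: str        # e.g. "waiting", "running", "finished"


def list_tasks(tasks, user, role, classroom=None):
    """Return the active tasks that this user is allowed to see."""
    active = [t for t in tasks if t.state in ACTIVE_STATES]
    if role == "admin":
        return active                                    # everything
    if role == "teacher":
        return [t for t in active if t.classroom == classroom]
    return [t for t in active if t.owner == user]        # students: own tasks
```

A student asking for a listing would get back only their own waiting or running tasks, a teacher would see the whole class, and an admin would see everything.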
-- Main.EricMyers - 14 Jun 2007