
Difference between revisions of "SMILA/Documentation/JobDefinitions"

(Jobs)
Line 5: Line 5:
 
<span style="color:#ff0000">'''This page is work in progress.'''</span>
 
<span style="color:#ff0000">'''This page is work in progress.'''</span>
  
== Job Definitions ==
+
To run a certain workflow in SMILA, you will have to create a job definition first that references a workflow and sets the desired parameters. With a job definition alone, the system is not yet doing anything. First, the job must be started to get a so called [[SMILA/Documentation/JobRun|job run]]. For job runs in "standard" mode, the actual job processing is triggered with every new object that is dropped into the bucket connected to the start action of the respective workflow. The triggering continues with new objects until the job run is finished manually. Job runs in "runOnce" mode, by contrast, do not react on new objects but process all objects currently contained in the respective input bucket and then finish automatically.
  
The user can create a job by choosing a workflow and setting all its parameters. All parameter variables of the used data object types and workers that are not set as bucket or workflow parameters have to be set as job parameters.
+
== Job definition ==
  
Jobs are linked by sharing <tt>persistent</tt> buckets: A workflow having a start action worker with an input section is started when another workflow adds objects to the <tt>persistent</tt> bucket associated to this worker's input slot. (Note: this only works with <tt>persistent</tt> input buckets, not transient ones!)
+
* <tt>name</tt>: Required. Defines the name of the job.
 +
* <tt>parameters</tt>: Optional. Defines the [[SMILA/Documentation/JobParameters|job parameters]] that will be resolved in the workflow to configure the participating workers and to instantiate the buckets. All parameter (variables) that are declared in the used data object types and workers and that have not yet been set in the workflow or bucket definitions must be set here at the latest. Otherwise an error will occur when trying to create the job.
 +
* <tt>workflow</tt>: Required. Gives the name of the desired workflow.
  
The job definition is provided by the user. It defines the workflow to be used along with the parameters to configure the workflow.
+
== Example ==
* <tt>name</tt> is mandatory and defines the name of the job.
+
An exemplary job definition:  
** If <tt>name</tt> is missing an exception occurs.
+
** There will be an error when trying to add a job with the same name as an existing job.
+
* <tt>parameters</tt>
+
** This is optional and defines the job parameters that will be resolved in the workflow to configure the participating workers and instantiate the buckets.
+
** All parameter (variables) that are still not set in the worker, bucket and data object type definitions used by the job workflow must be set here. Otherwise an error will occur when trying to add the job.
+
* <tt>workflow</tt>
+
** mandatory field. If the field is missing an exception occurs when trying to add the job.
+
** references an existing workflow in the jobmanager. If the workflow with the given name does not exist, there will be an error when trying to add the job.
+
 
+
Sample:
+
  
 
<pre>
 
<pre>
 
{
 
{
   "name":"indexUpdateTestJob",
+
   "name":"myJob",
 
   "parameters":{
 
   "parameters":{
     "index":"wikipedia",
+
     "index": "wikipedia",
     "store":"wikidocs"
+
     "store": "wikidocs"
 
   },
 
   },
   "workflow":"indexUpdate"
+
   "workflow":"myWorkflow"
 
}
 
}
 
</pre>
 
</pre>
  
=== Monitor jobs ===
+
== List, create, modify jobs ==
==== All jobs ====
+
=== All jobs ===
  
Use a GET request to retrieve monitoring information for all jobs. Use POST for adding or updating a job.
+
Use a GET request to retrieve a list of all job definitions. Use POST for adding or updating a job definition.
  
 
'''Supported operations:'''  
 
'''Supported operations:'''  
*GET: Get a list of all defined job definitions and details about latest job run. Switch off details with <tt>returnDetails=false</tt> parameter. If there are no jobs defined, you will get an empty list.
+
*GET: Get a list of all job definitions and details about latest job run. Switch off details with <tt>returnDetails=false</tt> as a URL parameter. If there are no jobs defined, you will get an empty list.
*POST: Put one new job or update existing one. If an already existing name is used, the existing job definition will be updated if validation was successful. If you update during a running job, the running job will not be influenced, but after the next start of the job the updated job definition will be used.
+
*POST: Create a new job definition or update an existing one. If the job already exists, it will be updated after successful validation. However, the changes will not apply until the next job run, i.e. the current job run is not influenced by the changes.
 
+
<pre> If you POST more than one workflow only the first job will be added, all following workflows will be ignored.</pre>
+
  
 
'''Usage:'''  
 
'''Usage:'''  
  
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/jobs</nowiki></tt>.
+
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/jobs/</nowiki></tt>
 
*Allowed methods:  
 
*Allowed methods:  
 
**GET
 
**GET
Line 55: Line 45:
 
**200 OK: Upon successful execution (GET).  
 
**200 OK: Upon successful execution (GET).  
 
**201 CREATED: Upon successful execution (POST).
 
**201 CREATED: Upon successful execution (POST).
**400 Bad Request: If you reference undefined workflows, not all parameters are resolved, mandatory fields are missing or validation finds errors you get a Bad Request (POST).  
+
**400 Bad Request: If you reference undefined workflows, if not all parameters were resolved, if mandatory fields are missing or if validation finds errors (POST).  
  
==== Specific job ====
+
=== Specific job ===
  
Use a GET request to retrieve monitoring information for a specific job. Use DELETE for deleting a job.
+
Use a GET request to retrieve the definition of a specific job. Use DELETE for deleting a job.
  
 
'''Supported operations:'''  
 
'''Supported operations:'''  
*GET: retrieve information about a job definition with a given name.
+
*GET: get the definition of the given job.
*DELETE: delete a job with a given name.
+
*DELETE: delete the given job definition.
  
 
'''Usage:'''  
 
'''Usage:'''  
  
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/jobs/<job-name></nowiki></tt>.
+
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/jobs/<job-name>/</nowiki></tt>
 
*Allowed methods:  
 
*Allowed methods:  
 
**GET
 
**GET
 
**DELETE
 
**DELETE
 
*Response status codes:  
 
*Response status codes:  
**200 OK: Upon successful execution (GET, DELETE). If the to be deleted job does not exist you will get a 200 anyway.  
+
**200 OK: Upon successful execution (GET, DELETE). If the job definition to be deleted does not exist you will get a 200 anyway.  
**404 Server Error: If a wrong name is used, a HTTP 404 Server Error is followed by an error in json format (GET).
+
**404 Server Error: If an undefined name is used, an HTTP 404 Server Error including an error message in the response body will be returned.

Revision as of 11:24, 12 July 2011


Jobs

This page is work in progress.

To run a workflow in SMILA, you first have to create a job definition that references the workflow and sets the desired parameters. A job definition alone does not make the system do anything: the job must first be started, which creates a so-called job run. For job runs in "standard" mode, processing is triggered by every new object that is dropped into the bucket connected to the start action of the respective workflow. This triggering continues until the job run is finished manually. Job runs in "runOnce" mode, by contrast, do not react to new objects; they process all objects currently contained in the respective input bucket and then finish automatically.

Job definition

  • name: Required. Defines the name of the job.
  • parameters: Optional. Defines the job parameters that will be resolved in the workflow to configure the participating workers and to instantiate the buckets. All parameters (variables) that are declared in the data object types and workers used by the workflow, and that have not yet been set in the workflow or bucket definitions, must be set here at the latest. Otherwise an error occurs when trying to create the job.
  • workflow: Required. Gives the name of the desired workflow.
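
The field rules above can be checked on the client side before a definition is sent to SMILA. The following is only a minimal sketch: the function name validate_job_definition and the known_workflows argument are hypothetical and not part of SMILA, and the check does not cover unresolved parameters, which only the job manager can detect.

```python
def validate_job_definition(job, known_workflows):
    """Return a list of problems; an empty list means the definition looks valid.

    Hypothetical client-side helper, not part of SMILA.
    """
    problems = []
    # "name" and "workflow" are required fields.
    for field in ("name", "workflow"):
        if not job.get(field):
            problems.append(f"missing required field: {field}")
    # The referenced workflow should be one that is already defined.
    workflow = job.get("workflow")
    if workflow and workflow not in known_workflows:
        problems.append(f"unknown workflow: {workflow}")
    # "parameters" is optional, but if present it must be a JSON object.
    parameters = job.get("parameters", {})
    if not isinstance(parameters, dict):
        problems.append("parameters must be a JSON object")
    return problems
```

For example, validate_job_definition({"name": "myJob", "workflow": "myWorkflow"}, {"myWorkflow"}) returns an empty list, while a definition without a name yields a "missing required field" entry.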

Example

An example job definition:

{
  "name": "myJob",
  "parameters": {
    "index": "wikipedia",
    "store": "wikidocs"
  },
  "workflow": "myWorkflow"
}

List, create, modify jobs

All jobs

Use a GET request to retrieve a list of all job definitions. Use POST for adding or updating a job definition.

Supported operations:

  • GET: Get a list of all job definitions together with details about the latest job run of each. To switch off the details, add returnDetails=false as a URL parameter. If no jobs are defined, you will get an empty list.
  • POST: Create a new job definition or update an existing one. If the job already exists, it is updated after successful validation. The changes do not apply until the next job run, i.e. a job run that is currently active is not affected.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/jobs/
  • Allowed methods:
    • GET
    • POST
  • Response status codes:
    • 200 OK: Upon successful execution (GET).
    • 201 CREATED: Upon successful execution (POST).
    • 400 Bad Request: If you reference undefined workflows, if not all parameters were resolved, if mandatory fields are missing or if validation finds errors (POST).
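
As an illustration of the two operations above, the following sketch builds the corresponding HTTP requests with Python's standard library. It rests on several assumptions: SMILA is reachable on localhost, the helper names are made up for this example, and the Content-Type header is a guess at what the server expects rather than something this page specifies.

```python
import json
import urllib.request

# Assumption: SMILA runs on localhost; replace with your <hostname>.
BASE_URL = "http://localhost:8080/smila/jobmanager/jobs/"


def build_list_jobs_request(base_url=BASE_URL, return_details=True):
    """Build a GET request that lists all job definitions."""
    # returnDetails=false switches off the details about the latest job run.
    url = base_url if return_details else base_url + "?returnDetails=false"
    return urllib.request.Request(url, method="GET")


def build_post_job_request(job, base_url=BASE_URL):
    """Build a POST request that creates or updates one job definition."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        method="POST",
        # Assumption: the server accepts a JSON body with this content type.
        headers={"Content-Type": "application/json"},
    )
```

A request built this way can be sent with urllib.request.urlopen(request). According to the status codes listed above, a successful POST should answer with 201; urlopen raises urllib.error.HTTPError for a 400 response.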

Specific job

Use a GET request to retrieve the definition of a specific job. Use DELETE to delete a job definition.

Supported operations:

  • GET: get the definition of the given job.
  • DELETE: delete the given job definition.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/jobs/<job-name>/
  • Allowed methods:
    • GET
    • DELETE
  • Response status codes:
    • 200 OK: Upon successful execution (GET, DELETE). If the job definition to be deleted does not exist, you will get a 200 anyway.
    • 404 Not Found: If an undefined job name is used, an HTTP 404 response with an error message in the response body is returned.
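
The per-job resource can be addressed in the same way. The sketch below is an illustration under assumptions: the host is localhost, and the helpers job_url, build_job_request and describe_status are hypothetical names that merely restate the status codes listed above.

```python
import urllib.request
from urllib.parse import quote

# Assumption: SMILA runs on localhost; replace with your <hostname>.
BASE_URL = "http://localhost:8080/smila/jobmanager/jobs/"


def job_url(job_name, base_url=BASE_URL):
    """Return the URL of a single job definition, percent-encoding the name."""
    return f"{base_url}{quote(job_name, safe='')}/"


def build_job_request(job_name, method, base_url=BASE_URL):
    """Build a GET or DELETE request for one job definition."""
    if method not in ("GET", "DELETE"):
        raise ValueError("only GET and DELETE are allowed on a specific job")
    return urllib.request.Request(job_url(job_name, base_url), method=method)


def describe_status(method, status):
    """Translate a response status into the meaning documented above."""
    if status == 200:
        if method == "DELETE":
            # DELETE answers 200 even if the definition did not exist.
            return "deleted (or the job definition did not exist)"
        return "job definition returned"
    if status == 404:
        return "no job definition with this name"
    return "unexpected status"
```

Because DELETE returns 200 whether or not the definition existed, a client that needs to know if something was actually removed would have to issue a GET first.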
