
Difference between revisions of "SMILA/Documentation/JobDefinitions"

Revision as of 05:03, 30 September 2011


Job definitions

To run a workflow in SMILA, you first have to create a job definition that references the workflow and sets the desired parameters. A job definition alone does not make the system do anything: the job must first be started, which creates a so-called job run. For job runs in "standard" mode, processing is triggered by every new object that is dropped into the bucket connected to the start action of the respective workflow. Triggering continues with new objects until the job run is finished manually. Job runs in "runOnce" mode, by contrast, do not react to new objects; they process all objects currently contained in the respective input bucket and then finish automatically.

Job properties in detail

  • name: Required. Defines the name of the job.
  • parameters: Optional. Defines the job parameters that are resolved in the workflow to configure the participating workers and to instantiate the buckets. All parameters (variables) that are declared in the data object types and workers used, and that have not yet been set in the workflow or bucket definitions, must be set here at the latest. Otherwise, an error occurs when you try to create the job.
  • workflow: Required. Gives the name of the desired workflow.

Job definitions can include additional information (e.g. comments or extra data for external tools), but a GET request returns only the relevant information (i.e. the above attributes). To retrieve additional information that is present in the JSON file or was posted with the definition, add returnDetails=true as a request parameter.

Example

An exemplary job definition:

{
  "name":"myJob",
  "parameters":{
    "index": "wikipedia",
    "store": "wikidocs"
  },
  "workflow":"myWorkflow"
}
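The required and optional properties above can be checked on the client before a definition is sent. The following is a minimal Python sketch of such a pre-check. The helper name check_job_definition is hypothetical and not part of SMILA, and the sketch does not replace the server-side validation, which additionally checks that the workflow exists and that all parameters are resolved.

```python
import json

# Required properties according to the list above; "parameters" is optional.
REQUIRED_FIELDS = ("name", "workflow")


def check_job_definition(job):
    """Return a list of problems found in a job definition dict (empty if none).

    Hypothetical client-side helper; it only checks the basic shape.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        value = job.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty required field: {field}")
    if not isinstance(job.get("parameters", {}), dict):
        problems.append("'parameters' must be a JSON object if present")
    return problems


# The example definition from above.
job = {
    "name": "myJob",
    "parameters": {"index": "wikipedia", "store": "wikidocs"},
    "workflow": "myWorkflow",
}

if __name__ == "__main__":
    print(check_job_definition(job))
    print(json.dumps(job, indent=2))
```

An empty result only means that the basic shape is plausible; the job manager may still reject the definition.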

List, create, modify jobs

All jobs

Use a GET request to retrieve a list of all job definitions. Use POST for adding or updating a job definition.

Supported operations:

  • GET: Get a list of all job definitions and details about the latest job run. Switch off the details with returnDetails=false as a URL parameter. If no jobs are defined, you will get an empty list.
  • POST: Create a new job definition or update an existing one. If the job already exists, it will be updated after successful validation. However, the changes will not apply until the next job run, i.e. the current job run is not influenced by the changes.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/jobs/
  • Allowed methods:
    • GET
    • POST
  • Response status codes:
    • 200 OK: Upon successful execution (GET).
    • 201 CREATED: Upon successful execution (POST).
    • 400 Bad Request: If you reference undefined workflows, if not all parameters were resolved, if mandatory fields are missing or if validation finds errors (POST).

Examples:

To get all job definitions:

GET /smila/jobmanager/jobs/

The result would be:

HTTP/1.x 200 OK

{
  "jobs" : [ {
    "name" : "myJob",
    "url" : "http://localhost:8080/smila/jobmanager/jobs/myJob/"
  } ]
}
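A client will usually want to turn this listing into something easier to look up. The following Python sketch maps job names to their URLs, assuming the response has exactly the shape shown above (a "jobs" array whose entries carry "name" and "url"). The function name job_urls is hypothetical.

```python
import json

# The listing response shown above, as it would arrive in the response body.
listing = json.loads("""
{
  "jobs" : [ {
    "name" : "myJob",
    "url" : "http://localhost:8080/smila/jobmanager/jobs/myJob/"
  } ]
}
""")


def job_urls(listing):
    """Map each job name to its URL; an empty or missing "jobs" list yields {}."""
    return {job["name"]: job["url"] for job in listing.get("jobs", [])}


if __name__ == "__main__":
    for name, url in job_urls(listing).items():
        print(name, url)
```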

To create a job:

POST /smila/jobmanager/jobs/

{
  "name":"myJob",
  "parameters":{
    "index": "wikipedia",
    "store": "wikidocs"
  },
  "workflow":"myWorkflow"
}

The result would be:

HTTP/1.x 201 CREATED

{
  "name" : "myJob",
  "timestamp": "2011-08-12T14:49:48.862+0200",
  "url" : "http://localhost:8080/smila/jobmanager/jobs/myJob/"
}
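The create request can also be sent from a script. The following is a minimal Python sketch using only the standard library. It assumes a SMILA instance at localhost:8080 and assumes that the job manager accepts a Content-Type of application/json; neither is stated on this page. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: SMILA runs locally on the default port used in the examples.
BASE_URL = "http://localhost:8080/smila/jobmanager/jobs/"


def build_create_request(job, base_url=BASE_URL):
    """Build (but do not send) the POST request that creates or updates a job."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        method="POST",
        # Assumption: the header is not documented on this page.
        headers={"Content-Type": "application/json"},
    )


def create_job(job, base_url=BASE_URL):
    """Send the request and return (status code, parsed JSON body).

    urlopen raises urllib.error.HTTPError for error statuses such as
    400 Bad Request, so callers should be prepared to catch it.
    """
    with urllib.request.urlopen(build_create_request(job, base_url)) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))
```

Splitting the request construction from the sending step makes it possible to inspect the request without a running server.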

Specific job

Use a GET request to retrieve the definition of a specific job. Use DELETE for deleting a job.

Supported operations:

  • GET: get the definition of the given job.
    • You can set the URL parameter returnDetails to true to return additional information that might have been provided when creating the job. If the parameter is omitted or set to false, only the relevant information (see above) is returned.
  • DELETE: delete the given job definition.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/jobs/<job-name>/
  • Allowed methods:
    • GET
    • DELETE
    • POST: see SMILA/Documentation/JobRuns#Start_job_run
  • Response status codes:
    • 200 OK: Upon successful execution (GET, DELETE). If the job definition to be deleted does not exist you will get a 200 anyway.
    • 404 Not Found: If an undefined job name is used, an HTTP 404 Not Found with an error message in the response body is returned.

Examples:

To get a specific job definition:

GET /smila/jobmanager/jobs/myJob/

The result would be (for a job that has not yet been started):

HTTP/1.x 200 OK

{
  "definition": {
    "name":"myJob",
    "timestamp": "2011-08-12T14:49:48.862+0200",
    "parameters":{
      "index": "wikipedia",
      "store": "wikidocs"
    },
    "workflow":"myWorkflow"
  },
  "runs": {
    "current": { },
    "history": [ ]
  }
}

As job definitions cannot currently be predefined in the system configuration, they will always contain a "timestamp" but never a "readOnly" flag when retrieved from the system.
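The GET and DELETE requests for a specific job differ only in the method and the optional returnDetails parameter. The following Python sketch builds the corresponding URLs and request objects. It assumes a SMILA instance at localhost:8080; the function names are hypothetical, and URL-encoding the job name is a precaution that this page does not mention.

```python
import urllib.parse
import urllib.request

# Assumption: SMILA runs locally on the default port used in the examples.
JOBS_URL = "http://localhost:8080/smila/jobmanager/jobs/"


def job_url(name, return_details=None, jobs_url=JOBS_URL):
    """Build the URL of a specific job, optionally with returnDetails set."""
    url = jobs_url + urllib.parse.quote(name, safe="") + "/"
    if return_details is not None:
        flag = "true" if return_details else "false"
        url += "?" + urllib.parse.urlencode({"returnDetails": flag})
    return url


def build_job_request(name, method="GET", return_details=None):
    """Build (but do not send) a GET or DELETE request for a job definition."""
    if method not in ("GET", "DELETE"):
        raise ValueError("this sketch only covers GET and DELETE")
    return urllib.request.Request(job_url(name, return_details), method=method)


if __name__ == "__main__":
    print(job_url("myJob", return_details=True))
```

Passing such a request to urllib.request.urlopen sends it; for a GET with an undefined job name, urlopen raises urllib.error.HTTPError carrying the 404 status.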
