Difference between revisions of "SMILA/Documentation/WorkerAndWorkflows"

Revision as of 07:56, 4 July 2011

Workers and Workflows

Workers

Worker Definition

The worker definition is provided with the software. It defines the default workers and must not be changed by the user.

Worker definitions cannot be added at runtime. They describe the worker behaviour that the job manager needs in order to generate appropriate tasks and data objects.

Required input and output data is described in terms of bucket types as defined before. Additional string parameters may be needed and must be defined when the worker is used in a workflow. Values for these parameters must be added as properties to the tasks created for this worker. (That is, the names of all input and output slots have to be explicitly linked to the names of existing buckets by the workflow that references the workers as actions, see below. The workflow does not need to define output slots that are marked as optional.)

As an advanced feature, output slots can be arranged into groups. Groups describe which slots must or must not be used together: a single workflow action cannot use slots from different groups, only slots of a single group plus slots that are not marked with a group (these belong to every group implicitly). When using groups, the rules concerning optional and mandatory slots are as follows:

  • A non-optional slot without a group must always be connected to a bucket.
  • An optional slot without a group is allowed in combination with any group slot.
  • If a group is used, all non-optional slots of the same group must be connected to a bucket, too.
  • If each group contains at least one non-optional slot, at least one group must be connected. It's not possible to use only the groupless slots then.
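
The group rules above can be sketched as a small check. This is a minimal illustration, not SMILA code: the slot representation (dictionaries with `name`, `group`, and `optional` keys) and the function name `check_output_slots` are assumptions made for this example only.

```python
def check_output_slots(slots, connected):
    """Return a list of rule violations for one workflow action.

    slots:     hypothetical slot descriptions, e.g.
               {"name": "out", "group": None, "optional": False}
    connected: names of the output slots the action links to buckets
    """
    errors = []
    by_name = {slot["name"]: slot for slot in slots}

    unknown = sorted(set(connected) - set(by_name))
    if unknown:
        errors.append(f"unknown slots: {unknown}")

    # Groups touched by the connected slots (groupless slots do not count).
    used_groups = {
        by_name[name]["group"]
        for name in connected
        if name in by_name and by_name[name]["group"] is not None
    }
    if len(used_groups) > 1:
        errors.append(f"slots from different groups used: {sorted(used_groups)}")

    for slot in slots:
        if slot["optional"] or slot["name"] in connected:
            continue
        if slot["group"] is None:
            # A non-optional slot without a group must always be connected.
            errors.append(f"mandatory slot {slot['name']!r} is not connected")
        elif slot["group"] in used_groups:
            # A used group needs all of its non-optional slots connected.
            errors.append(
                f"mandatory slot {slot['name']!r} of used group "
                f"{slot['group']!r} is not connected"
            )

    # If every group has a non-optional slot, one group must be used.
    groups = {slot["group"] for slot in slots if slot["group"] is not None}
    every_group_has_mandatory = bool(groups) and all(
        any(not s["optional"] for s in slots if s["group"] == group)
        for group in groups
    )
    if every_group_has_mandatory and not used_groups:
        errors.append("at least one group must be connected")

    return errors
```

An empty result means the set of connected slots satisfies all four rules; each entry in a non-empty result names one violated rule.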

The worker properties in detail:

  • name is mandatory.
  • modes is optional and describes the mode of the worker
    • bulkSource: Can start a workflow, does not need input data. A task for this worker is created on demand when the worker requests it (in-progress tasks only)
    • autoCommit: When the worker dies while working on a task (that is, it no longer sends keep-alives), the job manager commits the started bulks and triggers the follow-up actions; the task is not rolled back.
  • parameters is optional and describes the parameters needed to configure the worker
  • taskGenerator is optional and configures a piece of code (OSGi service) that is used to create the actual tasks after changes in the input buckets. Can be used to create multiple tasks for a single change event, or to filter events: If the generator does not actually create a task for the event, the action is cancelled.

Monitor workers

All workers

Use a GET request to retrieve monitoring information for all defined workers.

Supported operations:

  • GET: Returns the workers information.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/workers
  • Allowed methods:
    • GET
  • Response status codes:
    • 200 OK: Upon successful execution (GET).

Specific worker

Use a GET request to retrieve monitoring information for a specific worker.

Supported operations:

  • GET: Returns the worker information for the given worker name.

Usage:

  • URL: http://<hostname>:8080/smila/jobmanager/workers/<worker-name>
  • Allowed methods:
    • GET
  • Response status codes:
    • 200 OK: Upon successful execution (GET).
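
Both monitoring requests can be issued from a short script. The following is a sketch using only the Python standard library; the function names are made up for this example, and decoding the response body as UTF-8 text is an assumption, since this page does not describe the response format.

```python
from urllib.parse import quote
from urllib.request import urlopen


def workers_url(hostname, worker_name=None, port=8080):
    """Build the monitoring URL for all workers or for one worker."""
    base = f"http://{hostname}:{port}/smila/jobmanager/workers"
    if worker_name is None:
        return base
    # Percent-encode the name so it is safe as a single path segment.
    return f"{base}/{quote(worker_name, safe='')}"


def fetch_worker_info(hostname, worker_name=None, timeout=10.0):
    """Send the GET request and return the response body as text.

    Assumption: the body is UTF-8 text. urlopen raises HTTPError
    for 4xx/5xx responses, so only other non-200 codes are checked here.
    """
    with urlopen(workers_url(hostname, worker_name), timeout=timeout) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected status: {response.status}")
        return response.read().decode("utf-8")


# Usage (requires a running SMILA instance; "myWorker" is a placeholder):
#   print(fetch_worker_info("localhost"))
#   print(fetch_worker_info("localhost", "myWorker"))
```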

Workflows

Workflow Definition

Describes the work to be done by associating buckets to workers. All input and output slots of workers must be associated to buckets. The types of buckets must match the required bucket types described in the worker definition.

A workflow run starts with the start-action. The order of the other actions is determined by their inputs and outputs.
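
To illustrate how the order follows from inputs and outputs, the sketch below derives, for each action, the actions that consume one of its output buckets. It is an illustration only, not the job manager's implementation; it uses the `startAction`, `actions`, `worker`, `input`, and `output` fields from the workflow description on this page and assumes that each worker appears only once in the workflow. The worker and bucket names in the usage example are hypothetical.

```python
def follow_up_actions(workflow):
    """Map each worker name to the workers that read its output buckets."""
    actions = [workflow["startAction"], *workflow.get("actions", [])]
    result = {}
    for producer in actions:
        produced = set(producer.get("output", {}).values())
        result[producer["worker"]] = [
            consumer["worker"]
            for consumer in actions
            # A consumer follows the producer if they share a bucket.
            if consumer is not producer
            and produced & set(consumer.get("input", {}).values())
        ]
    return result


# Hypothetical workflow: "crawler" writes the bucket that "indexer" reads.
example = {
    "name": "myWorkflow",
    "startAction": {"worker": "crawler", "output": {"records": "crawledRecords"}},
    "actions": [{"worker": "indexer", "input": {"records": "crawledRecords"}}],
}
print(follow_up_actions(example))
```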


Description of a workflow:

  • name
    • MANDATORY
    • The name of a workflow
  • parameters (MAP)
    • The parameters defined within this workflow as a map
  • startAction (MAP)
    • MANDATORY
    • the starting action of this workflow
    • there can be only one starting action
  • actions (LIST of MAPs)
    • the non-starting actions of this workflow

Description of an action

  • worker
    • the name of an existing worker definition
  • parameters
    • the parameters the workflow defines for this worker (not for the buckets the worker uses!)
  • input (MAP)
    • The mapping of the worker's named input slots (KEY) to an existing bucket definition (VALUE)
    • all of the worker's named input slots have to be resolved against an existing bucket of the same type.
  • output (MAP)
    • The mapping of the worker's named output slots (KEY) to an existing bucket definition (VALUE)
    • all of the worker's named output slots have to be resolved against an existing bucket of the same type.
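
The constraints listed above can be checked mechanically. The sketch below is one possible reading, not SMILA's own validation: the `workers` and `buckets` lookup tables are hypothetical stand-ins for the worker and bucket definitions, and for brevity every slot is treated as mandatory (optional slots and slot groups are ignored).

```python
def validate_workflow(workflow, workers, buckets):
    """Return a list of problems found in a workflow definition.

    workers: hypothetical table, worker name -> {"input": {slot: bucket type},
                                                 "output": {slot: bucket type}}
    buckets: hypothetical table, bucket name -> bucket type
    """
    errors = []
    if not workflow.get("name"):
        errors.append("workflow: 'name' is mandatory")
    if "startAction" not in workflow:
        errors.append("workflow: 'startAction' is mandatory")

    actions = []
    if "startAction" in workflow:
        actions.append(workflow["startAction"])
    actions.extend(workflow.get("actions", []))

    for action in actions:
        worker_name = action.get("worker")
        worker = workers.get(worker_name)
        if worker is None:
            errors.append(f"unknown worker: {worker_name!r}")
            continue
        for direction in ("input", "output"):
            mapping = action.get(direction, {})
            # Every slot of the worker must map to a bucket of the same type.
            for slot, required_type in worker.get(direction, {}).items():
                bucket = mapping.get(slot)
                if bucket is None:
                    errors.append(f"{worker_name}: {direction} slot {slot!r} has no bucket")
                elif bucket not in buckets:
                    errors.append(f"{worker_name}: unknown bucket {bucket!r}")
                elif buckets[bucket] != required_type:
                    errors.append(
                        f"{worker_name}: bucket {bucket!r} has type "
                        f"{buckets[bucket]!r}, expected {required_type!r}"
                    )
    return errors
```

An empty list means the definition passed these checks; a real deployment would also have to honour optional slots and slot groups.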


Parameters

We have two kinds of parameters in the workflow definition:

  • Global workflow parameters: Parameters that are set globally for the whole workflow
  • Local worker parameters: Parameters that are set locally in the workflow for a single worker

Local worker parameters do not affect the buckets the worker uses. The parameters used in the data object types are therefore resolved using global parameters only (job parameters or global workflow parameters).

Sample:

    {
      "name": "myWorkflow",
      "parameters":
        {
          "myGlobalParam": "..."
        },
      "startAction":
        {
          "parameters":
            {
              "myLocalParam": "..."
            },
          "worker": "myWorker",
          ...

Data object types and workers define parameter variables: ${...}

  • Required data object type variables that are not set in a bucket parameter must be set as either a workflow parameter or a job parameter.
  • Required worker variables must be set as either a workflow parameter or a job parameter.
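
Resolving such `${...}` variables could look like the sketch below. It is not SMILA's implementation: the function name is made up, and the precedence used here (bucket parameters first, then job parameters, then workflow parameters) is an assumption, because this page only says where a variable may be set and not which value wins.

```python
import re

# Matches ${name} and captures the variable name.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve_variables(template, workflow_params, job_params, bucket_params=None):
    """Replace every ${name} in template with its parameter value.

    Assumed precedence: bucket parameters override job parameters,
    which override workflow parameters.
    """
    merged = {**workflow_params, **job_params, **(bucket_params or {})}

    def replace(match):
        name = match.group(1)
        if name not in merged:
            # A required variable must be set somewhere.
            raise KeyError(f"parameter variable ${{{name}}} is not set")
        return str(merged[name])

    return _VARIABLE.sub(replace, template)


# Hypothetical names: "store" comes from the workflow, "index" from the job.
print(resolve_variables("${store}/${index}", {"store": "a", "index": "b"}, {"index": "c"}))
```

For worker variables, `bucket_params` would be left out, since only workflow and job parameters apply to them.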
