SMILA/Documentation/JobManager

Note: Available since SMILA 0.9!


Job Manager

This page is work in progress.

The Job Manager controls the processing logic of asynchronous workflows in SMILA by regulating the Task Manager, which in turn generates tasks and decides which task should be processed by which worker and when.

Understanding the Entities of the Job Manager

The Job Manager handles the following entities:

  • Workflows
  • Workers
  • Buckets
  • Data object types
  • Jobs and job runs

The definition of an asynchronous workflow in SMILA consists of one or multiple steps, called actions. Each action specifies the worker that does the actual processing and connects the slots of this worker to concrete buckets. A slot describes part of the worker's input or output behavior: input slots define the type of data objects the worker is able to process, while output slots define the type of data objects the worker will produce. Hence, to use a worker in a workflow, you have to assign buckets of the correct type to its slots. A bucket is simply a logical data container grouping data objects of the same type. In addition, parameters can be set at different levels of a workflow to configure its behavior: global parameters are valid for all actions of the workflow, whereas parameters set at the action level apply only to the respective worker.
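
For illustration, the following sketch shows roughly what an asynchronous workflow definition could look like. The worker, slot, and bucket names as well as the parameter values are made up for this example; the exact JSON structure and the available workers are described on the respective documentation pages.

  {
    "name": "exampleWorkflow",
    "parameters": {
      "store": "defaultStore"
    },
    "actions": [
      {
        "worker": "exampleProcessingWorker",
        "parameters": {
          "mode": "fast"
        },
        "input": {
          "recordsInput": "recordsToProcess"
        },
        "output": {
          "recordsOutput": "processedRecords"
        }
      }
    ]
  }

Here "store" is a global parameter valid for all actions, "mode" is an action-level parameter for the single worker, and the slots "recordsInput" and "recordsOutput" are connected to the buckets "recordsToProcess" and "processedRecords".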

To run a certain workflow in SMILA, you have to create a job definition that references the desired workflow. As in workflow definitions, further parameters can be set here, e.g. to adapt the referenced workflow to a certain application. For example, when using the same workflow in two jobs, you could set a different value for the store parameter in each job to make sure that the data is written to different places.
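
A job definition referencing such a workflow could then look like the following sketch; again, all names and the value of the store parameter are purely illustrative:

  {
    "name": "exampleJob",
    "workflow": "exampleWorkflow",
    "parameters": {
      "store": "applicationOneStore"
    }
  }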

A job definition alone does not make the system do anything. The job must first be started, which creates a so-called job run. Job runs in turn consist of one or multiple workflow runs, where a workflow run refers to one traversal of the respective workflow, e.g. one traversal for each object in an input bucket.
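
Jobs are started via the Job Manager's HTTP REST API. The following request is only a sketch assuming a local default installation and the illustrative job name from above; the exact host, port, and resource paths are documented on the Job Manager REST API pages:

  POST http://localhost:8080/smila/jobmanager/jobs/exampleJob/

The response to such a request typically contains the ID of the newly created job run, which is needed for monitoring or finishing the run later on.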

Job runs provide so-called job run data which can be used to monitor the current job processing. Job runs can also be canceled in case of problems. Except for the so-called "runOnce" jobs, which are finished automatically, job runs must be finished manually when they should no longer react to changes.
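
Monitoring and controlling a job run also happens via the REST API. The requests below are again only a sketch: the job run ID is a placeholder and the exact resource names may differ between SMILA versions, so please check the REST API documentation.

  GET  http://localhost:8080/smila/jobmanager/jobs/exampleJob/<jobRunId>/        (returns the job run data for monitoring)
  POST http://localhost:8080/smila/jobmanager/jobs/exampleJob/<jobRunId>/finish  (finishes the job run manually)
  POST http://localhost:8080/smila/jobmanager/jobs/exampleJob/<jobRunId>/cancel  (cancels the job run in case of problems)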

Using the Job Manager
