


Triquetrum/Task-based processing

Revision as of 12:47, 1 March 2016 by Cxbrooks.gmail.com (Talk | contribs) (Further Reading: Fixed bib entries: Lecture Notes in Computer Science)

This is a work in progress...

Intro

The goal of the Task-based processing model is to represent a work item, to trace the progress of its processing, and to represent the results it produces.

All of this must be captured in a model with sufficiently detailed contents to support two essential goals:

  • facilitate the definition of service APIs and implementations for working with tasks in a process-oriented system
  • allow storing and consulting a full trace of each significant step in a process, including timing, success and failure events etc.

TBD

Related Triquetrum components

Triquetrum provides an API for task-based processing and will also provide several implementation options. Currently the following is available:

  • org.eclipse.triquetrum.processing.api : contains the API of the Task-based model and of the main related services
  • org.eclipse.triquetrum.processing.service.impl : default/generic implementations of some of the services
  • org.eclipse.triquetrum.processing.model.impl.memory : a simple non-persistent implementation of the Task model, based on plain Java beans
  • org.eclipse.triquetrum.processing.test : unit tests for the Task model and processing services, also serving as code examples

The Task model

As the title implies, the model is built around the concept of a Task, which is the core entity representing an item of work that must be performed.

The definition of a work item in this model consists of the following main elements:

  • a task type: the main differentiator on the kind of work to be performed
  • a set of parameters a.k.a. attributes: these define the data needed to perform the desired 'variation' of the given task type
  • initial set of traceability data such as creation timestamp, the task's initiator, a correlation ID defined by the initiator, the initial task status etc.
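The three elements above can be sketched in plain Java. This is a minimal illustration only; the class and field names are hypothetical and do not reflect the actual org.eclipse.triquetrum.processing.api types.

```java
import java.time.Instant;
import java.util.Map;

// Hypothetical sketch of a work-item definition; names are illustrative,
// not the actual Triquetrum API.
public class TaskSketch {
    final String type;                    // task type: the main differentiator
    final Map<String, String> attributes; // parameters defining this 'variation'
    final Instant creationTS;             // traceability: creation timestamp
    final String initiator;               // traceability: who created the task
    final String correlationId;           // traceability: initiator-defined ID
    String status;

    TaskSketch(String type, Map<String, String> attributes,
               String initiator, String correlationId) {
        this.type = type;
        this.attributes = attributes;
        this.creationTS = Instant.now();
        this.initiator = initiator;
        this.correlationId = correlationId;
        this.status = "CREATED";          // initial task status
    }

    public static void main(String[] args) {
        TaskSketch t = new TaskSketch("imageConversion",
                Map.of("format", "png", "dpi", "300"),
                "webUI", "req-42");
        System.out.println(t.type + " " + t.status);
    }
}
```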

After creating a Task with its attributes, an initiator does not perform the actual work itself but hands the task to a service broker. The broker is responsible for finding a matching service implementation that is able to perform the actual processing (or to delegate it to further specialized processing systems, external or internal to the initiator's system). The selected service becomes the executor of the task.
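The broker pattern described above can be sketched as follows. The interface and method names are assumptions for illustration, not the real Triquetrum service contracts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of a service broker that matches a task to a
// registered service able to process it; illustrative only.
public class BrokerSketch {
    interface TaskProcessingService {
        boolean canProcess(String taskType);
        String process(String taskType);   // returns a result summary
    }

    static class Broker {
        private final List<TaskProcessingService> services = new ArrayList<>();

        void register(TaskProcessingService s) { services.add(s); }

        // Find the first registered service able to handle the task type
        // and let it execute the task.
        Optional<String> submit(String taskType) {
            return services.stream()
                    .filter(s -> s.canProcess(taskType))
                    .findFirst()
                    .map(s -> s.process(taskType));
        }
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.register(new TaskProcessingService() {
            public boolean canProcess(String t) { return t.equals("echo"); }
            public String process(String t) { return "done:" + t; }
        });
        System.out.println(broker.submit("echo").orElse("no service found"));
    }
}
```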

During task submission and processing, the task may change status several times. Such status changes, together with other potentially relevant processing events, are stored with their timestamps and relevant data in an event stream, with the Task as context.
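A per-task event stream of this kind can be sketched as an append-only log, keyed on the task. Again, all names here are hypothetical, not the Triquetrum event API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an event stream with the Task as context;
// illustrative only.
public class EventStreamSketch {
    record TaskEvent(String taskId, String topic, Instant timestamp) {}

    static class EventLog {
        private final List<TaskEvent> events = new ArrayList<>();

        // Each status change or processing event is appended with its
        // timestamp, linked to the task it belongs to.
        void record(String taskId, String topic) {
            events.add(new TaskEvent(taskId, topic, Instant.now()));
        }

        // The full trace for one task, in the order the events occurred.
        List<TaskEvent> forTask(String taskId) {
            return events.stream()
                    .filter(e -> e.taskId().equals(taskId))
                    .toList();
        }
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.record("task-1", "SUBMITTED");
        log.record("task-1", "STARTED");
        log.record("task-1", "FINISHED");
        System.out.println(log.forTask("task-1").size());
    }
}
```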

When a task has been processed successfully, there will typically be results available and the task reaches a final success status. Depending on the application domain and the concrete task, results may be simple success indicators, large and/or complex data sets, or anything in between. Triquetrum offers a simple model to represent results as blocks of named values, linked to the task that produced them. Not every type of result is suited to being represented or stored in such a structure. Often the raw result data is stored externally, e.g. in files. In such cases the Triquetrum result items can be used to store paths or other references to the externally stored data.
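A result block of named values linked to its task, including an external-data reference, can be sketched like this. The class names and the sample path are hypothetical, not the actual Triquetrum result API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a block of named result values linked to the
// task that produced them; illustrative only.
public class ResultSketch {
    static class ResultBlock {
        final String taskId;   // link back to the producing task
        final Map<String, Object> items = new LinkedHashMap<>();

        ResultBlock(String taskId) { this.taskId = taskId; }

        void put(String name, Object value) { items.put(name, value); }
    }

    public static void main(String[] args) {
        ResultBlock r = new ResultBlock("task-1");
        // A small result value can be stored directly...
        r.put("lineCount", 128);
        // ...while large raw data stays external; store only a reference.
        r.put("rawDataPath", "/data/results/task-1.h5");
        System.out.println(r.items.get("rawDataPath"));
    }
}
```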

A simplified UML class diagram shows the main elements of the Task model:

[Image: Triq processing Task-related classes.jpg]

Core services

TBD

[Image: Triq processing services.jpg]

Initiating a task

TBD

[Image: Create and submit a task.jpg]

Further Reading

Task-based processing is an example of a coordination language, which composes separate computational elements into a larger whole.

Collecting and storing processing events is a form of data provenance, where data sources and processes are documented so that an experiment may be reproduced. Kepler, which also uses Ptolemy II as an execution engine, supports data provenance.
