= Buckets and Data Object Types =
Please note that job manager element names (like buckets and data object types) must conform to the job manager naming convention:

* Names must only consist of the following characters: <b>a-zA-Z0-9._-</b>

If they do not conform, they won't be accessible in SMILA:

* Pushing elements with invalid names will result in a 400 Bad Request.
* Predefined elements with invalid names won't be loaded; a warning will be logged in the SMILA.log file.

E.g.:

<source lang="text">
... WARN  ...  internal.DefinitionPersistenceImpl            - Error parsing predefined data object type definitions from configuration area
org.eclipse.smila.common.exceptions.InvalidDefinitionException: Value 'record?store' in field 'name' is not valid: A name must match pattern ^[a-zA-Z0-9-_\.]+$.
</source>
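The naming pattern from the log message above can also be checked client-side before pushing a definition. The following Python sketch is illustrative only (the helper function is not part of SMILA):

<source lang="python">
import re

# Name pattern taken from the SMILA log message above.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9\-_.]+$")

def is_valid_element_name(name: str) -> bool:
    """Return True if the name conforms to the job manager naming convention."""
    return bool(NAME_PATTERN.match(name))

print(is_valid_element_name("myBucket"))      # True
print(is_valid_element_name("record?store"))  # False: '?' is not allowed
</source>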
  
 
== Buckets ==
A bucket is a data container comprising logically grouped data objects that are to be processed by some asynchronous workflow in SMILA. All data objects in a bucket are physically located in the same store and therefore share the same naming convention. For example, data objects could be sequences of records (so-called "record bulks") or indices. Also, the contents within one bucket have the same structure, as determined by the bucket's data object type. The actual data object types from which you can select when creating a bucket are predefined by the software and cannot be changed during runtime.

An important aspect of buckets is that they can be ''persistent'' or ''transient'': Objects in transient buckets are deleted automatically when a worker finishes a successful task ([[SMILA/Documentation/WorkerAndWorkflows#Non-forking_workflows|non-forking workflows]]) or when the workflow run has ended (forking workflows). Objects in persistent buckets survive until they are deleted explicitly or another workflow uses them. Whereas persistent buckets have to be created explicitly via the respective REST/JSON API call (see below) before they can be used in a workflow, transient ones are generated automatically by the system based on the definition of the respective workflow and need not and also cannot be created explicitly via this API. Similarly, a store referenced by a transient bucket is created automatically by the Job Manager, but a store referenced by a persistent bucket must be created beforehand.

Persistent buckets can have [[SMILA/Documentation/JobParameters|parameters]] that are required for the referenced data object type or for the involved workers to operate when the bucket is referenced in a workflow. They can be set in the bucket definition itself, in the global section of the respective workflow definition, or later in the job definition.

Buckets can have additional information (e.g. comments or layout information for a configuration tool) apart from name, type, and parameters. A plain GET request will only display information relevant to the job processing system. To retrieve the additional info that is present in the JSON file or has been posted with the bucket, add <tt>?returnDetails=true</tt> as a request parameter.
  
=== List, create, and modify buckets ===

==== All buckets ====
  
Use a GET request to list all persistent buckets. Transient buckets are not shown in the list.

Use POST to add new persistent buckets or to edit them. Transient buckets cannot be created explicitly via this API.

'''Supported operations:'''
*GET: Returns a list of all buckets. If there are no buckets defined, you will get an empty list. Only persistent buckets will be returned here; transient buckets generated dynamically for a workflow will not be returned.
*POST: Add a new persistent bucket or edit an existing one. The bucket definition must at least contain the name and the data object type of the bucket. Bucket parameters are optional; they are used in the data object type to build store and object names in this bucket. You can only create buckets with data object types that contain a "persistent" definition (see below for a description of available data object types). If the bucket already exists, it will be updated after successful validation. However, the changes will not apply until the next job run, i.e. the current job run is not influenced by the changes.

'''Usage:'''
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/buckets/</nowiki></tt>
*Allowed methods:
**GET
**POST
 
*Response status codes:
**200 OK: Upon successful execution (GET).
**201 CREATED: Upon successful execution (POST). The response is a JSON object giving the name and URI of the created bucket.
**400 Bad Request: If the parameters in the bucket definition would result in incorrect store names, or if the bucket's name is invalid. The response is an error message in JSON format.
'''Examples:'''

To list all buckets:

<pre>
GET /smila/jobmanager/buckets/
</pre>

The result would be:

<pre>
HTTP/1.x 200 OK

{
  "buckets":[
      {
        "name":"myBucket",
        "url":"http://localhost:8080/smila/jobmanager/buckets/myBucket/"
      },
      {
        "name":"myOtherBucket",
        "url":"http://localhost:8080/smila/jobmanager/buckets/myOtherBucket/"
      }
  ]
}
</pre>

To create a bucket:

<pre>
POST /smila/jobmanager/buckets/

{
  "name": "myBucket",
  "type": "recordBulks",
  "comment": "A bucket I created all by myself.",
  "parameters":
  {
    "store": "mystore"
  }
}
</pre>

Note that this definition contains an unspecific "comment" field.

The result would be:

<pre>
HTTP/1.x 201 CREATED

{
  "name" : "myBucket",
  "timestamp": "2011-08-15T10:53:42+0200",
  "url" : "http://localhost:8080/smila/jobmanager/buckets/myBucket/"
}
</pre>
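The request body above can also be assembled programmatically. The following Python sketch builds and validates such a bucket definition before sending it; the helper function is hypothetical and not part of SMILA:

<source lang="python">
import json
import re

# Name pattern from the naming convention section above.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9\-_.]+$")

def build_bucket_definition(name, data_object_type, parameters=None, comment=None):
    """Build the JSON body for 'POST /smila/jobmanager/buckets/' (hypothetical helper)."""
    if not NAME_PATTERN.match(name):
        raise ValueError("invalid bucket name: %r" % name)
    definition = {"name": name, "type": data_object_type}
    if comment is not None:
        definition["comment"] = comment
    if parameters:
        definition["parameters"] = dict(parameters)
    return json.dumps(definition, indent=2)

body = build_bucket_definition("myBucket", "recordBulks",
                               parameters={"store": "mystore"},
                               comment="A bucket I created all by myself.")
print(body)
</source>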
  
 
==== Specific buckets ====
  
Use a GET request to get the definition of a bucket. Use DELETE to delete a bucket.

'''Supported operations:'''
*GET: Returns the definition of the given bucket. To retrieve additional info apart from name, type, and parameters, add <tt>returnDetails=true</tt> as a request parameter.
*DELETE: Deletes a bucket. Buckets cannot be deleted when their data object type does not have a definition for transient mode and the bucket is used in a current workflow definition.
 
 
'''Usage:'''
*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/buckets/<bucket-name>/</nowiki></tt>
*Allowed methods:
**GET
**DELETE
*Response status codes:
**200 OK: Upon successful execution (GET, DELETE). When trying to delete a bucket that does not exist, the call will be ignored and 200 OK is returned nevertheless.
**404 Not Found: In case <bucket-name> does not exist (GET).
**400 Bad Request: If the bucket is referenced by an existing workflow. Also, a bucket that is predefined in the configuration cannot be removed.
  
'''Examples:'''

To get a bucket definition:

<pre>
GET /smila/jobmanager/buckets/myBucket/
</pre>

The result would be:

<pre>
HTTP/1.x 200 OK

{
  "name" : "myBucket",
  "timestamp": "2011-08-15T11:55:00.482+0200",
  "type" : "recordBulks",
  "parameters" : {
    "store" : "mystore"
  }
}
</pre>

To get the complete bucket definition with additional data:

<pre>
GET /smila/jobmanager/buckets/myBucket/?returnDetails=true
</pre>
  
The result would now be:

<pre>
HTTP/1.x 200 OK

{
  "name" : "myBucket",
  "timestamp": "2011-08-15T11:55:00.482+0200",
  "type" : "recordBulks",
  "comment": "A bucket I created all by myself.",
  "parameters" : {
    "store" : "mystore"
  }
}
</pre>

To delete a bucket:

<pre>
DELETE /smila/jobmanager/buckets/myBucket/
</pre>

The result would be:

<pre>
HTTP/1.x 200 OK
</pre>
== Data Object Types ==

The definitions of the data object types available in the system are provided with the software and cannot be added or changed during runtime.

They contain parameter variables denoted by "${...}". System parameter variables, whose names start with "_" (underscore), are resolved automatically. Only parameters on root level with non-complex values are used for resolving. Values for other variables must be set as a bucket parameter or in a higher-level definition, e.g. as a workflow or job parameter. Where a type specifies both persistent and transient data, you only have to resolve the parameter variables defined for the respective mode.

Data object type definitions can have additional information (e.g. comments). A plain GET request will only display information relevant to the job processing system. To retrieve the additional info present in the definitions of the JSON file, add <tt>returnDetails=true</tt> as a request parameter.

=== List data object types ===
 
==== All data object types ====

Use a GET request to retrieve information about all data object types. This API is read-only: you cannot add or modify data object types during runtime.

'''Supported operations:'''

*GET: Returns a list of all data object types. To obtain additional information (if present), add <tt>returnDetails=true</tt> as a request parameter.

'''Usage:'''

*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/dataobjecttypes/</nowiki></tt>
*Allowed methods:
**GET
*Response status codes:
**200 OK: Upon successful execution.

'''Examples:'''

To list all data object types:

<pre>
GET /smila/jobmanager/dataobjecttypes/
</pre>

The result would be:

<pre>
HTTP/1.x 200 OK

{
  "dataObjectTypes":[
      {
        "name":"recordBulks",
        "url":"http://localhost:8080/smila/jobmanager/dataobjecttypes/recordBulks/"
      }
  ]
}
</pre>
  
 
==== Specific data object type ====

Use a GET request to retrieve information about a specific data object type.

'''Supported operations:'''

*GET: Returns the definition of a specific data object type.

'''Usage:'''

*URL: <tt><nowiki>http://<hostname>:8080/smila/jobmanager/dataobjecttypes/<dataobjecttype-name>/</nowiki></tt>
*Allowed methods:
**GET
*Response status codes:
**200 OK: Upon successful execution.

'''Examples:'''

To get the definition of one data object type:

<pre>
GET /smila/jobmanager/dataobjecttypes/recordBulks/
</pre>

The result would be:

<pre>
HTTP/1.x 200 OK

{
  "name":"recordBulks",
  "readOnly": true,
  "persistent":{
      "object":"${_bucketName}/${_uuid}",
      "store":"${store}"
  },
  "transient":{
      "object":"${_bucketName}/${_uuid}",
      "store":"${tempStore}"
  }
}
</pre>

As data object types cannot be defined using an API but are pre-configured in the system configuration, they are all marked as "readOnly". See [[SMILA/Documentation/JobManagerConfiguration]] for details.

'''Available data object types:'''

Currently, there is only one data object type available, namely "recordBulks" (see its definition above).

The "recordBulks" type allows for sequences of records (record bulks). It is the standard intermediate object type in workflows, meaning there can be workers in a workflow that use objects of this type as their input data as well as workers that write objects of the same type as their result.

The "recordBulks" type allows both transient and persistent data. If a persistent bucket uses this type, you have to set the value of the <tt>${store}</tt> variable. Vice versa, when a transient bucket uses this type, you have to set the value of the <tt>${tempStore}</tt> variable. The variables <tt>${store}</tt> and <tt>${tempStore}</tt> define the name of the object store in which the respective data objects are stored. They can be set in the bucket definition itself, as a global workflow parameter, or as a job parameter. However, they cannot be set as a local worker parameter (see [[SMILA/Documentation/JobParameters]]).

The system variable <tt>${_uuid}</tt> need not be set by the user; it is set automatically by the system. New UUIDs are only generated when creating new bulks. When transforming existing bulks, they are reused.
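To illustrate how the name patterns resolve, the following Python sketch mimics the ${...} substitution for the persistent "recordBulks" definition shown above. It is illustrative only, not SMILA code; the sample parameter values are assumptions:

<source lang="python">
import uuid
from string import Template

# The persistent part of the "recordBulks" definition from above.
persistent = {"store": "${store}", "object": "${_bucketName}/${_uuid}"}

# System variables (leading "_") are filled in by the system;
# ${store} must come from the bucket, workflow, or job parameters.
variables = {
    "store": "mystore",          # user-provided bucket parameter (assumed value)
    "_bucketName": "myBucket",   # system variable: name of the bucket
    "_uuid": str(uuid.uuid4()),  # generated for a new bulk, reused when transforming
}

store_name = Template(persistent["store"]).substitute(variables)
object_name = Template(persistent["object"]).substitute(variables)
print(store_name)   # mystore
print(object_name)  # myBucket/<uuid>
</source>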

Latest revision as of 06:24, 19 July 2013
