
Difference between revisions of "SMILA/Documentation/Usage of Blackboard Service"

m (Usage of Path with Blackboard methods)
(What is the blackboard?)
(12 intermediate revisions by 3 users not shown)
== What is Blackboard ==
The purpose of the Blackboard service is to manage SMILA record data while it is being processed in a SMILA component (Connectivity, Workflow Processor). Complete record data is stored only on the blackboard and is not pushed through the workflow engine itself. The Blackboard service hides the handling of record persistence from the services.

Clients should generally manipulate records through the Blackboard API methods so that the records stay completely under the control of the blackboard.

== Usage of Blackboard Service ==
=== Record lifecycle on the Blackboard ===
A record can be put onto the blackboard with one of the following operations:
* create(Id);
*: Creates a new record with the given Id. No data is loaded from persistence. If a record with this Id already exists in the storages, it will be overwritten when the created record is committed. Used, for example, by Connectivity to initialize the record from incoming data.
* load(Id);
*: Loads the record data for the given Id from persistence. Used by a client to indicate that it wants to process this record.
* split(Id, String);
*: Creates a fragment of the given record, i.e. the record content is copied to a new Id derived from the given one by adding a fragment name (see Id Concept for details).
* setRecord(Record);
*: Puts the record on the blackboard, saves the record attachments to BinStorage and replaces the actual attachment values in the record with null.
* synchronize(Record);
*: Assumes that a record with the same Id as the given record already exists on the blackboard or in storage. Loads the record from storage if needed and updates its properties with the properties of the given record.

A record is removed from the blackboard with one of these operations:
* commit(Id);
*: Saves the record and its attachments to the storages and removes the record from the blackboard.
* invalidate(Id);
*: Removes the record from the blackboard. If the record was newly created (not overwritten) on the blackboard, it is removed completely.
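
The lifecycle operations above can be illustrated with a small toy model. This is only a sketch under assumptions: <tt>ToyBlackboard</tt>, its <tt>storage</tt> map and the <tt>get</tt> accessor are hypothetical stand-ins invented for illustration, records are modelled as plain maps, and <tt>split</tt>, <tt>setRecord</tt> and <tt>synchronize</tt> are left out. It is not the SMILA implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the record lifecycle. Not the SMILA implementation. */
class ToyBlackboard {
  /** Hypothetical stand-in for the persistent storages. */
  final Map<String, Map<String, Object>> storage = new HashMap<>();
  /** Records currently held on the blackboard. */
  private final Map<String, Map<String, Object>> onBoard = new HashMap<>();

  /** create(Id): starts with an empty record, nothing is loaded from storage. */
  void create(String id) {
    onBoard.put(id, new HashMap<>());
  }

  /** load(Id): puts a working copy of the stored record on the blackboard. */
  void load(String id) {
    Map<String, Object> stored = storage.get(id);
    if (stored == null) {
      throw new IllegalArgumentException("no stored record with id " + id);
    }
    onBoard.put(id, new HashMap<>(stored));
  }

  /** Hypothetical accessor for the working copy; null if the record is not on the blackboard. */
  Map<String, Object> get(String id) {
    return onBoard.get(id);
  }

  /** commit(Id): saves the record (overwriting any stored one) and removes it from the blackboard. */
  void commit(String id) {
    Map<String, Object> record = onBoard.remove(id);
    if (record == null) {
      throw new IllegalStateException("record is not on the blackboard: " + id);
    }
    storage.put(id, record);
  }

  /** invalidate(Id): drops the working copy without saving it. */
  void invalidate(String id) {
    onBoard.remove(id);
  }
}
```

In this model a record that was created and then invalidated leaves no trace, because it never reached the storage map, while invalidating a loaded record leaves the stored version untouched.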

=== Attachments management ===
The blackboard provides the following methods for working with record attachments (for more details see the javadoc of [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/Blackboard.java Blackboard.java] and [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/BlackboardFactory.java BlackboardFactory.java]):
* setAttachment(id, name, byte[]);
* setAttachmentFromStream(id, name, InputStream);
* byte[] getAttachment(id, name);
* InputStream getAttachmentAsStream(id, name);
* boolean hasAttachment(id, name);

Attachments are not kept on the blackboard itself: they are saved directly to BinStorage, and the actual attachment value in the corresponding record is replaced with {{null}}. It is highly recommended to use only the stream methods to manage attachments, because loading whole attachments into memory consumes a lot of memory and can crash the application.
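
To make the recommendation about stream methods concrete, the following sketch reads an attachment in fixed-size chunks instead of loading it into one byte array. The <tt>AttachmentSource</tt> interface is a hypothetical stand-in that only mirrors the <tt>getAttachmentAsStream(id, name)</tt> method listed above; the parameter types and the absence of checked exceptions are assumptions, so the real interface may differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the attachment access of the blackboard. */
interface AttachmentSource {
  InputStream getAttachmentAsStream(String id, String name);
}

class AttachmentStreaming {
  /** Counts the bytes of an attachment while holding at most one 8 KiB chunk in memory. */
  static long countBytes(AttachmentSource source, String id, String name) {
    byte[] buffer = new byte[8192];
    long total = 0;
    // try-with-resources closes the stream even if reading fails
    try (InputStream in = source.getAttachmentAsStream(id, name)) {
      int read;
      while ((read = in.read(buffer)) != -1) {
        total += read;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return total;
  }
}
```

The same loop shape works for any per-chunk processing (hashing, copying to another stream), and memory use stays constant regardless of the attachment size.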

=== Usage of Blackboard Notes ===
Notes are additional temporary data created by pipelets to be used by later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record Id). Record-specific notes are copied on record splits and removed when the associated record is removed from the blackboard. Each note has a String name and a Serializable value.

The following methods are available for working with notes:
* <tt>boolean hasGlobalNote(name);</tt>
* <tt>Serializable getGlobalNote(name);</tt>
* <tt>setGlobalNote(name, value);</tt>
* <tt>boolean hasRecordNote(id, name);</tt>
* <tt>getRecordNote(id, name);</tt>
* <tt>setRecordNote(id, name, value);</tt>

[[Category:SMILA]]
 
=== Usage of Path with Blackboard methods ===
Some Blackboard methods accept a Path as an argument, for example ''getAttributeNames(Id, Path)''. A Path represents an attribute path in the record. The string format of a Path is ''attributeName1[index1]/attributeName2[index2]/...''. Specifying an index is optional; it defaults to 0. Depending on the method that receives the argument, an index can refer to a literal or to a sub-object.

Consider the following example record structure:

<source lang="xml">
<Record>
  <A n="AccessTreeExpanded">
    <O>
      <A n="account">
        <O>
          <A n="sub">
            <O>
              <A n="sid">
                <L>
                  <V>Value1</V>
                </L>
                <L>
                  <V>Value2</V>
                </L>
              </A>
            </O>
            <O>
              <A n="sid">
                <L>
                  <V>Value3</V>
                </L>
              </A>
            </O>
          </A>
        </O>
      </A>
    </O>
  </A>
</Record>
</source>
 
 
The path to the first MObject (<O>) of the ''sub'' attribute is "''AccessTreeExpanded[0]/account[0]/sub[0]/''". The index in each step is the position of the MObject inside the attribute. Accordingly, the path to the second MObject of the ''sub'' attribute is "''AccessTreeExpanded[0]/account[0]/sub[1]/''".

In some cases the index of the last step has a different meaning:
* In the ''getLiteral(Id, Path)'' method, the index of the last step is the position of the literal inside the attribute. The path to the literal of the ''sid'' attribute in the second ''sub'' MObject (the literal with value "Value3") is "''AccessTreeExpanded[0]/account[0]/sub[1]/sid[0]''", and the path to the second literal of the ''sid'' attribute in the first ''sub'' MObject (the literal with value "Value2") is "''AccessTreeExpanded[0]/account[0]/sub[0]/sid[1]''".
* In the ''getLiterals(Id, Path)'' method, the index of the last step is irrelevant: the method returns all literals of the attribute found at the given path.
* In the ''setLiteral(Id, Path, Value)'' and ''addLiteral(Id, Path, Value)'' methods, the index of the last step is irrelevant: the literal is set on or added to the attribute found at the specified path.
* In the methods that modify annotations, the path must be null, "" (empty string), or an empty Path to access the root annotations of the record.
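
The string format of a Path described in this section can be made concrete with a small parser. This is a minimal sketch, not SMILA's own Path class: <tt>PathStep</tt> and <tt>PathParser</tt> are hypothetical names, and the handling of a trailing slash and of null or empty input (treated as the root, i.e. an empty step list) is an assumption based on the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One step of a path: an attribute name plus an index (hypothetical helper type). */
class PathStep {
  final String name;
  final int index;

  PathStep(String name, int index) {
    this.name = name;
    this.index = index;
  }
}

class PathParser {
  /** A step is a name without '[', ']' or '/', optionally followed by "[digits]". */
  private static final Pattern STEP = Pattern.compile("([^\\[\\]/]+)(?:\\[(\\d+)\\])?");

  /** Parses "name1[i1]/name2[i2]/..." into steps; a missing index defaults to 0. */
  static List<PathStep> parse(String path) {
    List<PathStep> steps = new ArrayList<>();
    if (path == null || path.isEmpty()) {
      return steps; // null or "" is treated as the record root
    }
    for (String part : path.split("/")) {
      String trimmed = part.trim();
      if (trimmed.isEmpty()) {
        continue; // tolerate empty segments such as a doubled slash
      }
      Matcher m = STEP.matcher(trimmed);
      if (!m.matches()) {
        throw new IllegalArgumentException("malformed path step: " + part);
      }
      int index = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
      steps.add(new PathStep(m.group(1), index));
    }
    return steps;
  }
}
```

For example, <tt>PathParser.parse("AccessTreeExpanded[0]/account[0]/sub[1]/sid")</tt> yields four steps, the last being <tt>sid</tt> with the default index 0. How that last index is interpreted (MObject or literal position) still depends on the method receiving the path, as described above.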
 

Revision as of 02:57, 9 January 2013

What is the blackboard?

The blackboard holds the records while they are pushed through a pipeline. Pipelets are invoked with a blackboard instance and a list of IDs of records to process. The pipelet can then access the blackboard to get record metadata and attachments. The blackboard hides the handling of record persistence from the services. For example, it can be configured to hold only the record metadata in memory and to put the attachments in the BinaryStorage service, which saves memory when large attachments are used or many records are processed at the same time. Or it could get a record from a RecordStorage service on demand and write it back when processing is done (however, SMILA 1.0 no longer uses the RecordStorage by default).

The blackboard instance is released after the pipeline execution has finished; a new blackboard instance is created for each pipeline execution. Whether the blackboard has storages attached is the choice of the creating component and can be configurable there, or it depends on whether storage services are active in the SMILA application. For the user of the blackboard (usually the pipelet), it should not matter whether the blackboard has storages attached.

Blackboard Usage

For pipelet programmers, using the blackboard is usually trivial:

  • Use getRecord(id) or getMetadata(id) to get the record metadata. Modify the returned object to change record metadata.
  • To access record attachments, you should use the get/setAttachment methods of the blackboard. The Attachment objects of the Record object returned by getRecord(id) may not allow access to the attachment content, if the content has been swapped out to BinaryStorage. It's recommended to use streaming methods for attachments to keep memory consumption low.
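
The first point above can be sketched as follows. This is a simplified illustration under assumptions: <tt>MetadataBoard</tt> and <tt>TitleLengthPipelet</tt> are hypothetical names, the metadata is modelled as a plain <tt>Map</tt> rather than SMILA's own data types, and the attribute names <tt>Title</tt> and <tt>TitleLength</tt> are invented. The point it shows is that the pipelet receives the record IDs, asks the blackboard for the metadata object, and changes that object in place.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the metadata access of the blackboard. */
class MetadataBoard {
  private final Map<String, Map<String, Object>> records = new HashMap<>();

  /** Returns the live metadata object; creating it on first access is a simplification. */
  Map<String, Object> getMetadata(String id) {
    return records.computeIfAbsent(id, k -> new HashMap<>());
  }
}

/** Pipelet-like helper that adds the length of the title to each record. */
class TitleLengthPipelet {
  /** Invoked with a blackboard and the IDs of the records to process. */
  static void process(MetadataBoard blackboard, List<String> recordIds) {
    for (String id : recordIds) {
      Map<String, Object> metadata = blackboard.getMetadata(id);
      Object title = metadata.get("Title");
      if (title instanceof String) {
        // modifying the returned object changes the record metadata directly
        metadata.put("TitleLength", ((String) title).length());
      }
    }
  }
}
```

No explicit write-back call appears in the sketch, because the metadata object returned by the blackboard is the one it keeps for the record.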

For more details see the javadoc of the Blackboard and BlackboardFactory interfaces (Blackboard.java, BlackboardFactory.java).

Blackboard Notes

Notes are additional temporary data created by pipelets to be used by later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record ID). Record-specific notes are copied on record splits and removed when the associated record is removed from the blackboard. Each note has a String name and a Serializable value. The following methods are available for working with notes:

  • boolean hasGlobalNote(name);
  • Serializable getGlobalNote(name);
  • setGlobalNote(name, value);
  • boolean hasRecordNote(id, name);
  • getRecordNote(id, name);
  • setRecordNote(id, name, value);
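
The behaviour of notes described above can be modelled with two maps. This is a toy sketch, not the SMILA implementation: the class name <tt>ToyNotes</tt>, the <tt>String</tt> parameter types and the <tt>onSplit</tt> and <tt>onRemove</tt> hooks are assumptions introduced only to show the copy-on-split and remove-with-record rules. The six note methods use the names listed above.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Toy model of global and record-specific notes. */
class ToyNotes {
  private final Map<String, Serializable> globalNotes = new HashMap<>();
  private final Map<String, Map<String, Serializable>> recordNotes = new HashMap<>();

  boolean hasGlobalNote(String name) {
    return globalNotes.containsKey(name);
  }

  Serializable getGlobalNote(String name) {
    return globalNotes.get(name);
  }

  void setGlobalNote(String name, Serializable value) {
    globalNotes.put(name, value);
  }

  boolean hasRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes != null && notes.containsKey(name);
  }

  Serializable getRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes == null ? null : notes.get(name);
  }

  void setRecordNote(String id, String name, Serializable value) {
    recordNotes.computeIfAbsent(id, k -> new HashMap<>()).put(name, value);
  }

  /** Hypothetical hook: a split copies the notes of the source record to the fragment. */
  void onSplit(String sourceId, String fragmentId) {
    Map<String, Serializable> notes = recordNotes.get(sourceId);
    if (notes != null) {
      recordNotes.put(fragmentId, new HashMap<>(notes));
    }
  }

  /** Hypothetical hook: removing a record from the blackboard drops its notes. */
  void onRemove(String id) {
    recordNotes.remove(id);
  }
}
```

Global notes are unaffected by both hooks, which matches their role as data shared by the whole workflow run rather than tied to one record.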