
Difference between revisions of "SMILA/Documentation/Usage of Blackboard Service"

== What is the blackboard? ==
 
The blackboard holds the records while they are pushed through a pipeline. Pipelets are invoked with a blackboard instance and a list of IDs of records to process. The pipelet can then access the blackboard to get record metadata and attachments. The blackboard hides the handling of record persistence from the services. For example, it can be configured to hold only the record metadata in memory and to put the attachments in the [[SMILA/Documentation/Binary Storage|BinaryStorage service]], which saves memory when large attachments are used or many records are processed at the same time. It could also get a record from a [[SMILA/Documentation/Record Storage|RecordStorage service]] on demand and write it back once processing is done (however, in SMILA 1.0 we do not use the RecordStorage by default anymore).
  
== Creation and lifecycle of blackboard ==
The blackboard instance is released after the pipeline execution has finished; a new blackboard instance is created for each pipeline execution. Whether the blackboard has storages attached is decided by the creating component and may be configurable there, or it depends on whether storage services are active in the SMILA application. For the user of the blackboard (usually the pipelet), it should not matter whether the blackboard has storages attached.

For blackboard creation we use a BlackboardFactory service running as a declarative service.

* The factory can create blackboard instances which are either "transient" (pure in-memory implementation, not using any storages) or "persisting" (linked to binary storage and optionally to record storage). The client selects which kind of blackboard it wants to use. A persisting blackboard can only be created successfully if at least a binary storage is known. Creation of transient blackboards is always possible.
* For each "session", a new blackboard instance is created that manages only the records worked on in that session. A session is for example:
** a single task list execution of a QueueWorker router or listener (i.e. adding or deleting one record in Connectivity, or processing one input record from a queue message and managing all additional records created by the invoked workflows)
** a single search request in the search service.
* After the session the blackboard instance is released completely, thus freeing any memory resources automatically without interfering with other blackboard sessions.

== Blackboard Usage ==

=== Service interfaces ===
For pipelet programmers, using the blackboard is usually trivial:

* Use <tt>getRecord(id)</tt> or <tt>getMetadata(id)</tt> to get the record metadata. Modify the returned object to change record metadata.
* To access record attachments, use the <tt>get/setAttachment</tt> methods of the blackboard. The <tt>Attachment</tt> objects of the <tt>Record</tt> object returned by <tt>getRecord(id)</tt> may not allow access to the attachment content if the content has been swapped out to BinaryStorage. Streaming methods are recommended for attachments to keep memory consumption low.

For more details see the javadoc of these interfaces:

* [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/Blackboard.java Blackboard.java]
 
* [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/BlackboardFactory.java BlackboardFactory.java]
 
 
 
== Usage of blackboard ==
 
{{Note|Discussion| happens @ {{bug|336818}} }}
 
 
=== Record lifecycle on the blackboard ===
 
 
A record may be put on the blackboard using one of the following operations:
 
* <tt>create(String id);</tt>
 
*: Creates a new record with the given ''id''. No data is loaded from persistence. If a record with this ID already exists in the storages, it is overwritten when the created record is committed. Used, for example, by Connectivity to initialize the record from incoming data.
 
* <tt>load(String id);</tt>
 
*: Loads record data for the given ID from persistence. Used by a client to indicate that it wants to process this record.

* <tt>copyRecord(String id, String copyId);</tt>

*: Creates a new record using ''copyId'' and copies all metadata from the record with ''id'' to it. Attachments are not copied.
 
* <tt>setRecord(Record);</tt>
 
*: Puts the record on the blackboard, saves the record's attachments to BinaryStorage, and replaces the actual attachment values in the record with ''null''.
 
* <tt>synchronizeRecord(Record);</tt>
 
*: Assumes that a record with the same ID as the given one already exists on the blackboard or in the storage. Loads the record from the storage if needed and updates its properties with the properties of the given record.
 
 
A record may be removed from the blackboard using one of these operations:
 
* <tt>commit(String id);</tt>
 
*: Saves the record including its attachments to storages and removes the record from the blackboard.
 
* <tt>invalidate(String id);</tt>
 
*: Removes the record from the blackboard. If the record was created newly (not overwritten) on the blackboard it will be removed completely. The record is not removed from persistence, though.
 
* <tt>removeRecord(String id);</tt>
 
*: Removes a record completely from the blackboard and from all persistence layers.
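
The difference between <tt>commit</tt>, <tt>invalidate</tt> and <tt>removeRecord</tt> can be illustrated with a toy in-memory model. This is only a sketch: the class <tt>LifecycleSketch</tt>, its two maps and the behaviour of <tt>load</tt> for unknown IDs are assumptions made for illustration and are not taken from the SMILA code. Only the operation names come from the list above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the lifecycle operations; not the SMILA implementation. */
class LifecycleSketch {
  /** Stands in for the persistence layer (record ID -> metadata). */
  final Map<String, Map<String, Object>> storage = new HashMap<>();
  /** Records currently held on the blackboard. */
  final Map<String, Map<String, Object>> onBlackboard = new HashMap<>();

  /** create(id): new empty record, nothing is loaded from storage. */
  Map<String, Object> create(String id) {
    Map<String, Object> metadata = new HashMap<>();
    onBlackboard.put(id, metadata);
    return metadata;
  }

  /** load(id): copy the stored record onto the blackboard. */
  Map<String, Object> load(String id) {
    Map<String, Object> stored = storage.get(id);
    if (stored == null) {
      // Assumption of this sketch only; the real behaviour is not modelled here.
      throw new IllegalArgumentException("no stored record with ID " + id);
    }
    Map<String, Object> metadata = new HashMap<>(stored);
    onBlackboard.put(id, metadata);
    return metadata;
  }

  /** commit(id): write to storage, then drop from the blackboard. */
  void commit(String id) {
    Map<String, Object> metadata = onBlackboard.remove(id);
    if (metadata != null) {
      storage.put(id, new HashMap<>(metadata));
    }
  }

  /** invalidate(id): drop from the blackboard, leave storage untouched. */
  void invalidate(String id) {
    onBlackboard.remove(id);
  }

  /** removeRecord(id): drop from the blackboard and from storage. */
  void removeRecord(String id) {
    onBlackboard.remove(id);
    storage.remove(id);
  }

  public static void main(String[] args) {
    LifecycleSketch bb = new LifecycleSketch();
    bb.create("doc-1").put("title", "first version");
    bb.commit("doc-1");                               // persisted, no longer on the blackboard
    bb.load("doc-1").put("title", "discarded edit");
    bb.invalidate("doc-1");                           // edit is dropped, storage is unchanged
    System.out.println(bb.storage.get("doc-1").get("title")); // prints "first version"
    bb.removeRecord("doc-1");
    System.out.println(bb.storage.containsKey("doc-1"));      // prints "false"
  }
}
```

In this model an edit made after <tt>load</tt> only reaches the storage if <tt>commit</tt> is called; <tt>invalidate</tt> discards it.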
 
 
=== Record access ===
 
 
* <tt>Record getRecord(String id);</tt>
 
*: Gets the record with its complete metadata. Do not rely on the record carrying its attachment values; only the attachment names are guaranteed to be available. Use the blackboard's attachment access methods to read attachments. If the record is not yet loaded on the blackboard, the PersistingBlackboard returns ''null'' if it has a RecordStorage set and the record does not exist there; if it has no RecordStorage, a new record is created automatically. The TransientBlackboard also creates a new record if necessary. Changes to the returned record's metadata change the record stored on the blackboard as well, so there is no need to call <tt>setRecord()</tt> to write them back explicitly.
 
* <tt>AnyMap getMetadata(String id);</tt>
 
*: A shortcut for <tt>getRecord(id).getMetadata()</tt>
 
* <tt>Record getRecord(String id, String filterName);</tt>
 
*: Creates a copy of the record with only those metadata elements allowed by the named record filter. Changes done to the filtered metadata are not applied to the original record automatically.
 
* <tt>Record filterRecord(Record record, String filterName);</tt>
 
*: Applies a record filter known to the blackboard to a record which does not have to be loaded on the blackboard.
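
The distinction between the live object returned by <tt>getRecord(id)</tt> / <tt>getMetadata(id)</tt> and the independent copy returned by the filtered variant can be sketched as follows. The class <tt>RecordAccessSketch</tt> and its <tt>getFiltered</tt> method are hypothetical; in particular, a filter is modelled here simply as a set of allowed attribute names, which is an assumption and not how SMILA record filters are configured.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of live versus filtered record access; not the SMILA implementation. */
class RecordAccessSketch {
  final Map<String, Map<String, Object>> records = new HashMap<>();

  /** Like getMetadata(id): hands out the live object, creating the record if necessary. */
  Map<String, Object> getMetadata(String id) {
    return records.computeIfAbsent(id, key -> new HashMap<>());
  }

  /** Like getRecord(id, filterName): an independent copy restricted to the allowed attributes. */
  Map<String, Object> getFiltered(String id, Set<String> allowed) {
    Map<String, Object> copy = new HashMap<>(getMetadata(id));
    copy.keySet().retainAll(allowed);
    return copy;
  }

  public static void main(String[] args) {
    RecordAccessSketch bb = new RecordAccessSketch();
    bb.getMetadata("doc-1").put("title", "draft");    // changes the record on the blackboard
    bb.getMetadata("doc-1").put("author", "alice");

    Map<String, Object> filtered = bb.getFiltered("doc-1", Collections.singleton("title"));
    filtered.put("title", "changed in copy");         // does not reach the original record

    System.out.println(bb.getMetadata("doc-1").get("title")); // prints "draft"
    System.out.println(filtered.containsKey("author"));       // prints "false"
  }
}
```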
 
 
=== Attachments management ===
 
The following methods are available for working with record attachments in the blackboard:
 
* <tt>setAttachment(id, name, byte[]);</tt>
 
* <tt>setAttachmentFromStream(id, name, InputStream);</tt>
 
* <tt>byte[] getAttachment(id, name);</tt>
 
* <tt>InputStream getAttachmentAsStream(id, name);</tt>
 
* <tt>boolean hasAttachment(id, name);</tt>
 
  
Attachments are not stored in the blackboard itself. They are saved to BinaryStorage directly, and the actual attachment value in the corresponding record is replaced by ''null''. It is highly recommended to use only the <tt>Stream</tt> methods to manage attachments, because loading whole attachments into memory consumes a lot of memory and may crash the application.
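
The memory argument for the stream methods can be made concrete with a chunked copy, which never holds more than one buffer of the attachment at a time. This is a minimal sketch: the class <tt>AttachmentStreamingSketch</tt> and its <tt>copy</tt> helper are hypothetical, and a <tt>ByteArrayInputStream</tt> stands in for the stream that <tt>getAttachmentAsStream(id, name)</tt> would return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Chunked stream copy; only one 8 KiB buffer is held in memory at a time. */
class AttachmentStreamingSketch {
  /** Copies the stream in 8 KiB chunks and returns the number of bytes copied. */
  static long copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[8192];
    long total = 0;
    int read;
    while ((read = in.read(buffer)) != -1) {
      out.write(buffer, 0, read);
      total += read;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for blackboard.getAttachmentAsStream(id, name); a real attachment may be far larger.
    byte[] content = new byte[20000];
    try (InputStream in = new ByteArrayInputStream(content);
         ByteArrayOutputStream out = new ByteArrayOutputStream()) {
      System.out.println(copy(in, out)); // prints 20000
    }
  }
}
```

In a pipelet the target would typically be a file or another stream rather than a <tt>ByteArrayOutputStream</tt>, which is used here only to keep the example self-contained.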
Latest revision as of 03:57, 9 January 2013

== Blackboard Notes ==

Notes are additional temporary data created by pipelets to be used in later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record ID). Record specific notes are copied on record splits and removed when the associated record is removed from the blackboard. Each note has a String name and a Serializable value. The following methods are available for working with notes:

* <tt>boolean hasGlobalNote(name);</tt>
* <tt>Serializable getGlobalNote(name);</tt>
* <tt>setGlobalNote(name, value);</tt>
* <tt>boolean hasRecordNote(id, name);</tt>
* <tt>getRecordNote(id, name);</tt>
* <tt>setRecordNote(id, name, value);</tt>
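
The scoping of global and record specific notes can be sketched with two maps. The class <tt>NotesSketch</tt> and its <tt>onRecordRemoved</tt> hook are hypothetical and only mimic the method names listed above; the parameter types and the return type of <tt>getRecordNote</tt> are assumptions, since the list does not state them.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Toy model of blackboard notes; not the SMILA implementation. */
class NotesSketch {
  private final Map<String, Serializable> globalNotes = new HashMap<>();
  private final Map<String, Map<String, Serializable>> recordNotes = new HashMap<>();

  boolean hasGlobalNote(String name) {
    return globalNotes.containsKey(name);
  }

  Serializable getGlobalNote(String name) {
    return globalNotes.get(name);
  }

  void setGlobalNote(String name, Serializable value) {
    globalNotes.put(name, value);
  }

  boolean hasRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes != null && notes.containsKey(name);
  }

  Serializable getRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes == null ? null : notes.get(name);
  }

  void setRecordNote(String id, String name, Serializable value) {
    recordNotes.computeIfAbsent(id, key -> new HashMap<>()).put(name, value);
  }

  /** Hypothetical hook: record specific notes vanish together with their record. */
  void onRecordRemoved(String id) {
    recordNotes.remove(id);
  }

  public static void main(String[] args) {
    NotesSketch notes = new NotesSketch();
    notes.setGlobalNote("language", "en");
    notes.setRecordNote("doc-1", "mimeType", "text/html");
    notes.onRecordRemoved("doc-1");
    System.out.println(notes.hasRecordNote("doc-1", "mimeType")); // prints "false"
    System.out.println(notes.getGlobalNote("language"));          // prints "en"
  }
}
```

A pipelet early in a workflow could store a detected value as a record note, and a later pipelet in the same workflow could read it without the value ever being written to the storages.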