Difference between revisions of "SMILA/Documentation/Usage of Blackboard Service"

From Eclipsepedia

(16 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== What is Blackboard ==
+
== What is the blackboard? ==
The purpose of the Blackboard service is to manage SMILA record data during processing in a SMILA component (Connectivity, Workflow Processor). Complete record data is stored only on the Blackboard, which is not pushed through the workflow engine itself. The Blackboard service hides the handling of record persistence from the services.
+
Clients should generally manipulate records through the Blackboard API methods so that records remain completely under the control of the Blackboard.
+
  
== Usage of Blackboard Service ==
+
The blackboard holds the records while they are pushed through a pipeline. Pipelets are invoked with a blackboard instance and a list of IDs of records to process. The pipelet can then access the blackboard to get record metadata and attachments. The blackboard hides the handling of record persistence from the services. For example, it can be configured to hold only the record metadata in memory and to put the attachments in the [[SMILA/Documentation/Binary Storage|BinaryStorage service]], which saves memory when attachments are large or many records are processed at the same time. It could also get a record from a [[SMILA/Documentation/Record Storage|RecordStorage service]] on demand and write it back when processing is done (however, SMILA 1.0 no longer uses the RecordStorage by default).
  
=== Record lifecycle on the Blackboard ===
+
A new blackboard instance is created for each pipeline execution and is released after the execution has finished. Whether the blackboard has storages attached is decided by the creating component, where it may be configurable, or it depends on whether storage services are active in the SMILA application. For the user of the blackboard (usually the pipelet), it should not matter whether the blackboard has storages attached.
  
Record can be put onto Blackboard with one of the following operations:
+
== Blackboard Usage ==
* create(Id);
+
*: Creates a new record with the given Id. No data is loaded from persistence. If a record with this Id already exists in the storages, it will be overwritten when the created record is committed. Used, for example, by Connectivity to initialize the record from incoming data.
+
* load(Id);
+
*: Loads record data for the given Id from persistence. Used by a client to indicate that it wants to process this record.
+
* split(Id, String);
+
*: Creates a fragment of a given record, i.e. the record content is copied to a new Id derived from the given one by adding a fragment name (see Id Concept for details).
+
* setRecord(Record);
+
*: Puts record on the Blackboard, saves record attachments to BinStorage and replaces actual record attachments values with null.
+
* synchronize(Record);
+
*: Assumes that a record with the same Id as the given record already exists on the Blackboard or in storage. Loads the record from storage if needed and updates its properties with the properties of the given record.
+
  
Record is removed from the blackboard with one of these operations:
+
For pipelet programmers, using the blackboard is usually trivial:
* commit(Id);
+
* Use <tt>getRecord(id)</tt> or <tt>getMetadata(id)</tt> to get the record metadata. Modify the returned object to change record metadata.  
*: Saves record and attachments to storages and removes record from the Blackboard.
+
* To access record attachments, use the <tt>get/setAttachment</tt> methods of the blackboard. The <tt>Attachment</tt> objects of the <tt>Record</tt> object returned by <tt>getRecord(id)</tt> may not allow access to the attachment content if the content has been swapped out to BinaryStorage. Streaming methods are recommended for attachments to keep memory consumption low.
* invalidate(Id);
+
*: Record is removed from the Blackboard. If the record was created new (not overwritten) on the Blackboard it will be removed completely.
+
  
=== Attachments management ===
+
For more details see the javadoc of these interfaces:  
The following methods are available for working with Record attachments on the Blackboard:
+
* [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/Blackboard.java Blackboard.java]
* setAttachment(id, name, byte[]);
+
* [https://dev.eclipse.org/svnroot/rt/org.eclipse.smila/trunk/core/org.eclipse.smila.blackboard/code/src/org/eclipse/smila/blackboard/BlackboardFactory.java BlackboardFactory.java]
* setAttachmentFromStream(id, name, InputStream);
+
* byte[] getAttachment(id, name);
+
* InputStream getAttachmentAsStream(id, name);
+
* boolean hasAttachment(id, name);
+
  
Attachments are not stored on the Blackboard itself; they are saved directly to BinStorage, and the actual attachment value in the corresponding Record is replaced with {{null}}. It is highly recommended to use only the stream methods to manage attachments, because loading whole attachments into memory consumes a large amount of memory and can crash the application.
+
== Blackboard Notes ==
  
=== Usage of Blackboard Notes ===
+
Notes are additional temporary data created by pipelets to be used in later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record ID). Record specific notes are copied on record splits and removed when the associated record is removed from the blackboard. Each Note has a String name and Serializable value.
Notes are additional temporary data created by pipelets to be used in later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record Id). Record specific notes are copied on record splits and removed when the associated record is removed from the Blackboard. Each Note has a String name and Serializable value.
+
 
The following methods are available for working with Notes:
* boolean hasGlobalNote(name);
+
* <tt>boolean hasGlobalNote(name);</tt>
* Serializable getGlobalNote(name);
+
* <tt>Serializable getGlobalNote(name);</tt>
* setGlobalNote(name, value);
+
* <tt>setGlobalNote(name, value);</tt>
* boolean hasRecordNote(id, name);
+
* <tt>boolean hasRecordNote(id, name);</tt>
* getRecordNote(id, name);
+
* <tt>getRecordNote(id, name);</tt>
* setRecordNote(id, name, value);
+
* <tt>setRecordNote(id, name, value);</tt>
  
 
[[Category:SMILA]]
 
 
=== Usage of Path with Blackboard methods ===
 
Some methods of the Blackboard accept a Path as an argument, for example getAttributeNames(Id, Path). A Path represents the attribute path in the Record and is somewhat similar to XPath. The string format of a Path looks like "attributeName1[index1]/attributeName2[index2]/...". The index is optional and defaults to 0. Whether an index refers to a literal or to a sub-object depends on the method that receives the path.
 
 
Consider the following example Record structure:
 
 
<source lang="xml">
<Record>
  <A n="AccessTreeExpanded">
    <O>
      <A n="account">
        <O>
          <A n="sub">
            <O>
              <A n="sid">
                <L>
                  <V>Value1</V>
                </L>
                <L>
                  <V>Value2</V>
                </L>
              </A>
            </O>
            <O>
              <A n="sid">
                <L>
                  <V>Value3</V>
                </L>
              </A>
            </O>
          </A>
        </O>
      </A>
    </O>
  </A>
</Record>
</source>
 
 
To get the literal value from the first ''sid'' attribute, the path is as follows: ''AccessTreeExpanded[0]/account[0]/sub[0]/sid[0]''
 
Here the index in each step except the last is the position of the MObject inside the attribute, and the index in the last step is the position of the literal inside the attribute. That is, the path for getting the literal value from the second ''sid'' attribute is ''AccessTreeExpanded[0]/account[0]/sub[1]/sid[0]'', and the path for getting the second literal of the first ''sid'' attribute is ''AccessTreeExpanded[0]/account[0]/sub[0]/sid[1]''.
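
The string format described above can be illustrated with a small parser. This is only a sketch under stated assumptions: <tt>PathSketch</tt> and <tt>Step</tt> are hypothetical names introduced here and are not the SMILA Path class; the only behavior taken from the text is the "name[index]/name[index]" layout and the default index of 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the SMILA API. */
final class PathSketch {

  /** One step of a path: an attribute name plus an index (0 when omitted). */
  static final class Step {
    public final String name;
    public final int index;

    Step(String name, int index) {
      this.name = name;
      this.index = index;
    }
  }

  /** Splits "a[1]/b/c[2]" into steps; a missing "[n]" defaults to index 0. */
  static List<Step> parse(String path) {
    List<Step> steps = new ArrayList<>();
    for (String part : path.split("/")) {
      int open = part.indexOf('[');
      if (open < 0) {
        // No index given: the text states that it defaults to 0.
        steps.add(new Step(part, 0));
        continue;
      }
      int close = part.indexOf(']', open);
      if (close < 0) {
        throw new IllegalArgumentException("missing ']' in step: " + part);
      }
      int index = Integer.parseInt(part.substring(open + 1, close));
      steps.add(new Step(part.substring(0, open), index));
    }
    return steps;
  }
}
```

For example, parsing ''AccessTreeExpanded/account/sub[1]/sid'' with this sketch yields four steps, where only the third step has index 1 and all others fall back to index 0.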
 

Latest revision as of 03:57, 9 January 2013

What is the blackboard?

The blackboard holds the records while they are pushed through a pipeline. Pipelets are invoked with a blackboard instance and a list of IDs of records to process. The pipelet can then access the blackboard to get record metadata and attachments. The blackboard hides the handling of record persistence from the services. For example, it can be configured to hold only the record metadata in memory and to put the attachments in the BinaryStorage service, which saves memory when attachments are large or many records are processed at the same time. It could also get a record from a RecordStorage service on demand and write it back when processing is done (however, SMILA 1.0 no longer uses the RecordStorage by default).

A new blackboard instance is created for each pipeline execution and is released after the execution has finished. Whether the blackboard has storages attached is decided by the creating component, where it may be configurable, or it depends on whether storage services are active in the SMILA application. For the user of the blackboard (usually the pipelet), it should not matter whether the blackboard has storages attached.

Blackboard Usage

For pipelet programmers, using the blackboard is usually trivial:

  • Use getRecord(id) or getMetadata(id) to get the record metadata. Modify the returned object to change record metadata.
  • To access record attachments, use the get/setAttachment methods of the blackboard. The Attachment objects of the Record object returned by getRecord(id) may not allow access to the attachment content if the content has been swapped out to BinaryStorage. Streaming methods are recommended for attachments to keep memory consumption low.
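
The recommendation to stream attachments can be sketched as follows. This is a minimal illustration, not the SMILA implementation: AttachmentAccess is a hypothetical, simplified interface whose method names follow the stream methods listed in the earlier revision of this page (getAttachmentAsStream, setAttachmentFromStream), record IDs are assumed to be plain strings, and InMemoryAttachments exists only to make the example runnable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Illustration only; the nested types are hypothetical stand-ins for the blackboard. */
final class AttachmentStreamingSketch {

  /** Hypothetical, simplified view of the blackboard's stream-based attachment methods. */
  interface AttachmentAccess {
    InputStream getAttachmentAsStream(String id, String name) throws IOException;

    void setAttachmentFromStream(String id, String name, InputStream in) throws IOException;
  }

  /** In-memory stand-in; a real blackboard may delegate to BinaryStorage instead. */
  static final class InMemoryAttachments implements AttachmentAccess {
    private final Map<String, byte[]> data = new HashMap<>();

    @Override
    public InputStream getAttachmentAsStream(String id, String name) throws IOException {
      byte[] bytes = data.get(id + "/" + name);
      if (bytes == null) {
        throw new IOException("no attachment '" + name + "' for record " + id);
      }
      return new ByteArrayInputStream(bytes);
    }

    @Override
    public void setAttachmentFromStream(String id, String name, InputStream in) throws IOException {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
      data.put(id + "/" + name, out.toByteArray());
    }
  }

  /** Consumer side: reads the attachment through a fixed 4 KB buffer and counts its bytes. */
  static long countBytes(AttachmentAccess blackboard, String id, String name) throws IOException {
    long total = 0;
    byte[] buffer = new byte[4096];
    try (InputStream in = blackboard.getAttachmentAsStream(id, name)) {
      int n;
      while ((n = in.read(buffer)) != -1) {
        total += n;
      }
    }
    return total;
  }

  /** Stores an attachment of the given size from a stream and reads it back by streaming. */
  static long roundTrip(int size) {
    AttachmentAccess blackboard = new InMemoryAttachments();
    try {
      blackboard.setAttachmentFromStream("rec1", "content", new ByteArrayInputStream(new byte[size]));
      return countBytes(blackboard, "rec1", "content");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The point of countBytes is that the pipelet-side code never holds more than one buffer of attachment content at a time, regardless of where the blackboard keeps the data.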

For more details see the javadoc of these interfaces:

  • Blackboard.java
  • BlackboardFactory.java

Blackboard Notes

Notes are additional temporary data created by pipelets to be used in later pipelets in the same workflow, but not to be persisted in the storages. Notes can be either global or record specific (associated with a record ID). Record specific notes are copied on record splits and removed when the associated record is removed from the blackboard. Each Note has a String name and Serializable value. The following methods are available for working with Notes:

  • boolean hasGlobalNote(name);
  • Serializable getGlobalNote(name);
  • setGlobalNote(name, value);
  • boolean hasRecordNote(id, name);
  • getRecordNote(id, name);
  • setRecordNote(id, name, value);
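
The note semantics described above (global versus record-specific notes, copying on split, removal together with the record) can be modeled with two maps. This is a sketch under stated assumptions, not the SMILA implementation: NotesSketch is a hypothetical class, record IDs are assumed to be plain strings, and onSplit and onRemove are invented hooks standing in for whatever the blackboard does internally when a record is split or removed.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of blackboard notes; for illustration only. */
final class NotesSketch {
  private final Map<String, Serializable> globalNotes = new HashMap<>();
  private final Map<String, Map<String, Serializable>> recordNotes = new HashMap<>();

  public boolean hasGlobalNote(String name) {
    return globalNotes.containsKey(name);
  }

  public Serializable getGlobalNote(String name) {
    return globalNotes.get(name);
  }

  public void setGlobalNote(String name, Serializable value) {
    globalNotes.put(name, value);
  }

  public boolean hasRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes != null && notes.containsKey(name);
  }

  public Serializable getRecordNote(String id, String name) {
    Map<String, Serializable> notes = recordNotes.get(id);
    return notes == null ? null : notes.get(name);
  }

  public void setRecordNote(String id, String name, Serializable value) {
    recordNotes.computeIfAbsent(id, key -> new HashMap<>()).put(name, value);
  }

  /** Assumed hook: record-specific notes are copied to the new fragment on a split. */
  public void onSplit(String sourceId, String fragmentId) {
    Map<String, Serializable> notes = recordNotes.get(sourceId);
    if (notes != null) {
      recordNotes.put(fragmentId, new HashMap<>(notes));
    }
  }

  /** Assumed hook: record-specific notes disappear with the record; global notes stay. */
  public void onRemove(String id) {
    recordNotes.remove(id);
  }
}
```

In this model a pipelet early in the workflow could call setRecordNote("rec1", "visited", Boolean.TRUE), and a later pipelet in the same workflow could check hasRecordNote("rec1", "visited"). Nothing is written to any storage.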