
SMILA/Project Concepts/Blackboard Service Restructured

Revision as of 09:29, 27 March 2009


Restructuring the Blackboard Service

Why all this?

While thinking about the fix for 269967 it occurred to me that the current blackboard service architecture is too simple. The problem there is that a workflow creates multiple records on the blackboard by splitting the input object. After processing, all of these records must be removed from the blackboard again (after committing them to the storages if processing succeeded) to release the memory they use. This is easy with a single split in the pipeline, because then the listener knows all record IDs: one input ID and the output IDs. But imagine that the pipeline splits the results of the first split again: then the IDs created by the first split would not be returned to the listener, and it could not take care of committing or invalidating them. They would stay on the blackboard, which results in a memory leak.
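The leak can be illustrated with a small simulation. This is only a sketch: the class `SharedBlackboardLeak`, the string IDs, and the `split` helper are hypothetical and do not correspond to actual SMILA code. It mimics a listener that cleans up only the input ID and the IDs returned at the end of the workflow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SharedBlackboardLeak {
  // Stand-in for the single shared blackboard: the set of record IDs held in memory.
  static final Set<String> blackboard = new LinkedHashSet<>();

  // A splitting pipelet: creates two part records per input and returns only the new IDs.
  static List<String> split(List<String> inputIds) {
    List<String> created = new ArrayList<>();
    for (String id : inputIds) {
      for (int i = 1; i <= 2; i++) {
        String part = id + "#" + i;
        blackboard.add(part);
        created.add(part);
      }
    }
    return created;
  }

  // Runs two consecutive splits, then cleans up the way the listener can:
  // it knows only the input ID and the IDs returned by the workflow.
  static Set<String> runAndCleanUp(String inputId) {
    blackboard.clear();
    blackboard.add(inputId);
    List<String> firstSplit = split(Collections.singletonList(inputId));
    List<String> secondSplit = split(firstSplit);
    blackboard.remove(inputId);
    blackboard.removeAll(secondSplit);
    // Whatever is left was never reported to the listener.
    return new LinkedHashSet<>(blackboard);
  }

  public static void main(String[] args) {
    System.out.println("leaked: " + runAndCleanUp("doc")); // prints "leaked: [doc#1, doc#2]"
  }
}
```

The records created by the first split remain on the shared blackboard because nothing ever hands their IDs back to the code responsible for cleanup.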

There are several ways to solve this:

  1. Let the pipelets splitting the record care about what to do with the source records: Apart from introducing a lot of error potential for badly implemented pipelets (and making pipelet programming harder in general), this is problematic because one pipelet usually cannot decide if other pipelets might still need the source records.
  2. Manage dependencies between records on the blackboard so that all element/fragment records are also removed when the source record is removed: This seems error-prone, too. How should error cases be handled? What if someone else already wants to process a split result when the source record is finished? Also, a record may be created in other ways than splitting, so that no dependency is recorded.

In addition, there may be error cases that prevent a record from being correctly removed from the blackboard, so there is always a potential for memory leaks.

Next, I think the current design introduces problems with concurrent access to a single record from two separate pipelines.

Finally, the Search service currently needs a blackboard implementation that does not persist the records. It uses its own implementation of the Blackboard interface that is not linked to any storage and just keeps everything in memory. This works, but it would probably be nicer to have all blackboard code in a single place. Other services might have use for such a "transient" blackboard implementation, too.


The following proposal might solve these problems (or at least be a starting point):

  • Instead of a single Blackboard service we create a BlackboardFactory service. The factory is optionally linked to binary and record storages and runs as a Declarative Service.
  • The factory can create Blackboard instances which are either "transient" (pure in-memory implementation, not using any storages) or "persisting" (linked to binary storage and optionally to record storage). The client selects which kind of blackboard it wants to use. A persisting blackboard can only be created successfully if at least a binary storage is known. Creation of transient blackboards is always possible.
  • For each "session" a new blackboard instance of its own is created that manages only those records worked on by this request. A session is for example:
    • a single task execution of a QueueWorker router (i.e. add/delete one record in Connectivity)
    • the processing initiated by a single message received by a QueueWorker listener (one input record and all records created by a workflow)
    • a single search request in the search service.
  • After the session the blackboard instance is released completely, thus freeing any memory resources automatically without interfering with other blackboard sessions.
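From a client's point of view, the session lifecycle described above might look roughly like the following sketch. `SessionBlackboard`, `SessionRunner`, and the string-based records and storage map are hypothetical simplifications; only the idea of one instance per session that is either committed or invalidated as a whole comes from the proposal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Simplified stand-in for a per-session blackboard (hypothetical, not SMILA code).
class SessionBlackboard {
  final Map<String, String> records = new HashMap<>(); // records held in memory
  private final Map<String, String> storage;           // stand-in for a record storage

  SessionBlackboard(Map<String, String> storage) {
    this.storage = storage;
  }

  void put(String id, String content) {
    records.put(id, content);
  }

  // Write ALL records of this session to the storage and release them.
  void commit() {
    storage.putAll(records);
    records.clear();
  }

  // Drop ALL records of this session without persisting anything.
  void invalidate() {
    records.clear();
  }
}

class SessionRunner {
  // One session = one blackboard instance: committed on success, invalidated on failure.
  static boolean runSession(Map<String, String> storage, Consumer<SessionBlackboard> workflow) {
    SessionBlackboard blackboard = new SessionBlackboard(storage);
    try {
      workflow.accept(blackboard);
      blackboard.commit();
      return true;
    } catch (RuntimeException e) {
      blackboard.invalidate();
      return false;
    }
  }

  public static void main(String[] args) {
    Map<String, String> storage = new HashMap<>();
    boolean ok = runSession(storage, bb -> {
      bb.put("doc", "input record");
      bb.put("doc#1", "record created by a split");
    });
    System.out.println(ok + ", stored records: " + storage.size());
  }
}
```

Because the instance goes out of scope after the session, intermediate records need no separate bookkeeping: whatever the workflow created is handled by the single `commit()` or `invalidate()` call.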

New interfaces

/**
 * Extension of existing BlackboardService interface,
 * but not a service (in the OSGi sense) anymore.
 */
interface Blackboard {
  // this interface contains all methods of current BlackboardService interface, plus:

  /** commit ALL records on this blackboard to storages (if any) and release resources */
  void commit();

  /** remove ALL records from blackboard and release all associated resources */
  void invalidate();
}

interface BlackboardFactory {
  /**
   * create a new non-persisting blackboard instance.
   * This method must always return a valid empty blackboard instance.
   */
  Blackboard createTransientBlackboard();

  /**
   * create a blackboard able to persist records in storages
   * @throws BlackboardAccessException no persisting blackboard can be created, because
   * not even a binary storage service is available (record storage remains optional)
   */
  Blackboard createPersistingBlackboard() throws BlackboardAccessException;
}
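A factory honouring this contract could be sketched as follows. Everything here is an assumption for illustration: the interfaces are cut-down local copies (the real Blackboard would carry all existing BlackboardService methods), `BlackboardAccessException` is a local stand-in for the SMILA class, and `BinaryStorage`, `RecordStorage`, `TransientBlackboard`, `PersistingBlackboard`, and `SimpleBlackboardFactory` are hypothetical names.

```java
// Local stand-ins so the sketch compiles on its own; the real SMILA types differ.
class BlackboardAccessException extends Exception {
  BlackboardAccessException(String message) {
    super(message);
  }
}

interface Blackboard {
  void commit();
  void invalidate();
}

interface BlackboardFactory {
  Blackboard createTransientBlackboard();
  Blackboard createPersistingBlackboard() throws BlackboardAccessException;
}

// Hypothetical placeholders for the storage services the factory may be linked to.
interface BinaryStorage { }
interface RecordStorage { }

class TransientBlackboard implements Blackboard {
  public void commit() { /* nothing to persist, only in-memory state to release */ }
  public void invalidate() { /* release in-memory state */ }
}

class PersistingBlackboard implements Blackboard {
  final BinaryStorage binaryStorage;
  final RecordStorage recordStorage; // may be null: record storage is optional

  PersistingBlackboard(BinaryStorage binaryStorage, RecordStorage recordStorage) {
    this.binaryStorage = binaryStorage;
    this.recordStorage = recordStorage;
  }

  public void commit() { /* would write records and attachments to the storages */ }
  public void invalidate() { /* would discard in-memory state without writing */ }
}

class SimpleBlackboardFactory implements BlackboardFactory {
  // Set and unset by the service runtime when storages come and go (assumed wiring).
  private volatile BinaryStorage binaryStorage;
  private volatile RecordStorage recordStorage;

  void setBinaryStorage(BinaryStorage storage) { this.binaryStorage = storage; }
  void setRecordStorage(RecordStorage storage) { this.recordStorage = storage; }

  public Blackboard createTransientBlackboard() {
    return new TransientBlackboard(); // always possible, no storage needed
  }

  public Blackboard createPersistingBlackboard() throws BlackboardAccessException {
    BinaryStorage binary = binaryStorage; // read once to avoid races with unbinding
    if (binary == null) {
      throw new BlackboardAccessException("no binary storage service available");
    }
    return new PersistingBlackboard(binary, recordStorage);
  }
}
```

Every call returns a fresh instance, so nothing is shared between sessions unless the factory deliberately adds it.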

Impact on existing code

Most code could be left unchanged, because the new Blackboard interface keeps all current methods. The main exception is renaming the current interface "BlackboardService" to simply "Blackboard"; for compatibility we could add a deprecated interface "BlackboardService" that extends the new one. Only the access to the blackboard in the QueueWorker and SearchService would have to be changed. And of course we would have to find the places where the final commit() or invalidate() should be called. No changes are necessary in pipelets or processing services.
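The compatibility interface mentioned above could be as small as the following sketch. The `Blackboard` interface is cut down to the two new methods, and `LegacyBlackboard` and `CompatibilityDemo` are hypothetical names used only to show that an implementation declared against the old name still works where the new one is expected.

```java
// Cut-down stand-in for the proposed interface (the real one has many more methods).
interface Blackboard {
  void commit();
  void invalidate();
}

/**
 * @deprecated kept only so that existing code still compiles; use Blackboard instead.
 */
@Deprecated
interface BlackboardService extends Blackboard { }

// Hypothetical legacy implementation that still names the old interface.
class LegacyBlackboard implements BlackboardService {
  int commits;

  public void commit() { commits++; }
  public void invalidate() { commits = 0; }
}

class CompatibilityDemo {
  // New code is written against Blackboard only.
  static void finish(Blackboard blackboard) {
    blackboard.commit();
  }

  public static void main(String[] args) {
    LegacyBlackboard legacy = new LegacyBlackboard();
    finish(legacy); // accepted, because BlackboardService extends Blackboard
    System.out.println("commits: " + legacy.commits); // prints "commits: 1"
  }
}
```

The shim only helps in one direction: code that declares parameters or fields of type BlackboardService still needs objects that implement the old interface, so such declarations would have to be migrated eventually.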

Further usage

  • The QueueWorker implementation is currently intended to also support operation without a blackboard by working directly with records. This could be changed to use a transient blackboard instead. I think this would make the QueueWorker code (especially the TaskListExecution service) a lot simpler: at the moment there are many conditions that decide whether a record must be synced to a blackboard, etc. For this, some code in the QueueWorker would have to be refactored to use "Id" instead of "Record" in method signatures.

Later extensions

  • createPersistingBlackboard() could be extended to support blackboards persisting into different Storage Points by adding a method parameter that names the storage point ID to use.
  • Maybe we can use the BlackboardFactory to add caching of records/attachments over multiple blackboard sessions.
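The storage point extension could take a shape like the sketch below. All names (`StoragePoint`, `StoragePointBlackboard`, `StoragePointAwareFactory`, `register`) are hypothetical, the exception class is a local stand-in, and how storage points are actually configured in SMILA is not modelled here.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the SMILA exception type.
class BlackboardAccessException extends Exception {
  BlackboardAccessException(String message) {
    super(message);
  }
}

// Hypothetical handle for one configured storage point.
class StoragePoint {
  final String id;

  StoragePoint(String id) {
    this.id = id;
  }
}

// Hypothetical persisting blackboard bound to exactly one storage point.
class StoragePointBlackboard {
  final StoragePoint target;

  StoragePointBlackboard(StoragePoint target) {
    this.target = target;
  }
}

class StoragePointAwareFactory {
  private final Map<String, StoragePoint> storagePoints = new HashMap<>();

  // Assumed registration hook; the real configuration mechanism is left open.
  void register(StoragePoint storagePoint) {
    storagePoints.put(storagePoint.id, storagePoint);
  }

  // Variant of createPersistingBlackboard() with the storage point ID as a parameter.
  StoragePointBlackboard createPersistingBlackboard(String storagePointId)
      throws BlackboardAccessException {
    StoragePoint storagePoint = storagePoints.get(storagePointId);
    if (storagePoint == null) {
      throw new BlackboardAccessException("unknown storage point: " + storagePointId);
    }
    return new StoragePointBlackboard(storagePoint);
  }
}
```

Keeping the parameterless method alongside such an overload would let existing callers continue to use a default storage point.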
