SMILA/Documentation/HowTo/Howto integrate a component in SMILA

This page summarizes the different types and complexity levels of integration of components in SMILA.

Integration of Services in BPEL

There are several options for integrating new functionality in SMILA BPEL workflows.

Simple: webservices

The simplest way of integrating additional functionality in SMILA is to call a webservice. This is standard BPEL workflow engine functionality, independent of SMILA. There are some limitations concerning the input and result data passed to and from webservices. Records are NOT accessible from a BPEL workflow! It is only possible to use the values contained in the BPEL workflow object. By default the workflow object contains only the ID of a record. Though it is possible to include Attributes and Annotations of Records in the workflow object, Attachments are not supported. The content of the workflow object can be configured by filters in the configuration file org.eclipse.eilf.blackboard/RecordFilters.xml. You need to include the Attributes/Annotations used for input as well as for output.
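The effect of such a filter can be sketched in plain Java. This is only an illustration of the idea, not the SMILA filter mechanism or the RecordFilters.xml format: the types SketchRecord and SketchRecordFilter, and the key prefixes used in the resulting map, are hypothetical names invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified record: not the SMILA Record API.
final class SketchRecord {
    final String id;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final Map<String, String> annotations = new LinkedHashMap<>();
    final Map<String, byte[]> attachments = new LinkedHashMap<>();

    SketchRecord(String id) {
        this.id = id;
    }
}

// Hypothetical filter: decides what is copied into the workflow object.
final class SketchRecordFilter {
    private final Set<String> attributeNames;
    private final Set<String> annotationNames;

    SketchRecordFilter(Set<String> attributeNames, Set<String> annotationNames) {
        this.attributeNames = attributeNames;
        this.annotationNames = annotationNames;
    }

    Map<String, String> toWorkflowObject(SketchRecord record) {
        Map<String, String> workflowObject = new LinkedHashMap<>();
        // The record ID is always present, even with an empty filter.
        workflowObject.put("id", record.id);
        for (String name : attributeNames) {
            if (record.attributes.containsKey(name)) {
                workflowObject.put("attribute:" + name, record.attributes.get(name));
            }
        }
        for (String name : annotationNames) {
            if (record.annotations.containsKey(name)) {
                workflowObject.put("annotation:" + name, record.annotations.get(name));
            }
        }
        // Attachments are deliberately never copied: they are not supported
        // in the workflow object.
        return workflowObject;
    }
}
```

With an empty filter the workflow object holds only the ID, so a webservice that needs a text to work on would receive nothing useful. An Attribute that is meant to receive the result has to be listed in the filter in the same way.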

Examples:

A good example of this use case is the integration of Language Weaver. The Language Weaver Translation Server provides a webservice interface that translates a text into another language. This service could easily be used within SMILA.

Here are more detailed technical descriptions:

How to integrate a webservice in a SMILA BPEL pipeline

Default: local SMILA Pipelet or ProcessingService

The default technique to integrate functionality or software in SMILA is to write a Pipelet or ProcessingService that runs in the same OSGi runtime as the BPEL workflow engine. The difference between the two is that a Pipelet's lifecycle and configuration are managed by the workflow engine, whereas ProcessingServices are not instantiated and configured by the workflow engine. They are implemented as OSGi services (preferably Declarative Services) and managed by the OSGi runtime, independently of the workflow engine. Thus a ProcessingService instance can be used by multiple workflows and has to be thread-safe. Pipelets, in contrast, are not shared by multiple workflows: each occurrence of a Pipelet in a workflow uses a separate Pipelet instance. Pipelets are easier to implement than ProcessingServices, as there is no need to worry about multithreaded access and no OSGi know-how is needed.
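The practical consequence of this difference can be sketched as follows. The classes below are hypothetical and do not use the SMILA interfaces; they only show that per-occurrence state can live in a plain field, while state in a shared instance needs thread-safe handling (here an AtomicLong, chosen for this example).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical Pipelet-like class: one instance per occurrence in a workflow,
// so a plain field is sufficient for its state.
final class CountingSketchPipelet {
    private long processed;

    long process(String[] recordIds) {
        processed += recordIds.length;
        return processed;
    }
}

// Hypothetical ProcessingService-like class: one instance shared by all
// workflows, so its state must tolerate concurrent calls.
final class CountingSketchService {
    private final AtomicLong processed = new AtomicLong();

    long process(String[] recordIds) {
        return processed.addAndGet(recordIds.length);
    }
}

final class LifecycleSketchDemo {
    // Simulates several workflows calling the same shared service in parallel
    // and returns the total number of records the service has counted.
    static long callSharedService(int workflows, int callsPerWorkflow) {
        CountingSketchService shared = new CountingSketchService();
        Thread[] workers = new Thread[workflows];
        for (int i = 0; i < workflows; i++) {
            workers[i] = new Thread(() -> {
                for (int call = 0; call < callsPerWorkflow; call++) {
                    shared.process(new String[] {"record"});
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        // An empty call adds nothing and just reads the current total.
        return shared.process(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(callSharedService(4, 1000));
    }
}
```

If the shared class used a plain long field like the Pipelet sketch does, concurrent updates could be lost. That is the kind of problem a Pipelet author does not have to consider.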

Both Pipelets and ProcessingServices have full access to the Records in SMILA via the BlackboardService, so it is easy to read, modify and store Records. In general, Pipelets and ProcessingServices follow the same logical steps, some of which are optional depending on the business logic to be executed. These steps are:

read the configuration (optional)
read input data from Blackboard (optional)
execute the business logic
write result data to Blackboard (optional)
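The four steps above can be sketched as a small class. This is a minimal illustration under stated assumptions: SketchBlackboard, InMemoryBlackboard and UpperCaseSketchPipelet are hypothetical names, the configuration keys and default attribute names are invented, and the real BlackboardService and Pipelet interfaces look different.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the BlackboardService; the real API differs.
interface SketchBlackboard {
    String getAttribute(String recordId, String name);

    void setAttribute(String recordId, String name, String value);
}

// In-memory implementation, for illustration only.
final class InMemoryBlackboard implements SketchBlackboard {
    private final Map<String, Map<String, String>> records = new HashMap<>();

    @Override
    public String getAttribute(String recordId, String name) {
        Map<String, String> attributes = records.get(recordId);
        return attributes == null ? null : attributes.get(name);
    }

    @Override
    public void setAttribute(String recordId, String name, String value) {
        records.computeIfAbsent(recordId, id -> new HashMap<>()).put(name, value);
    }
}

// Hypothetical Pipelet-like class that upper-cases one attribute into another.
final class UpperCaseSketchPipelet {
    private String inputAttribute;
    private String outputAttribute;

    // Step 1: read the configuration (keys and defaults are assumptions).
    void configure(Map<String, String> configuration) {
        inputAttribute = configuration.getOrDefault("inputAttribute", "Content");
        outputAttribute = configuration.getOrDefault("outputAttribute", "UpperContent");
    }

    String[] process(SketchBlackboard blackboard, String[] recordIds) {
        for (String recordId : recordIds) {
            // Step 2: read input data from the blackboard.
            String text = blackboard.getAttribute(recordId, inputAttribute);
            if (text == null) {
                continue; // nothing to do for this record
            }
            // Step 3: execute the business logic.
            String result = text.toUpperCase(Locale.ROOT);
            // Step 4: write result data back to the blackboard.
            blackboard.setAttribute(recordId, outputAttribute, result);
        }
        // The IDs are passed on unchanged to the next step in the workflow.
        return recordIds;
    }
}
```

Only record IDs travel through process(); all record data is read from and written to the blackboard, which matches the access pattern described above.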

In the part of your Pipelet/ProcessingService that implements the business logic you are totally free to use any desired technology. Some of the possibilities include

use of POJOs (e.g. see the various XML Processing Pipelets)
use any locally available OSGi service, even other ProcessingServices (e.g. see the AperturePipelet that uses the ApertureMimeTypeIdentifier)
use other technologies like JNI, RMI, CORBA, etc. to integrate remote or non-Java components (e.g. integration of Oracle Outside In Technology)
...

Examples:

Typical examples of Pipelets are the XML Processing Pipelets. These lightweight Pipelets are used for XML processing (e.g. XSL transformation). Each pipeline uses its own Pipelet instance.
A good example for a ProcessingService is the LuceneIndexService. It provides functionality to index Records in Lucene indexes and can be used from multiple pipelines in parallel.

Here are more detailed technical descriptions:

Advanced: remote SMILA ProcessingService

tbd.

Integration of Agents and Crawlers

SMILA's Connectivity Framework allows easy integration of additional datasources by providing implementations of Agents and/or Crawlers.

How to implement an Agent
How to implement a Crawler

Providing alternative implementations for SMILA Core-Components

SMILA's component-based architecture even allows you to provide your own implementations of SMILA core components. More info coming soon ...

Examples:

A typical example is an alternative implementation of the DeltaIndexingManager that does not store its state in memory but in the filesystem or in a database.
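The idea of swapping the storage behind a core component can be sketched like this. The interface SketchDeltaIndexing and its two methods are a hypothetical simplification invented for this example and do not reproduce the actual DeltaIndexingManager interface. The file-backed variant keeps one hash per record ID in a java.util.Properties file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical, simplified contract: not the real DeltaIndexingManager.
interface SketchDeltaIndexing {
    // True if the record is new or its hash differs from the stored one.
    boolean needsUpdate(String recordId, String hash);

    // Remembers the hash that was seen for the record.
    void markVisited(String recordId, String hash);
}

// Alternative implementation that keeps its state in a file, so it survives
// a restart of the runtime.
final class FileBackedDeltaIndexing implements SketchDeltaIndexing {
    private final Path stateFile;
    private final Properties state = new Properties();

    FileBackedDeltaIndexing(Path stateFile) {
        this.stateFile = stateFile;
        if (Files.exists(stateFile)) {
            try (InputStream in = Files.newInputStream(stateFile)) {
                state.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    @Override
    public synchronized boolean needsUpdate(String recordId, String hash) {
        return !hash.equals(state.getProperty(recordId));
    }

    @Override
    public synchronized void markVisited(String recordId, String hash) {
        state.setProperty(recordId, hash);
        // Rewrites the whole file on every change: simple, but only
        // reasonable for small amounts of state.
        try (OutputStream out = Files.newOutputStream(stateFile)) {
            state.store(out, "delta indexing state (sketch)");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers depend only on the interface, so an in-memory, file-backed or database-backed implementation can be exchanged without touching them. In SMILA this exchange would happen at the OSGi service level, which this sketch does not show.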

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.
