Difference between revisions of "SMILA/Documentation/HowTo/Howto integrate a component in SMILA"

(General review on content and style)
(General review on content and style)
Line 38: Line 38:
  
 
==== Simple: Integrating web services ====
 
==== Simple: Integrating web services ====
The simplest way of integrating additional functionality in SMILA is to call a web service. This is a standard BPEL workflow engine functionality independent of SMILA. However, there are some limitations concerning the input and result data to/from web services: In SMILA the workflow object (a DOM object) that enters the BPEL workflow contains by default only the [[SMILA/Glossary#I|Record IDs]]. [[SMILA/Glossary#R|Records]] and the data contained therin are NOT accessible from a BPEL workflow! The BPEL workflow can only access and use the values contained in the BPEL workflow object. It is possible to change this behaviour and add additional data to the workflow object by configuring filters in the configuration file <tt>org.eclipse.smila.blackboard/RecordFilters.xml</tt>. There you can select certain [[SMILA/Glossary#A|Attributes]] and [[SMILA/Glossary#A|Annotations]] that will be copied to the workflow object and so will be accessible by the BPEL workflow. [[SMILA/Glossary#A|Attachments]] are currently NOT supported, as binary data is not reasonable in DOM! Note that you also need to include all [[SMILA/Glossary#A|Attributes]] and [[SMILA/Glossary#A|Annotations]] in the <tt>RecordFilters.xml</tt> you want to write data to.
+
The simplest way of integrating additional functionality in SMILA is to call a web service, which is a standard BPEL workflow engine functionality independent of SMILA. However, there are some limitations concerning the input and result data to/from web services: The workflow object (a DOM object) that enters the BPEL workflow in SMILA contains only the record [[SMILA/Glossary#I|IDs]] by default. That means [[SMILA/Glossary#R|records]] and the data contained therein - [[SMILA/Glossary#A|attributes]], [[SMILA/Glossary#A|annotations]], and [[SMILA/Glossary#A|attachments]] - are '''not''' accessible from a BPEL workflow because it can only access and use the values contained in the BPEL workflow object.  
  
'''Examples''':
+
To overcome this restriction, you can add further data to the workflow object by defining filters in the configuration file located at <tt>org.eclipse.smila.blackboard/RecordFilters.xml</tt>. These filter rules define which [[SMILA/Glossary#A|attributes]] and [[SMILA/Glossary#A|annotations]] are copied to the workflow object and thus become accessible in the BPEL workflow. The <tt>RecordFilters.xml</tt> file must also list every attribute and annotation that you want to write data to. Filters work on attributes and annotations only; attachments of records cannot be accessed because binary data cannot reasonably be represented in a DOM.
* A good example for this use case is the integration of [[http://www.languageweaver.com/home.asp Language Weaver]]. The Language Weaver Translation Server provides a webservice interface that allows a text to be translated into another language. This service could be easily used within SMILA.
+
 
 +
===== Examples =====
 +
A good example of this use case is the integration of the [http://www.languageweaver.com/home.asp Language Weaver] web service. The Language Weaver Translation Server provides a web service interface that allows a text to be translated into another language. This service could easily be used within SMILA to extend its functionality.
 +
 
 +
===== Further reading =====
 +
Please consult the following how-to tutorials for a more detailed technical description:
  
Here are more detailed technical descriptions:
 
 
* [[SMILA/Development_Guidelines/How to filter and access record data in BPEL|How to filter and access record data in BPEL]]
 
* [[SMILA/Development_Guidelines/How to filter and access record data in BPEL|How to filter and access record data in BPEL]]
 
* [[SMILA/Development_Guidelines/How to integrate the HelloWorld webservice in BPEL|How to integrate the HelloWorld webservice in BPEL]]
 
* [[SMILA/Development_Guidelines/How to integrate the HelloWorld webservice in BPEL|How to integrate the HelloWorld webservice in BPEL]]
 
  
 
==== Default: Integrating local SMILA pipelets or processing services ====
 
==== Default: Integrating local SMILA pipelets or processing services ====
Line 62: Line 65:
 
* ...
 
* ...
  
 +
===== Examples =====
  
'''Examples''':
 
 
* Typical examples of [[SMILA/Glossary#P|Pipelets]] are the [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets.xmlprocessing|XML Processing Pipelets]]. These lightweight [[SMILA/Glossary#P|Pipelets]] are used for XML processing (e.g. XSL transformation). Each [[SMILA/Glossary#P|Pipeline]] uses its own [[SMILA/Glossary#P|Pipelet]] instance.
 
* Typical examples of [[SMILA/Glossary#P|Pipelets]] are the [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets.xmlprocessing|XML Processing Pipelets]]. These lightweight [[SMILA/Glossary#P|Pipelets]] are used for XML processing (e.g. XSL transformation). Each [[SMILA/Glossary#P|Pipeline]] uses its own [[SMILA/Glossary#P|Pipelet]] instance.
 
* A good example for a [[SMILA/Glossary#P|ProcessingService]] is the [[SMILA/Documentation/LuceneIndexService|LuceneIndexService]]. It provides functionality to index [[SMILA/Glossary#R|Records]] in Lucene indexes and can be used from multiple [[SMILA/Glossary#P|Pipelines]] in parallel.
 
* A good example for a [[SMILA/Glossary#P|ProcessingService]] is the [[SMILA/Documentation/LuceneIndexService|LuceneIndexService]]. It provides functionality to index [[SMILA/Glossary#R|Records]] in Lucene indexes and can be used from multiple [[SMILA/Glossary#P|Pipelines]] in parallel.
  
 +
===== Further reading =====
 
Here are more detailed technical descriptions:
 
Here are more detailed technical descriptions:
 
* [[SMILA/Development_Guidelines/How to write a Pipelet|How to write a Pipelet]]
 
* [[SMILA/Development_Guidelines/How to write a Pipelet|How to write a Pipelet]]
Line 77: Line 81:
 
== Integrating agents and crawlers ==
 
== Integrating agents and crawlers ==
  
SMILA's connectivity framework allows easy integration of additional datasources by providing implementations of [[SMILA/Glossary#A|Agents]] and/or [[SMILA/Glossary#C|Crawlers]].
+
SMILA's connectivity framework allows easy integration of additional data sources by providing implementations of [[SMILA/Glossary#A|Agents]] and/or [[SMILA/Glossary#C|Crawlers]].
 +
 
 +
=== Examples ===
 +
tbd.
 +
 
 +
=== Further reading ===
  
 
* [[SMILA/Development_Guidelines/How to implement an agent|How to implement an agent]]
 
* [[SMILA/Development_Guidelines/How to implement an agent|How to implement an agent]]
Line 87: Line 96:
 
More info coming soon ...
 
More info coming soon ...
  
'''Examples''':
+
=== Examples ===
* a typical example is an alternative implementation of the DeltaIndexingManager that does not store it's state in memory but in the filesystem or in a database
+
  
 +
A typical example is an alternative implementation of the DeltaIndexingManager that does not store its state in memory but in the filesystem or in a database.
 +
 +
=== Further reading ===
 +
tbd.
 
[[Category:SMILA]]
 
[[Category:SMILA]]

Revision as of 07:12, 30 September 2008

This page summarizes the different types and complexity levels for the integration of components in SMILA.

Introduction

Due to its architecture, SMILA allows third-party components to be integrated easily into its framework. Currently, three integration scenarios are available:

  • Integrating services in BPEL (figure: Integrate-Service.png): This is probably the most frequently used integration scenario. It allows for the integration or exchange of functionality (services, third-party software, etc.) used to process records in the workflow engine. The figure demonstrates how you can integrate the functionality of your service or piece of software into SMILA by adding it to the workflow engine.
  • Integrating agents and crawlers (figure: Integrate-Crawler.png): Integrating your own crawler or agent implementations is another common scenario for adding functionality to SMILA. By doing so, further data sources can be unlocked to provide additional input to SMILA. The figure shows, by way of example, how you can add your own crawler implementation to SMILA. You may add an agent implementation in the same way; for simplicity, this option is not shown in the figure.
  • Integrating alternative implementations for SMILA core components (figure: Provide-Alternative-To-Core-Component.png): This scenario is intended primarily for experienced (SMILA) developers: you can replace existing implementations of the SMILA core components with your own. The figure demonstrates how two of the SMILA core components (connectivity and data store) may be replaced by your own implementations. These components serve as examples only; you may also replace other core components such as the blackboard service or the delta indexing manager.

The figures show, by way of example, at which levels of the SMILA architecture new components can be integrated. For simplicity, they are restricted to the index processing chain and ignore the search processing chain, which offers similar integration options but is not the focus of this page.

Integrating services in BPEL

As already shown in the overview above, SMILA offers the possibility to integrate your own service or piece of software into SMILA BPEL workflows. In SMILA we simply call these workflows pipelines. A pipeline is the definition of a BPEL process (or workflow) that orchestrates pipelets, processing services, and other BPEL services (e.g. web services).

There are several ways to achieve this:

  • Simple: The easiest method to add functionality is to invoke a web service by using the standard functionality of BPEL. However, the disadvantage is that not all data of SMILA records are accessible if you opt for this method of integration.
  • Default: The recommended way to integrate additional functionality in SMILA is to provide Java implementations of two interfaces that allow for an easy creation of the above-mentioned pipelets and processing services.
  • Advanced: This method extends the default mechanism by providing an alternative procedure for integrating processing services that do not run in the same OSGi runtime as the BPEL workflow but in another OSGi runtime that may even run on a remote machine.

Simple: Integrating web services

The simplest way of integrating additional functionality in SMILA is to call a web service, which is a standard BPEL workflow engine functionality independent of SMILA. However, there are some limitations concerning the input and result data to/from web services: The workflow object (a DOM object) that enters the BPEL workflow in SMILA contains only the record IDs by default. That means records and the data contained therein - attributes, annotations, and attachments - are not accessible from a BPEL workflow because it can only access and use the values contained in the BPEL workflow object.

To overcome this restriction, you can add further data to the workflow object by defining filters in the configuration file located at org.eclipse.smila.blackboard/RecordFilters.xml. These filter rules define which attributes and annotations are copied to the workflow object and thus become accessible in the BPEL workflow. The RecordFilters.xml file must also list every attribute and annotation that you want to write data to. Filters work on attributes and annotations only; attachments of records cannot be accessed because binary data cannot reasonably be represented in a DOM.
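The effect of such filter rules can be illustrated with a small Java sketch. It is not SMILA code and does not reproduce the RecordFilters.xml format: the class RecordFilterSketch, the method toWorkflowObject, and the "id" key are hypothetical names chosen for illustration, and records are reduced to plain string maps. It only shows the idea that the workflow object carries the record ID by default and receives exactly those attributes that a filter names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names belong to the SMILA API.
class RecordFilterSketch {

    /** Builds a simplified workflow object from a record and a set of filtered attribute names. */
    static Map<String, String> toWorkflowObject(String recordId,
                                                Map<String, String> recordAttributes,
                                                Set<String> filteredAttributes) {
        Map<String, String> workflowObject = new LinkedHashMap<>();
        // The record ID is always part of the workflow object.
        workflowObject.put("id", recordId);
        // Only attributes named by the filter are copied; everything else stays hidden.
        for (String name : filteredAttributes) {
            if (recordAttributes.containsKey(name)) {
                workflowObject.put(name, recordAttributes.get(name));
            }
        }
        return workflowObject;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Map.of("title", "Hello", "content", "Some text");
        // Without a filter, only the ID is visible to the workflow.
        System.out.println(toWorkflowObject("record-1", attributes, Set.of()).keySet());
        // With a filter on "title", that attribute becomes visible as well.
        System.out.println(toWorkflowObject("record-1", attributes, Set.of("title")).keySet());
    }
}
```

In this simplified model an attribute that the filter does not name never reaches the workflow, which mirrors why attributes you want to write to must be listed as well.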

Examples

A good example of this use case is the integration of the Language Weaver web service. The Language Weaver Translation Server provides a web service interface that allows a text to be translated into another language. This service could easily be used within SMILA to extend its functionality.

Further reading

Please consult the following how-to tutorials for a more detailed technical description:

  • How to filter and access record data in BPEL
  • How to integrate the HelloWorld webservice in BPEL

Default: Integrating local SMILA pipelets or processing services

The default technique to integrate functionality or software in SMILA is to write a Pipelet or ProcessingService that runs in the same OSGi runtime as the BPEL workflow engine. Pipelets are easier to implement than ProcessingServices, as they require only standard Java knowledge. For more information about both, see Pipelets and ProcessingServices. Both Pipelets and ProcessingServices have full access to Records in SMILA via the BlackboardService, so they can easily read, modify, and store Records. In general, Pipelets and ProcessingServices follow the same logical steps, some of which are optional depending on the business logic to be executed:

  • read the configuration (optional)
  • read input data from Blackboard (optional)
  • execute the business logic
  • write result data to Blackboard (optional)
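
The four steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the actual SMILA interfaces: InMemoryBlackboard, UppercasePipelet, and the configuration keys inputAttribute and outputAttribute are hypothetical stand-ins, and the real Pipelet and BlackboardService contracts are described in the how-to pages listed under "Further reading".

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; these types only imitate the roles of a blackboard and a pipelet.
class PipeletStepsSketch {

    /** Stand-in for a blackboard: record ID -> (attribute name -> value). */
    static class InMemoryBlackboard {
        private final Map<String, Map<String, String>> records = new HashMap<>();

        String read(String recordId, String attribute) {
            Map<String, String> record = records.get(recordId);
            return record == null ? null : record.get(attribute);
        }

        void write(String recordId, String attribute, String value) {
            records.computeIfAbsent(recordId, id -> new HashMap<>()).put(attribute, value);
        }
    }

    /** Hypothetical pipelet that upper-cases one attribute and stores the result in another. */
    static class UppercasePipelet {
        private String inputAttribute = "text";
        private String outputAttribute = "textUpper";

        // Step 1 (optional): read the configuration.
        void configure(Map<String, String> config) {
            inputAttribute = config.getOrDefault("inputAttribute", inputAttribute);
            outputAttribute = config.getOrDefault("outputAttribute", outputAttribute);
        }

        List<String> process(InMemoryBlackboard blackboard, List<String> recordIds) {
            for (String id : recordIds) {
                // Step 2 (optional): read input data from the blackboard.
                String input = blackboard.read(id, inputAttribute);
                if (input == null) {
                    continue; // nothing to process for this record
                }
                // Step 3: execute the business logic.
                String result = input.toUpperCase(Locale.ROOT);
                // Step 4 (optional): write result data to the blackboard.
                blackboard.write(id, outputAttribute, result);
            }
            return recordIds;
        }
    }

    public static void main(String[] args) {
        InMemoryBlackboard blackboard = new InMemoryBlackboard();
        blackboard.write("record-1", "text", "hello smila");

        UppercasePipelet pipelet = new UppercasePipelet();
        pipelet.configure(Map.of("inputAttribute", "text", "outputAttribute", "textUpper"));
        pipelet.process(blackboard, List.of("record-1"));

        System.out.println(blackboard.read("record-1", "textUpper"));
    }
}
```

The business logic in step 3 is deliberately trivial; in a real component this is the place where any technology of your choice would be called.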

In the part of your Pipelet or ProcessingService that implements the business logic, you are free to use any technology you like. Some of the possibilities include

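Examples

  • Typical examples of Pipelets are the XML Processing Pipelets. These lightweight Pipelets are used for XML processing (e.g. XSL transformation). Each Pipeline uses its own Pipelet instance.
  • A good example of a ProcessingService is the LuceneIndexService. It provides functionality to index Records in Lucene indexes and can be used from multiple Pipelines in parallel.

Further reading

Here are more detailed technical descriptions:

  • How to write a Pipelet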

Advanced: Integrating remote SMILA processing services

tbd.

Integrating agents and crawlers

SMILA's connectivity framework allows easy integration of additional data sources by providing implementations of Agents and/or Crawlers.

Examples

tbd.

Further reading

  • How to implement an agent

Integrating alternative implementations for SMILA core components

SMILA's component-based architecture even allows you to provide your own implementations of SMILA core components. More information coming soon ...

Examples

A typical example is an alternative implementation of the DeltaIndexingManager that does not store its state in memory but in the filesystem or in a database.
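
To make the idea of filesystem-backed state more concrete, here is a rough Java sketch. It does not implement SMILA's DeltaIndexingManager interface: DeltaIndexStore, needsUpdate, and markVisited are hypothetical, simplified names invented for this illustration, and the state is kept in a java.util.Properties file purely as an assumption about how such state could be persisted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sketch; the contract below is not SMILA's DeltaIndexingManager.
class FileBackedDeltaIndexSketch {

    /** Simplified contract: remembers one content hash per record ID. */
    interface DeltaIndexStore {
        boolean needsUpdate(String recordId, String contentHash);

        void markVisited(String recordId, String contentHash) throws IOException;
    }

    /** Keeps the state in a properties file so that it survives a restart. */
    static class FileBackedStore implements DeltaIndexStore {
        private final Path stateFile;
        private final Properties state = new Properties();

        FileBackedStore(Path stateFile) throws IOException {
            this.stateFile = stateFile;
            if (Files.exists(stateFile)) {
                try (InputStream in = Files.newInputStream(stateFile)) {
                    state.load(in); // restore previously persisted state
                }
            }
        }

        @Override
        public boolean needsUpdate(String recordId, String contentHash) {
            // A record needs (re)indexing if it is unknown or its hash has changed.
            return !contentHash.equals(state.getProperty(recordId));
        }

        @Override
        public void markVisited(String recordId, String contentHash) throws IOException {
            state.setProperty(recordId, contentHash);
            try (OutputStream out = Files.newOutputStream(stateFile)) {
                state.store(out, "delta indexing state (sketch)");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("delta-index", ".properties");
        DeltaIndexStore store = new FileBackedStore(file);
        System.out.println(store.needsUpdate("doc-1", "hash-a"));
        store.markVisited("doc-1", "hash-a");

        // A new instance reading the same file sees the persisted state.
        DeltaIndexStore reopened = new FileBackedStore(file);
        System.out.println(reopened.needsUpdate("doc-1", "hash-a"));
        Files.delete(file);
    }
}
```

Rewriting the whole file on every update keeps the sketch short; a database-backed variant would replace only the load and store calls.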

Further reading

tbd.
