SMILA/Project Concepts/Components and Modules


Modules are functional components or placeholders for them.

Architecture Overview - Service Level.png


Technical proposal

{info} Note: This section may only be edited by the assigned developer(s). They are also responsible for reflecting any agreed changes or details from the discussion section here.

Outdated information can be found at: SMILA/Project_Concepts/Outdated {info}


The connectivity module is the entry point for external data sources. It is the single point of entry at the information level.

The connectivity module normalizes incoming information to an internally used message format. Large sets of incoming data can also be persisted to external storage to reduce the load on the queue.

The connectivity module can also provide feedback to data source integration processes through an envelope mechanism. The envelope reports back to the IRM, e.g. whether the data is up to date or incomplete, or whether the module was unable to accept the data at all. Data can also be exchanged in larger sets (multiple records at a time) to reduce round trips. In addition, delta indexing support is provided to allow high-performance IRM-based indexing.
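The envelope mechanism could look roughly like the following Java sketch. All names here (`Status`, `Envelope`, `SourceRecord`, `ConnectivitySketch`) are hypothetical and not part of any SMILA API, and the hash comparison is only one assumed way to detect unchanged records for delta indexing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical feedback states reported to the IRM (REJECTED is not used in this sketch).
enum Status { ACCEPTED, UP_TO_DATE, INCOMPLETE, REJECTED }

// Hypothetical envelope: one feedback entry per submitted record id.
final class Envelope {
    final Map<String, Status> feedback = new HashMap<>();
}

// Hypothetical record as sent by a data source integration process.
final class SourceRecord {
    final String id;
    final String content; // null models incomplete data

    SourceRecord(String id, String content) {
        this.id = id;
        this.content = content;
    }
}

final class ConnectivitySketch {
    // Last seen content hash per record id (assumed delta indexing state).
    private final Map<String, Integer> knownHashes = new HashMap<>();
    // Records accepted for further processing.
    final List<SourceRecord> accepted = new ArrayList<>();

    // Takes several records at once to reduce round trips and answers with an envelope.
    Envelope submit(List<SourceRecord> batch) {
        Envelope envelope = new Envelope();
        for (SourceRecord r : batch) {
            if (r.content == null) {
                envelope.feedback.put(r.id, Status.INCOMPLETE);
                continue;
            }
            int hash = r.content.hashCode();
            Integer previous = knownHashes.get(r.id);
            if (previous != null && previous == hash) {
                // Unchanged since the last submission: nothing to reprocess.
                envelope.feedback.put(r.id, Status.UP_TO_DATE);
            } else {
                knownHashes.put(r.id, hash);
                accepted.add(r);
                envelope.feedback.put(r.id, Status.ACCEPTED);
            }
        }
        return envelope;
    }
}
```

Submitting the same record twice with identical content yields `ACCEPTED` the first time and `UP_TO_DATE` the second time, so the data source can skip work on its side.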

Routing (assigning information to information processes) and buffering will be implemented within the connectivity module.
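As an illustration only, routing and buffering could be combined as in this minimal Java sketch. The class `RouterSketch`, the idea of routing by data source name, and the default process are assumptions made for the example; the concept does not yet define how routes are chosen.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

final class RouterSketch {
    // Routing table: data source name -> information process name (both hypothetical).
    private final Map<String, String> routes = new HashMap<>();
    // One buffer per information process.
    private final Map<String, Queue<String>> buffers = new HashMap<>();
    private final String defaultProcess;

    RouterSketch(String defaultProcess) {
        this.defaultProcess = defaultProcess;
    }

    void addRoute(String source, String process) {
        routes.put(source, process);
    }

    // Assigns the record to a process and buffers it until the process fetches it.
    String route(String source, String record) {
        String process = routes.getOrDefault(source, defaultProcess);
        buffers.computeIfAbsent(process, k -> new ArrayDeque<>()).add(record);
        return process;
    }

    // Returns the next buffered record for a process, or null if none is waiting.
    String next(String process) {
        Queue<String> queue = buffers.get(process);
        return queue == null ? null : queue.poll();
    }
}
```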

No transport protocols have been defined yet. We expect that supporting several communication protocols will be an advantage.


The BPEL module is part of the data flow process. This is where all information is annotated and where further actions can be executed. Every operation is based on services. Sample services:

  • Stellent
  • Ontologies
  • Language analysis
  • Rule engine
  • Creation of events (for publish and subscribe)

The BPEL module allows debugging in a controlled environment (e.g. UI).

The following integration steps will be performed:

  1. Creation of an optimized object model for Java

The know-how and skills required of developers are taken into account during design. The development know-how needed should be as low as possible. Core technologies should be limited to Java, XML and OSGi.

  2. On a strategic level SCA support is important. The integration should allow an abstraction of the communication model between the components used in the BPEL workflow. It is important that SCA gives us advantages when dealing with unstable Pipelets.
  3. Support for web services is a bonus, not core functionality.

From Publish/Subscribe: It could be interesting to deliver a delta with the change notification (e.g. property XY has been modified). BPEL is document oriented, so we have to define what counts as technical information (meta or control data) and what counts as real data. Open question: does BPEL support variables?

The target architecture will do all message processing within the BPEL module. Making the XML message processor obsolete requires the introduction of two different configuration classes. In the new design "system processes" (e.g. splitting, filters, persistence, ...) and "data manipulation processes" (e.g. ontologies, file conversions, mime type detection, ...) are separated. This separation is a logical concept, not a technical one. End users must be able to tell the two process types apart in the UI or by file extension.

Architecture Overview - BPEL.png

Data Storage

The data storage provides storage of and access to information. This information can be stored by different means. For version 1 of SMILA an XML-based storage is being considered, because this storage type allows easy lookup of information (search, access, change, ...).

Later SMILA versions could use distributed file systems. To optimize performance, combined approaches could be considered (e.g. small data to XML; large files to distributed storage).
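The combined approach can be sketched in Java as a store that picks its backend by payload size. The `RecordStore` interface, the in-memory stand-ins and the byte threshold are all assumptions for illustration; the concept does not prescribe a storage API or a cut-off value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage abstraction; a real storage API may look different.
interface RecordStore {
    void put(String id, byte[] data);
    byte[] get(String id);
}

// In-memory stand-in used here for both the XML storage and the distributed storage.
final class InMemoryStore implements RecordStore {
    private final Map<String, byte[]> data = new HashMap<>();

    public void put(String id, byte[] bytes) { data.put(id, bytes); }

    public byte[] get(String id) { return data.get(id); }
}

// Routes small payloads to one store and large payloads to another.
final class CombinedStore implements RecordStore {
    private final RecordStore small;
    private final RecordStore large;
    private final int thresholdBytes; // assumed cut-off, to be tuned
    // Remembers which backend holds the current version of each record.
    private final Map<String, RecordStore> location = new HashMap<>();

    CombinedStore(RecordStore small, RecordStore large, int thresholdBytes) {
        this.small = small;
        this.large = large;
        this.thresholdBytes = thresholdBytes;
    }

    public void put(String id, byte[] bytes) {
        RecordStore target = bytes.length <= thresholdBytes ? small : large;
        // Note: an overwrite that crosses the threshold leaves a stale copy
        // in the other backend; cleanup is omitted in this sketch.
        location.put(id, target);
        target.put(id, bytes);
    }

    public byte[] get(String id) {
        RecordStore source = location.get(id);
        return source == null ? null : source.get(id);
    }
}
```

Callers only see `CombinedStore`, so the threshold or the backends can change later without touching the code that reads and writes records.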

To discuss: how to select data.

The data storage also contains further information per document (e.g. process or record related meta data). This information should support internal processes (e.g. delta indexing).

  • To discuss: are different versions of a record of interest?
  • To discuss: binary handling (external [probably distributed] storage, security, backup, ...)


The queue is currently the means of distributing load across the nodes of an installation. Messages are inserted into the queue, and listeners retrieve messages from it whenever they have free processing capacity. The listeners then assign the retrieved messages to the appropriate data flow process.
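A minimal Java sketch of this pattern, using a plain in-process `BlockingQueue` in place of a real message queue. The classes `QueueMessage`, `QueueListener` and `QueueDemo` are hypothetical, each thread stands in for one node, and the short poll timeout exists only to let the example terminate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical message: payload plus the data flow process that should handle it.
final class QueueMessage {
    final String process;
    final String payload;

    QueueMessage(String process, String payload) {
        this.process = process;
        this.payload = payload;
    }
}

// A listener stands for one node: it fetches a message only when it is free to process one.
final class QueueListener implements Runnable {
    private final String node;
    private final BlockingQueue<QueueMessage> queue;
    private final List<String> log;

    QueueListener(String node, BlockingQueue<QueueMessage> queue, List<String> log) {
        this.node = node;
        this.queue = queue;
        this.log = log;
    }

    public void run() {
        try {
            QueueMessage m;
            // Stop once the queue stays empty for a short while (keeps the sketch finite).
            while ((m = queue.poll(200, TimeUnit.MILLISECONDS)) != null) {
                // Hand the message to its data flow process; here we only log the hand-over.
                log.add(node + " -> " + m.process + ": " + m.payload);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

final class QueueDemo {
    // Fills the queue, starts one listener thread per node and waits until all are done.
    static List<String> run(int messages, int nodes) {
        BlockingQueue<QueueMessage> queue = new LinkedBlockingQueue<>();
        List<String> log = new CopyOnWriteArrayList<>();
        for (int i = 0; i < messages; i++) {
            queue.add(new QueueMessage("indexing", "doc" + i));
        }
        List<Thread> threads = new ArrayList<>();
        for (int n = 0; n < nodes; n++) {
            Thread t = new Thread(new QueueListener("node-" + n, queue, log));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log;
    }
}
```

Calling `QueueDemo.run(10, 3)` processes each of the ten messages exactly once; which node handles which message depends on thread timing.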

Scheduler (P2)

The scheduler service is a component that controls recurring invocations of functionality.

Each kind of functionality in SMILA should be callable by this service.

Sample activities:

  • Execute a BPEL process
  • Deletion of temporary data
  • Start backup process
  • Updating data
  • Remark: Quartz, a Java library similar to crontab [[1]]
  • Warning: Distributed installations
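The sample activities above could be registered with a scheduler along the lines of this Java sketch. It deliberately uses the JDK's `ScheduledExecutorService` rather than Quartz to stay dependency-free, so it covers only fixed-rate repetition, not crontab-style expressions or coordination across distributed nodes. `SchedulerSketch` and its methods are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SchedulerSketch {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // Registers a recurring activity, e.g. "delete temporary data", every periodMillis.
    void every(long periodMillis, Runnable activity) {
        executor.scheduleAtFixedRate(activity, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Cancels all scheduled activities.
    void shutdown() {
        executor.shutdownNow();
    }

    // Demo: runs a counting activity until it has fired `runs` times or 5 seconds have passed.
    static int demo(int runs, long periodMillis) {
        SchedulerSketch scheduler = new SchedulerSketch();
        AtomicInteger counter = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        scheduler.every(periodMillis, () -> {
            counter.incrementAndGet(); // stands in for the real activity
            done.countDown();
        });
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdown();
        }
        return counter.get();
    }
}
```

In a distributed installation a single local executor like this would fire on every node, which is the concern behind the warning above.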

Mashup (P2)

The mashup component can combine information from different storages. The goal is to create a new information level (e.g. documents with extended author information). The new information is then sent to the connectivity module for further handling.

Communication (P2)

The communication module gives other applications access to SMILA-enhanced data. That way annotated information can be accessed easily.

RSS and JMS Publish/Subscribe (P2)

The RSS functionality is implemented as a simple web application. The web application allows users to watch for modifications in collections or data storages.

The selection functionality used to create an RSS feed depends strongly on the capabilities of the data storage.

The implementation of the publish/subscribe message patterns is controlled by event configurations in BPEL processes.

With this functionality it is possible to send messages to any component (e.g. a queue or external components). Changes in information can be sent to interested parties in this way.

It could be interesting to deliver a delta with the change notification (e.g. property XY has been modified). BPEL is document oriented, so we have to define what counts as technical information (meta or control data) and what counts as real data. Open question: does BPEL support variables?
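To make the delta idea concrete, here is a small Java sketch of a publisher that compares a record's properties before and after a change and notifies subscribers only about what differs. `ChangeEvent` and `ChangePublisher` are hypothetical names, and the flat string-to-string property map is an assumption; it says nothing about how BPEL itself would carry this data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.function.Consumer;

// Hypothetical change event: record id plus only the properties whose values changed.
final class ChangeEvent {
    final String recordId;
    // Property name -> new value; a null value means the property was removed.
    final Map<String, String> changedProperties;

    ChangeEvent(String recordId, Map<String, String> changedProperties) {
        this.recordId = recordId;
        this.changedProperties = changedProperties;
    }
}

final class ChangePublisher {
    private final List<Consumer<ChangeEvent>> subscribers = new ArrayList<>();

    void subscribe(Consumer<ChangeEvent> subscriber) {
        subscribers.add(subscriber);
    }

    // Computes the delta between two property sets and notifies subscribers if anything changed.
    void publish(String recordId, Map<String, String> before, Map<String, String> after) {
        Map<String, String> delta = new TreeMap<>();
        for (Map.Entry<String, String> e : after.entrySet()) {
            if (!Objects.equals(before.get(e.getKey()), e.getValue())) {
                delta.put(e.getKey(), e.getValue()); // added or modified property
            }
        }
        for (String key : before.keySet()) {
            if (!after.containsKey(key)) {
                delta.put(key, null); // removed property
            }
        }
        if (delta.isEmpty()) {
            return; // nothing changed, so no notification is sent
        }
        ChangeEvent event = new ChangeEvent(recordId, delta);
        for (Consumer<ChangeEvent> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

A subscriber such as an RSS feed generator or a JMS bridge would then receive only the modified properties instead of the whole document.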

Configuration Manager (P2)

The configuration manager allows transparent access to configuration files. This service is also provided in a distributed installation.
