Difference between revisions of "SMILA/Documentation/HowTo/How to write a Worker"

From Eclipsepedia

Revision as of 07:48, 26 September 2011


WORK IN PROGRESS

This how-to describes the necessary steps for writing a Worker in SMILA.

Preconditions

  • Your integration environment must be set up; see http://wiki.eclipse.org/SMILA/Development_Guidelines/How_to_set_up_integration_environment.
  • You should have read and understood the JobManager documentation, especially the configuration of workers and workflows, if you want to create new workers.
  • You should have at least an idea about the OSGi framework and OSGi services. For links to introductory articles and tutorials see [1]. For a quite comprehensive overview on OSGi see [2]. Especially, SMILA makes intensive use of OSGi's Declarative Services facility, so you may want to have at least a quick look at it.

Project templates

If you have performed the steps from http://wiki.eclipse.org/SMILA/Development_Guidelines/How_to_set_up_integration_environment your workspace should contain four projects:

  • org.eclipse.smila.integration.worker: template bundle for worker development, containing an example worker class
  • org.eclipse.smila.integration.worker.test: template bundle to test a developed worker, containing an example test class
  • org.eclipse.smila.integration.feature: feature definition to group together all custom bundles. This will be convenient when you have more than one custom bundle.
  • SMILA.application: template bundle to launch the whole SMILA application, containing the developed worker

Running

You should now test your workspace setup to make sure that everything works with the prepared template projects.

Run the application

In the menu, open "Run" -> "Run Configurations" or "Debug Configurations" and you should find an entry "OSGi Frameworks" -> "SMILA". Select it and click "Run" or "Debug" and SMILA should start just like when started from the command line. The configuration of this SMILA instance is in your workspace in "SMILA.application/configuration". When you start SMILA.launch in Eclipse, you should see something like the following output in the console window:

...
Added worker HelloWorldWorker to WorkerManager.
...

You should also be able to read the worker definition using the jobmanager HTTP API now: Go to http://localhost:8080/smila/jobmanager/workers/ to see something like this:

{
  "workers" : [ ...,
     {
       "name" : "HelloWorldWorker",
       "url" : "http://localhost:8080/smila/jobmanager/workers/HelloWorldWorker/"
     }, ... ] 
}

You can now click on the link to the worker description and you should see the description of the HelloWorldWorker:

{
  "name" : "HelloWorldWorker",
  "input" : [ {
    "name" : "inputRecords",
    "type" : "recordBulks"
  } ],
  "output" : [ {
    "name" : "outputRecords",
    "type" : "recordBulks"
  } ]
}

Run the test case

Please be sure that you have stopped the SMILA.launch. To run the JUnit test case for the HelloWorldWorker, open Run -> Run Configurations again, select JUnit Plugin Test -> TestHelloWorldWorker and run it. The output should contain a line like this:

TestHelloWorldWorker: Value of attribute 'greeting' = 'HelloWorldWorker was here :-)'

This shows that the HelloWorldWorker has done something. Of course, the test also contains an assertion, so it would really fail if the attribute did not have the expected value.

Create your own worker

Use template

The easiest way to create a new worker is by implementing it in the bundle org.eclipse.smila.integration.worker contained in the SDK. There you can just place your new worker beside the HelloWorldWorker example worker, or replace it. For easy development you should not change the bundle/package names. Things you have to do when renaming the bundle/package or creating your own worker bundle are described later on.

Bundle dependencies

The dependencies of the bundle are managed by the OSGi framework, so they have to be explicitly configured in the MANIFEST.MF file. This allows the OSGi framework to resolve the dependencies (in the correct versions) when the services are started.

To create a worker that reads and writes Records, we need at least the following bundles imported as packages (see META-INF/MANIFEST.MF Dependencies/Imported Packages):

  • org.eclipse.smila.objectstore (Possible exceptions when accessing input/output streams)
  • org.eclipse.smila.taskworker (The TaskWorker bundle containing the Worker and TaskContext interfaces).
  • org.eclipse.smila.taskworker.input (Input streams of the TaskWorker bundle)
  • org.eclipse.smila.taskworker.output (Output streams of the TaskWorker bundle)
  • org.eclipse.smila.datamodel (for the Record class)

This is already configured. If access to other packages is needed, just extend the MANIFEST.MF Imported Packages section.

Worker Implementation Java Class

Create a worker class which implements org.eclipse.smila.taskworker.Worker. Have a look at the example worker org.eclipse.smila.integration.worker.HelloWorldWorker that comes with the SDK in the org.eclipse.smila.integration.worker bundle. You must implement two methods:

  • getName() must return a unique name for your worker. Exactly the same name (case sensitive) must be used later in the worker descriptions and workflow definitions.
  • perform() does the actual work. It is called with a TaskContext object that provides access to the task properties, input and output objects, and counters.
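The two methods can be sketched as follows. This is a minimal illustration, not the SDK's HelloWorldWorker: SketchWorker, SketchTaskContext and InMemoryTaskContext are hypothetical, simplified stand-ins defined here so that the example compiles without SMILA on the classpath. A real worker implements org.eclipse.smila.taskworker.Worker and reads and writes records through the input and output slots of the real TaskContext. The attribute name "greeting" and its value are taken from the test output shown earlier.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified stand-in for org.eclipse.smila.taskworker.TaskContext.
interface SketchTaskContext {
  Iterator<Map<String, Object>> inputRecords();        // records read from the input slot
  void writeOutputRecord(Map<String, Object> record);  // record written to the output slot
}

// Hypothetical stand-in for org.eclipse.smila.taskworker.Worker.
interface SketchWorker {
  String getName();

  void perform(SketchTaskContext taskContext);
}

class GreetingWorkerSketch implements SketchWorker {
  // Must match the name used in the worker description and in workflow definitions.
  static final String NAME = "HelloWorldWorker";

  @Override
  public String getName() {
    return NAME;
  }

  @Override
  public void perform(final SketchTaskContext taskContext) {
    final Iterator<Map<String, Object>> records = taskContext.inputRecords();
    while (records.hasNext()) {
      final Map<String, Object> record = records.next();
      record.put("greeting", NAME + " was here :-)"); // mark the record as processed
      taskContext.writeOutputRecord(record);
    }
  }
}

// In-memory context used only to exercise the sketch outside of SMILA.
class InMemoryTaskContext implements SketchTaskContext {
  private final List<Map<String, Object>> _input;

  final List<Map<String, Object>> output = new ArrayList<>();

  InMemoryTaskContext(final List<Map<String, Object>> input) {
    _input = input;
  }

  @Override
  public Iterator<Map<String, Object>> inputRecords() {
    return _input.iterator();
  }

  @Override
  public void writeOutputRecord(final Map<String, Object> record) {
    output.add(record);
  }
}
```

Keeping the worker name in a single constant makes it easier to keep getName(), workers.json and the workflow definitions in sync.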

OSGi Declarative Service

Every worker must be declared as an OSGi Declarative Service (DS) in order to be registered properly with the worker framework. To configure your worker as a DS component you have to put an appropriate XML file in the folder <WORKSPACE>/bundles/org.eclipse.smila.integration.worker/OSGI-INF.

The file can be created with the Component Definition wizard.

Have a look at helloworldworker.xml as an example:

<?xml version="1.0" encoding="UTF-8"?>
<scr:component xmlns:scr="http://www.osgi.org/xmlns/scr/v1.1.0" name="HelloWorldWorker" immediate="true">
    <implementation class="org.eclipse.smila.integration.worker.HelloWorldWorker" />                           
    <service>
       <provide interface="org.eclipse.smila.taskworker.Worker"/>
    </service>                  
</scr:component>

The file describes the interface that the worker has to implement (and through which it will be accessed in the OSGi application by means of dependency injection), which class is the concrete implementor of that interface, which services it references (our simple worker does not reference any, you can find a description later on) and which name the service has.

To describe your own worker you can just create a copy of the OSGI-INF/helloworldworker.xml file in the same directory. Then change at least the "name" attribute in the root element and the "class" attribute in the "implementation" element.


When you don't need the HelloWorldWorker anymore you may want to remove at least its component definition file from the bundle. Otherwise, it will always be running and asking for tasks in the final deployment. While this should not really be a problem, it causes some unnecessary overhead that can easily be avoided.

Activate the Worker: config.ini file

This file describes which OSGi bundles are automatically started and defines the start order. You can find this file in: <WORKSPACE>/SMILA.application/configuration

Check that the custom bundle is added with an appropriate start level (level 4 is usually fine). One of the last lines should look like this:

org.eclipse.smila.integration.worker@4:start, \
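To make the structure of such an entry explicit, here is a small sketch that splits it into bundle name, start level and auto-start flag. It assumes the common Equinox form name[@[level][:start]] and is only an illustration of the syntax; BundleEntrySketch is a hypothetical helper and not part of SMILA or Equinox.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that illustrates the "<bundle>@<start level>:start" syntax.
final class BundleEntrySketch {
  // group 1: bundle name, group 2: optional start level, group 3: optional "start" marker
  private static final Pattern ENTRY = Pattern.compile("^([^@,]+)(?:@(\\d+)?:?(start)?)?$");

  final String bundleName;

  final int startLevel; // -1 if the entry does not set a start level

  final boolean autoStart;

  private BundleEntrySketch(final String bundleName, final int startLevel, final boolean autoStart) {
    this.bundleName = bundleName;
    this.startLevel = startLevel;
    this.autoStart = autoStart;
  }

  static BundleEntrySketch parse(final String rawLine) {
    // Strip whitespace, the separating "," and the line continuation "\" at the end of the line.
    String entry = rawLine.trim();
    while (entry.endsWith("\\") || entry.endsWith(",")) {
      entry = entry.substring(0, entry.length() - 1).trim();
    }
    final Matcher matcher = ENTRY.matcher(entry);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Not a bundle entry: " + rawLine);
    }
    final int level = matcher.group(2) == null ? -1 : Integer.parseInt(matcher.group(2));
    return new BundleEntrySketch(matcher.group(1), level, matcher.group(3) != null);
  }
}
```

For the line above, parse(...) yields the bundle name org.eclipse.smila.integration.worker, start level 4 and autoStart set to true.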


Register your worker in jobmanager configuration

These are the steps to use your new worker with the jobmanager framework.

Worker definition

Edit workers.json from <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add the definition for the new worker.

Important: The name in the worker definition has to be the same that is returned by the getName() method in the worker implementation!

For the example worker HelloWorldWorker we want to use one input slot and one output slot. We use recordBulks as the data object type because we want to modify (bulks of) records with this worker:

{ "name": "HelloWorldWorker",
  "input": [ 
         {  "name": "inputRecords",
            "type": "recordBulks"
         } ],
  "output": [ 
         {  "name": "outputRecords",
            "type": "recordBulks"
         } ]
}

Workflow definition

To use your worker in a workflow you have to add a new workflow or change an existing one. You can either use the jobmanager API to add a workflow definition to the running system, or you can edit workflows.json from <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add/change a workflow.

This example is a test workflow that uses the HelloWorldWorker to manipulate all records which were pushed into the system using the bulkbuilder. Because it's pretty useless as such, we did not add it to SMILA.application/configuration/org.eclipse.smila.jobmanager/workflows.json, but it's used in the unit test bundle org.eclipse.smila.integration.worker.test: the test case reads the output bulk created by the HelloWorldWorker to check whether it has been running.

{
   "name":"HelloWorldWorkflow",
   "startAction":{
      "worker":"smilaBulkbuilder",
      "output":{
         "insertedRecords":"importBucket"
      }
   },
   "actions":[
      {
         "worker":"HelloWorldWorker",
         "input":{
            "inputRecords":"importBucket"
         },
         "output":{
            "outputRecords":"helloWorldExportBucket"
         }
      }
   ]
}
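If you prefer the jobmanager API over editing workflows.json, the upload could be sketched like this with the JDK's HTTP client (Java 11 or later). Note the assumptions: the workflows URL is guessed by analogy with the /workers/ URL shown earlier and should be checked against the jobmanager API documentation, and WorkflowUploadSketch is a hypothetical helper class.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that posts a workflow definition to a running SMILA instance.
final class WorkflowUploadSketch {
  // ASSUMPTION: the workflows resource is located next to the /workers/ resource.
  static final String ASSUMED_WORKFLOWS_URL = "http://localhost:8080/smila/jobmanager/workflows/";

  // Builds the POST request without sending it, so it can be inspected first.
  static HttpRequest buildRequest(final String url, final String workflowJson) {
    return HttpRequest.newBuilder(URI.create(url))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(workflowJson, StandardCharsets.UTF_8))
      .build();
  }

  // Sends the request; requires a running SMILA instance at the given URL.
  static String send(final HttpRequest request) throws IOException, InterruptedException {
    final HttpResponse<String> response =
      HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    return response.statusCode() + " " + response.body();
  }
}
```

A call such as WorkflowUploadSketch.send(WorkflowUploadSketch.buildRequest(WorkflowUploadSketch.ASSUMED_WORKFLOWS_URL, json)) would then submit the JSON shown above, with json holding the workflow definition as a string.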

Bucket definition

If you want to use a new persistent bucket for your workflow (see jobmanager documentation) you have to add it via the jobmanager API or add it to the configuration: edit buckets.json from the <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and create the desired bucket.

Here's an example from the test bundle org.eclipse.smila.integration.worker.test for the workflow above that makes the final bucket helloWorldExportBucket persistent. For the unit test, the output bucket of the worker must be persistent so that the test case can still read the result records when the workflow has ended. Otherwise the jobmanager would remove the transient object immediately after the HelloWorldWorker has finished.

{
   "name":"helloWorldExportBucket",
   "type":"recordBulks"
}

Testing

Use the launcher

If everything was done correctly and you start the SMILA.launch in Eclipse, you should see something like the following output in the console window, but with the name of your own worker:

...
Added worker HelloWorldWorker to WorkerManager.
...

You should also be able to read your worker definition using the jobmanager HTTP API now: Go to http://localhost:8080/smila/jobmanager/workers/ to see something like this:

{
  "workers" : [ ...,
     {
       "name" : "HelloWorldWorker",
       "url" : "http://localhost:8080/smila/jobmanager/workers/HelloWorldWorker/"
     }, ... ] 
}

Click on the link to the worker description and you should see the description you added to workers.json earlier.
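The same check can be scripted. The following sketch uses the workers URL from above and a deliberately naive regular expression instead of a JSON parser, so treat it as a quick smoke test only; WorkerRegistrationCheckSketch is a hypothetical helper class, and fetch(...) needs a running SMILA instance.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

// Hypothetical helper to check whether a worker shows up in the jobmanager's worker list.
final class WorkerRegistrationCheckSketch {
  static final String WORKERS_URL = "http://localhost:8080/smila/jobmanager/workers/";

  // URL of a single worker description, following the "url" values in the listing.
  static URI descriptionUri(final String workerName) {
    return URI.create(WORKERS_URL + workerName + "/");
  }

  // Naive check: looks for a "name" : "<workerName>" pair anywhere in the listing.
  static boolean listsWorker(final String workersJson, final String workerName) {
    final Pattern entry = Pattern.compile("\"name\"\\s*:\\s*\"" + Pattern.quote(workerName) + "\"");
    return entry.matcher(workersJson).find();
  }

  // Fetches a resource as text; requires a running SMILA instance.
  static String fetch(final URI uri) throws IOException, InterruptedException {
    final HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
    return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).body();
  }
}
```

With SMILA running, listsWorker(fetch(URI.create(WORKERS_URL)), "HelloWorldWorker") should return true once the worker is registered.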


Create worker unit test

You can use the test bundle template org.eclipse.smila.integration.worker.test to add a test for your worker. Have a look at the example test class org.eclipse.smila.integration.worker.test.TestHelloWorldWorker that comes with the SDK.

All configuration files for the test are in org.eclipse.smila.integration.worker.test/configuration. This is similar to SMILA.application/configuration, but contains only the configuration files necessary to run the tests, not all files needed by a complete system. Also, some configuration files may differ from those in SMILA.application, e.g. some components may be configured with smaller limits to make tests run quicker. However, if you create a new worker, you must add its description to the workers.json in the test bundles and define persistent buckets and workflows required to run the test. Additionally make sure that the config.ini contains the names of your worker bundles and those of services your worker needs to access.

To start the test in Eclipse use the test bundle's launch configuration that comes with the SDK. In Eclipse: Run Configurations/JUnit Plug-in Test/org.eclipse.smila.integration.worker.test. This runs all test cases in package org.eclipse.smila.integration.worker.test. You can use the dialog to restrict the run to only the test cases you really need.

Manually installing the worker in SMILA

In the following we describe the steps to deploy your worker to an existing SMILA installation.

Create a feature project

A feature project is a container project that defines the plug-ins needed for a specific feature. In our case the feature provides a single worker, so it includes only one plug-in. It can also be reasonable to group all worker plug-ins needed for a specific scenario into one feature, so that a single deployment includes every required plug-in. The SDK already contains a prepared project org.eclipse.smila.integration.feature that includes the custom worker bundle. If you create further worker bundles (or other SMILA extensions) you can just add them to this feature (see below).

If you ever need to create your own feature project you can use Eclipse's New... wizard:

  • New --> Plug-in Development --> Feature Project
    • Enter a Project name
    • Version e.g. 1.0.0 (should match the version of your plug-in)
    • Fill in other feature properties to describe the new feature
  • Next
    • select your worker plugin
  • Finish

Deploy your features

Now it's easy to export your custom bundles to files that can be easily deployed into SMILA:

  • Select your feature project
  • Right-click on it
  • Click on Export...
  • Select Plug-in Development --> Deployable features
  • Next
  • Select your new worker feature(s)
  • Select a destination folder. If you are re-exporting after changes (especially after renames), you should first delete the destination folder.
  • Click Finish

After that you will find plugins and features directories in your destination directory that contain the deployable software. The export process produces two additional files, artifacts.jar and contents.jar, which are not needed for our purposes.

Install your worker feature in a local copy of SMILA

  • Copy the features and plugins folders to your SMILA installation.
  • Merge your configuration changes (e.g. configuration/org.eclipse.smila.jobmanager) into the SMILA configuration.
    • Copy your configuration/config.ini file (see above) or edit the installed config.ini directly so that your bundle is started.
      • E.g. for the above bundle and version this would be (in the second-to-last line): org.eclipse.smila.integration.worker@4:start, \
  • Start your system.
  • In data/log/smila.log you should now find a line like this:
...
2011-06-06 15:17:15,035 INFO  [Component Resolve Thread (Bundle 5)          ]  internal.WorkerManagerImpl                    - Added worker HelloWorldWorker to WorkerManager.
...

Of course, additionally you should be able to retrieve the worker description you added to configuration/org.eclipse.smila.jobmanager/workers.json via the JobManager REST API.

Manual installation on the SMILA file repository

After testing the worker thoroughly on your system, you can repeat the above steps on your SMILA file repository to distribute the modified SMILA containing the new worker to all systems in a cluster.

The location of the repository can be found by listing your software.ini file. In the repository you can find a SMILA directory. Apply the changes described in the section above to this directory.


Advanced How To's

How to access another OSGi Service inside your Worker

SMILA comes with many components that offer APIs for different purposes. Sometimes you may want to access such an API inside your worker. With OSGi Declarative Services (DS) this is just a matter of configuration.

Example: Reading all cluster nodes

Assume we want to know the names of all cluster nodes in our worker. This is possible via the ClusterConfigService API. Here are the steps to access this API in your worker:

  • Precondition: We assume you already configured your worker as OSGi Declarative Service as described before.
  • To use the ClusterConfigService you have to import the appropriate package org.eclipse.smila.clusterconfig in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")
  • Configure ClusterConfigService as referenced service in the service description xml (OSGI-INF/...):
<?xml version="1.0" encoding="UTF-8"?>
<scr:component xmlns:scr="http://www.osgi.org/xmlns/scr/v1.1.0" name="MyWorker" immediate="true">
    <implementation class="mypackage.MyWorkerImpl" />
    <service>
       <provide interface="org.eclipse.smila.taskworker.Worker"/>
    </service>        
    <reference bind="setClusterConfigService"
               cardinality="1..1"
               interface="org.eclipse.smila.clusterconfig.ClusterConfigService"
               name="ClusterConfigService"
               policy="static"
               unbind="unsetClusterConfigService"/>
</scr:component>
  • Implement the specified methods setClusterConfigService and unsetClusterConfigService in your worker implementation. This may look like this:
  private ClusterConfigService _ccs;

  public void setClusterConfigService(final ClusterConfigService ccs) {
    _ccs = ccs;
  }

  public void unsetClusterConfigService(final ClusterConfigService ccs) {
    if (_ccs == ccs) {
      _ccs = null;
    }
  }
  • Now, the OSGi framework will automatically set the NCSService (which implements the interface ClusterConfigService) in your worker at startup via the specified method. So the ClusterConfigService API will be accessible at runtime:
   ...
   List<String> clusterNodes = _ccs.getClusterNodes();
   ...
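The bind/unbind life cycle can be tried out without an OSGi runtime. In the sketch below, ClusterConfigServiceSketch is a hypothetical one-method stand-in for org.eclipse.smila.clusterconfig.ClusterConfigService (only getClusterNodes() is taken from the text above), and the test code plays the role of the DS runtime by calling the bind and unbind methods itself. The null check in clusterNodesOrEmpty() is purely defensive: with a static 1..1 reference, the component should only be active while the service is bound.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for org.eclipse.smila.clusterconfig.ClusterConfigService.
interface ClusterConfigServiceSketch {
  List<String> getClusterNodes();
}

class NodeListingWorkerSketch {
  // volatile because bind/unbind may be called from a different thread than perform().
  private volatile ClusterConfigServiceSketch _ccs;

  // Called by the DS runtime when the referenced service becomes available.
  public void setClusterConfigService(final ClusterConfigServiceSketch ccs) {
    _ccs = ccs;
  }

  // Called by the DS runtime when the referenced service goes away.
  public void unsetClusterConfigService(final ClusterConfigServiceSketch ccs) {
    if (_ccs == ccs) {
      _ccs = null;
    }
  }

  // Reads the field once so that a concurrent unbind cannot cause a NullPointerException.
  List<String> clusterNodesOrEmpty() {
    final ClusterConfigServiceSketch ccs = _ccs;
    return ccs == null ? Collections.<String>emptyList() : ccs.getClusterNodes();
  }
}
```

Comparing the instance in the unset method, as the original example does, ensures that a late unbind call for an old service instance does not clear a newer one.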


How to add / access a configuration for your Worker

You can add a worker configuration, e.g. a property file, by adding it to the application configuration.

Example: Adding a property file "myWorker.properties" and access it in the worker

  • To add a worker configuration create an appropriate folder in the application configuration and place the property file there:
  SMILA.application/configuration/MY_BUNDLE_NAME/myWorker.properties 
  • The easiest way to access the configuration in your worker is via the org.eclipse.smila.utils.config.ConfigUtils class
  • To use this class you have to import the appropriate package org.eclipse.smila.utils.config in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")
    • For the following example code you should also import org.apache.commons.io
  • Your code could look something like this:
   
    InputStream configFileStream = null;
    try {
      configFileStream = ConfigUtils.getConfigStream(MY_BUNDLE_NAME, "myWorker.properties");
      Properties props = new Properties();
      props.load(configFileStream);
      ...      
    } finally {
      if (configFileStream != null) {
        IOUtils.closeQuietly(configFileStream);
      }
    }
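If you would rather not depend on org.apache.commons.io, the same pattern can be written with try-with-resources from the JDK alone. The sketch below takes the stream as a parameter; in a worker it would come from ConfigUtils.getConfigStream(...) as shown above. WorkerPropertiesSketch and the property key "greeting" are hypothetical and only serve as an example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: loads a properties file and always closes the stream.
final class WorkerPropertiesSketch {
  static Properties load(final InputStream configStream) {
    // try-with-resources closes the stream even if load() throws.
    try (InputStream in = configStream) {
      final Properties props = new Properties();
      props.load(in);
      return props;
    } catch (final IOException e) {
      throw new UncheckedIOException("Could not read worker configuration", e);
    }
  }
}
```

A call like WorkerPropertiesSketch.load(stream).getProperty("greeting", "Hello") then reads a single value and falls back to a default if the key is missing.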

Add on: Read configuration at startup

  • If you want to initialize your worker by configuration at startup, you can use the activate() method automatically called by the OSGi framework at bundle startup.
  • To use an activate method you have to import the package org.osgi.service.component in the MANIFEST.MF.
  • Then your code could look like this:
 protected void activate(final ComponentContext context) {
    try {
      readConfiguration();
      ...


Exception Handling and Logging

Exception Handling:

There are three possible ways your worker's perform() method can finish when processing its current task:

  • return (without exception): The normal case where you just processed the task without errors. The task will be finished and marked as successful.
  • throw a RecoverableTaskException: If you get an error, but you see a chance that the same task could be successful when being processed next time, you can throw a RecoverableTaskException. This will cause the current task to be finished but retried later on. (Hint: For internal reasons, UnavailableException and IOException will also cause a retry.)
  • throw a "non-retry" Exception: These are all exceptions not mentioned before. The current task will be marked as failed and not be retried.
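The three cases can be summarized in a small decision sketch. It only mirrors the rules listed above and is not SMILA's actual implementation: RecoverableTaskExceptionSketch is a hypothetical local stand-in for SMILA's RecoverableTaskException, TaskOutcomeSketch and TaskOutcomeRules are invented for this illustration, and UnavailableException is left out to keep the example self-contained.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical local stand-in for SMILA's RecoverableTaskException.
class RecoverableTaskExceptionSketch extends Exception {
  private static final long serialVersionUID = 1L;

  RecoverableTaskExceptionSketch(final String message, final Throwable cause) {
    super(message, cause);
  }
}

enum TaskOutcomeSketch {
  SUCCESSFUL, RETRY, FAILED
}

final class TaskOutcomeRules {
  // Runs the body of a perform() call and reports how the task would be treated.
  static TaskOutcomeSketch outcomeOf(final Callable<Void> performBody) {
    try {
      performBody.call();
      return TaskOutcomeSketch.SUCCESSFUL; // normal return: task finished successfully
    } catch (final RecoverableTaskExceptionSketch | IOException e) {
      return TaskOutcomeSketch.RETRY; // task is finished now but retried later
    } catch (final Exception e) {
      return TaskOutcomeSketch.FAILED; // any other exception: no retry
    }
  }
}
```

In your own worker this means: wrap errors that are likely to be temporary (for example a service that is busy right now) in a RecoverableTaskException, and let all other exceptions propagate so that the task is marked as failed.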

Logging:

You can use the log4j logging that comes with SMILA in your worker too. Your logging output will be logged in the standard smila.log.

  • import the package org.apache.commons.logging in the MANIFEST.MF.

Then your code could look something like this:

   private final Log _log = LogFactory.getLog(getClass());
   ...
   _log.debug("My worker was successful");
   ...


Create a worker in a new bundle or rename the template bundle

For creating a new bundle:

  • Follow the description here to create a new bundle.

For renaming a bundle:

  • Right-click the bundle to rename in Eclipse and select Refactor/Rename.
  • Right-click the Java package and select Refactor/Rename.
  • Open MANIFEST.MF and set a version property on the (renamed) exported package (Runtime/Exported Packages).

Hint: if there are strange compile problems afterwards, and refreshing or cleaning the projects doesn't help, try restarting your Eclipse IDE.

MANIFEST.MF / OSGI-INF / build.properties:

  • Adapt your OSGI-INF component description XML file to the changes
  • Please be sure that your OSGi component definition file is included in the MANIFEST.MF file in the Service-Component section! Otherwise the service component will not be recognized and thus not be started.
  • Please be sure that the OSGI-INF/ folder is included in your build.properties

test bundle:

  • Adapt the test bundle to the changes:
    • change the name of the test bundle and Java package (Refactor/Rename, as described above for the worker bundle itself).
    • correct the imported packages in the code and the MANIFEST.MF (if not done correctly by refactoring)
    • adapt the test's run configuration, e.g. name, test bundle's java package, configuration file location (on tab "configuration")
    • adapt the config.ini file

Application launch:

  • Add the new/renamed bundle to the Eclipse launcher and also to your application's configuration/config.ini file with an appropriate start level.

feature project:

  • You have to add your new/renamed bundle to the feature project.
  • clear the destination folder for feature exports.