SMILA/Documentation/Scripting

Revision as of 04:29, 23 October 2014 by Andreas.weber.empolis.com (Talk | contribs) (Using Pipelets - best practice)

Scripting SMILA using Javascript

Work In Progress

Service Description

  • Bundles: org.eclipse.smila.scripting(.test)
  • OSGi service interface: org.eclipse.smila.scripting.ScriptingEngine
  • Service implementation: org.eclipse.smila.scripting.internal.JavascriptEngine

The ScriptingEngine provides an alternative for describing "synchronous workflows" by using Javascript functions instead of BPEL processes. This approach is easier, more flexible and more maintainable (e.g. easier to debug), so the BPEL approach might eventually be removed completely.

Scripts Basics

A Javascript function for SMILA scripting takes one record (including attachments) as an argument and can return one record (other return types are supported, too, and are wrapped in a record automatically). For example, a file helloWorld.js (the suffix must be ".js") could look like this:

function greetings(record) {
  record.greetings = "Hello " + record.name + "!";
  return record;
}

Script files to execute are added by default to SMILA/configuration/org.eclipse.smila.scripting/js. They are currently loaded "on-demand" and not stored in the service for reuse, so changes in the files take effect with the next execution.

A script is invoked using the ScriptingEngine.callScript() methods. The first argument of both methods is a "scriptName" string in format "<file>.<function>" where the <file> part is the name of the script file (without path and ".js" suffix) and the <function> part is the name of a function defined in this file.
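The name format can be illustrated with a small helper that splits a scriptName into its two parts. This is only a sketch: parseScriptName is a hypothetical function and not part of the ScriptingEngine API, and splitting at the last dot is an assumption (a Javascript function name cannot contain a dot).

```javascript
// Hypothetical helper, not part of SMILA: splits "<file>.<function>" into its parts.
function parseScriptName(scriptName) {
  // Assumption: the last dot separates the file part from the function part.
  var dot = scriptName.lastIndexOf(".");
  if (dot <= 0 || dot === scriptName.length - 1) {
    throw new Error("scriptName must have the format <file>.<function>: " + scriptName);
  }
  return {
    file: scriptName.substring(0, dot),        // script file without path and ".js" suffix
    functionName: scriptName.substring(dot + 1) // function defined in that file
  };
}

var parts = parseScriptName("helloWorld.greetings");
// parts.file is "helloWorld", parts.functionName is "greetings"
```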

Exposing script functions

The script directory can contain "script catalog" files. They can be used to expose and describe available scripts in the ReST API so that a client can detect available scripts. Such a file must be named <prefix>ScriptCatalog.js, e.g. smilaScriptCatalog.js and must have this format:

[
  {
    name: "helloWorld.greetings",
    description: "Get a Hello from SMILA!"
  },
  // ... more function descriptions
]

A catalog file does not define functions; it just produces an array of script function descriptions. A description object must contain a "name" property, and we recommend including a "description" property as well. Other properties can be added as you like (e.g. a structured description of the parameters expected in the passed record).

The ScriptingEngine.listScripts() method merges the arrays produced by all catalog scripts into one array (elements that are not objects or do not have a "name" property are ignored) and sorts them by name.

The name property must be in the format <file>.<function>, as described above for the scriptName parameter of the callScript() methods.
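The merging rules of listScripts() can be sketched in plain Javascript as follows. This is not the actual SMILA implementation; mergeCatalogs is a hypothetical name, and requiring the "name" property to be a string is an assumption made here so that sorting is well-defined.

```javascript
// Sketch of the merge described above; not the actual SMILA implementation.
function mergeCatalogs(catalogs) {
  var merged = [];
  catalogs.forEach(function (catalog) {
    catalog.forEach(function (entry) {
      var isObject = entry !== null && typeof entry === "object" && !Array.isArray(entry);
      // Elements that are not objects or have no usable "name" are ignored.
      if (isObject && typeof entry.name === "string") {
        merged.push(entry);
      }
    });
  });
  // Sort the remaining descriptions by name.
  merged.sort(function (a, b) {
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
  });
  return merged;
}

var scripts = mergeCatalogs([
  [ { name: "helloWorld.greetings", description: "Get a Hello from SMILA!" }, 42 ],
  [ { description: "entry without a name" }, { name: "another.example" } ]
]);
// scripts now contains "another.example" followed by "helloWorld.greetings"
```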

Configuration

  • The script directory can be changed on startup using a system property: SMILA -Dsmila.scripting.dir=/home/smila/js .... The system property can also be added to SMILA.ini, of course.


Scripting Features ("SDK")

See the Rhino Documentation for special Javascript features available in Rhino; they should work in SMILA, too. In particular, the predefined functions of the Rhino Shell should be available (as far as they are useful in SMILA). For example, you can use print(...) to write something to the console:

  print("Hello World!");

(However, the quit() function will do nothing ;-)

Working with Records

The record passed to the script can be accessed just like a native Javascript object. The record attributes are just treated as object properties:

  record.string = "a string";
  record["integer"] = 42;
  record.double = 3.14;
  record.boolean = true;
  record.map = {
    key : "value"
  };
  record.sequence = [ "Hello", record.string, record.integer, record.double ];

  delete record.name;

Iterating over maps and sequences is possible, too:

  for ( var key in record.map) {
    // use bracket notation: record.map.key would always read the property named "key"
    print("map " + key + " to " + record.map[key]);
  }

  for ( var index in record.sequence) {
    print("element " + index + ": " + record.sequence[index]);
  }  

The record object has three special properties, whose names start with a dollar sign ($):

  • $id: The string value of attribute _recordid. This is just a convenience property. It can be used to read and write the record ID:
  var recordId = record.$id;
  record.$id = "changed-id";
  • $metadata: in some cases it is necessary to use the actual AnyMap object containing the record metadata, for example if you want to call a Java method that defines a parameter of type Any or AnyMap:
  var writer = new org.eclipse.smila.datamodel.ipc.IpcAnyWriter(true);
  var recordAsJson = writer.writeJsonObject(record.$metadata);
  • $attachments: contains an object that provides access to the record attachments. Its properties correspond to attachment names and can be used to get and set attachment contents of the record
    • When reading an attachment, an actual org.eclipse.smila.datamodel.Attachment object is returned that can be accessed using its Java methods and passed to other Java objects:
  var attachment = record.$attachments.Content;
  var contentLength = attachment.size();
  var contentAsByteArray = attachment.getAsBytes();
  var contentAsStream = attachment.getAsStream();
  
  var contentAsString = new java.lang.String(contentAsByteArray, "utf-8");
    • To set an attachment, several types of objects are supported to provide the content:
      • Java byte Arrays, of course:
  record.$attachments.fromBytes = contentAsByteArray;
      • String (more exactly, java.lang.CharSequence) objects are converted to byte arrays using UTF-8 encoding:
  record.$attachments.fromString = "string attached";
      • java.io.InputStream objects are read into a byte array and set as an attachment. The stream will be closed after the operation:
  var stream = new java.io.FileInputStream(filename);
  record.$attachments.fromStream = stream;
      • An org.eclipse.smila.datamodel.Attachment can be used, too. If the names match, the actual Attachment object will just be attached to the record. Otherwise, the implementation will fetch the content from the source attachment and create a new Attachment object from it (with the current implementation of Attachments in SMILA this will NOT result in copying the actual byte[]). If getting the content does not work, an error will be thrown (however, this cannot happen currently).
  record.$attachments.copyAttachment = record.$attachments.originalAttachment;
    • To delete an attachment, use the delete operator:
  delete record.$attachments.Content;

record.$attachments and record.$metadata cannot be used for write-access themselves. The delete operator will not work on any of the special properties.

Accessing OSGi services

Any active OSGi services in the SMILA VM can be easily accessed from within a script. Just use the globally registered services object. For example:

  • Use LanguageIdentifier service:
  var languageId = services.find("org.eclipse.smila.common.language.LanguageIdentifyService");
  record.language = languageId.identify(record.Content).getIsoLanguage();
  • Write record to ObjectStore:
  var objectstore = services.find("org.eclipse.smila.objectstore.ObjectStoreService");
  objectstore.ensureStore("store-created-by-script");

  var bonWriter = new org.eclipse.smila.datamodel.ipc.IpcAnyWriter(true);
  var bonObject = bonWriter.writeBinaryObject(record.$metadata);
  
  objectstore.putObject("store-created-by-script", "bon-object", bonObject);

See the documentation of the respective services for details on how to use them.

Using Pipelets

It is also possible to use pipelets. You must create a pipelet instance first using the global pipelets.create function and a configuration object, then you can invoke the created pipelet instance using the process function of the instance:

function processTika(record) {
  var tikaConfig = {
    "inputType" : "ATTRIBUTE",
    "outputType" : "ATTRIBUTE",
    "inputName" : "Content",
    "outputName" : "PlainContent",
    "contentTypeAttribute" : "MimeType",
    "exportAsHtml" : false,
    "maxLength" : "-1",
    "extractProperties" : [ {
      "metadataName" : "title",
      "targetAttribute" : "Title",
      "singleResult" : true
    } ]
  };
  var tika = pipelets.create("org.eclipse.smila.tika.TikaPipelet", tikaConfig);
  tika.process(record);
  return record;
}

The process() function accepts single records and arrays of records as well as single or arrays of Javascript objects that can be converted to AnyMap objects. Arrays of records or objects will be processed in a single pipelet invocation.

The process function always returns an array of records, even if only one record was given as input. This is because some pipelets create new records or split the input record into multiple output records.

So the signature of the process function looks like this:

  Record[] process(Record)
  Record[] process(Record[])
  Record[] process(AnyMap)
  Record[] process(AnyMap[])
  Record[] process(<Javascript-Map>)
  Record[] process(<Javascript-Map>[])

The result of a pipelet invocation can be given to another pipelet for further processing or returned as the final function result.

Using Pipelets - best practice:

Normally, pipelets just modify the given input records and do not create new records. In this case, do not use the result of a pipelet for further script processing; just continue to work with the input record. This way you do not have to care about the process function always returning an array.

Example-1: Best practice

function processTika(record) {
  ... 
  my1stPipelet.process(record);
  record.greetings = "Hello world";
  my2ndPipelet.process(record);
  ...
  return record;
}

Example-2: When working with the pipelet result, you have to deal with arrays:

function processTika(record) {
  ... 
  var result1 = my1stPipelet.process(record);
  result1[0].greetings = "Hello world";
  var result2 = my2ndPipelet.process(result1);
  ...
  return result2[0];
}
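If a script does need the pipelet result and expects exactly one record, the array handling can be hidden in a small helper. The following is only a sketch: processSingle is a hypothetical function that is not provided by SMILA, and fakePipelet is a stand-in object so that the example also runs outside of SMILA.

```javascript
// Hypothetical helper, not part of SMILA: unwraps a single-record pipelet result.
function processSingle(pipelet, record) {
  var results = pipelet.process(record);
  if (results.length !== 1) {
    // e.g. the pipelet has split the input record or created additional records
    throw new Error("expected exactly one result record, got " + results.length);
  }
  return results[0];
}

// Stand-in for a real pipelet instance (assumption, for illustration only):
// it marks each input record and always returns an array, like process() does.
var fakePipelet = {
  process: function (input) {
    var records = Array.isArray(input) ? input : [ input ];
    records.forEach(function (r) {
      r.processed = true;
    });
    return records;
  }
};

var result = processSingle(fakePipelet, { name: "Juergen" });
// result is the single processed record, not an array
```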

Loading other scripts

TODO

Logging

TODO

HTTP REST API

Manage Scripts

URL: http://<hostname>:8080/smila/script

Methods:

  • GET: list exposed scripts - show the result of ScriptingEngine.listScripts(). No parameters, no request body.

Response:

  • result of ScriptingEngine.listScripts(), wrapped as a JSON object with property "scripts" containing the array of script descriptions:
{
    "scripts": [
        {
            "name": "helloWorld.greetings",
            "description": "Get a Hello from SMILA!",
            "url": "http://localhost:8080/smila/script/helloWorld.greetings/"
        }
    ]
}


URL: http://<hostname>:8080/smila/script/<scriptfile>.<function>

Methods:

  • GET: show script description. No parameters, no request body.

Response:

  • description object from above list with the matching name.

Response-Codes

  • 200 OK: Success
  • 404 Not Found: Function is not exposed in any ScriptCatalog file.

Example: GET /smila/script/helloWorld.greetings yields:

{
    "name": "helloWorld.greetings",
    "description": "Get a Hello from SMILA!",
    "url": "http://localhost:8080/smila/script/helloWorld.greetings/"
}

Execute a script

URL: http://<hostname>:8080/smila/script/<script-file>.<function>

Methods:

  • POST: execute script with record in request body. Attachments are supported, too.

Response:

  • Metadata part of result of ScriptingEngine.callScript("<script-file>.<function>", requestRecord). If the result contains attachments they are not returned via the ReST-API.

Response-Codes

  • 200 OK: Script executed successfully.
  • 400 Bad Request: Last URL part does not have <script-file>.<function> format, or an error occurred during script execution
  • 404 Not Found: Script file does not exist or does not contain the function

Example request:

POST http://localhost:8080/smila/script/helloWorld.greetings
{
  "name": "Juergen"
}

Response:

{
    "name": "Juergen",
    "greetings": "Hello Juergen!"
}
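For a Javascript client, the request above could be prepared as in the following sketch. buildExecuteRequest is a hypothetical helper, and the Content-Type header is an assumption that is not stated in this documentation. The helper only builds the request description and does not send anything.

```javascript
// Hypothetical client-side helper: describes the POST request for executing a script.
function buildExecuteRequest(hostname, scriptName, record) {
  return {
    url: "http://" + hostname + ":8080/smila/script/" + scriptName,
    method: "POST",
    // Assumption: the record is sent as JSON, as in the example request above.
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(record)
  };
}

var request = buildExecuteRequest("localhost", "helloWorld.greetings", { name: "Juergen" });
// request.url is "http://localhost:8080/smila/script/helloWorld.greetings"
```

The resulting object can be handed to any HTTP client; the metadata of the result record is then expected as JSON in the response body.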
