JSON REST API for SMILA BPEL pipelines

SMILA now has an HTTP REST API that allows managing and invoking BPEL processing workflows.

Reference

Note: The trailing slash in URLs is optional.

Get Pipeline Overview or Add/Update a Pipeline

GET: Returns a list of all deployed BPEL pipelines including URLs to access their definition.

POST: Adds or updates a BPEL pipeline. Returns an object containing the timestamp of the creation/modification and a URL to the pipeline definition.

Supported operations:

  • GET: Gets a list of all available BPEL pipelines.
  • POST: Adds a BPEL pipeline, or updates it if it already exists. The request JSON object consists of a "name" and a "definition" field. The latter contains the pipeline description in BPEL format. If the pipeline is currently in use, the update takes a little longer, and new invocations of this pipeline are blocked until the update is finished (usually about 100 ms).

Usage:

  • URL: http://<hostname>:8080/smila/pipeline/
  • Allowed methods:
    • GET (no further URL parameters and no request body allowed)
    • POST (request body with a "name" and "definition" field is mandatory)
  • Response status codes:
    • 200 OK: Upon successful execution of GET.
    • 201 CREATED: Upon successful creation/update of the pipeline.
    • 400 BAD REQUEST: If the POST request body is empty or has syntax errors, if the pipeline name is invalid, or if you try to update a predefined pipeline.
    • 500 INTERNAL SERVER ERROR: Any other error.
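
As a rough illustration of the overview resource, the following Python sketch builds the GET request and extracts the pipeline names from a response body. The helper names (overview_request, pipeline_names) and the default host "localhost" are made up for this example. The response layout (a "pipelines" list of objects with "name" and "url") is taken from the example walkthrough on this page.

```python
import json
import urllib.request

# URL pattern from the usage list above; the host is filled in per call.
BASE_URL = "http://{host}:8080/smila/pipeline/"


def overview_request(host="localhost"):
    """Build (but do not send) the GET request for the pipeline overview."""
    return urllib.request.Request(BASE_URL.format(host=host), method="GET")


def pipeline_names(response_text):
    """Return the pipeline names found in an overview response body."""
    overview = json.loads(response_text)
    return [entry["name"] for entry in overview.get("pipelines", [])]
```

To actually send the request, pass the result of overview_request() to urllib.request.urlopen and hand the decoded body to pipeline_names.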

Get or Delete a Pipeline Definition

GET: Returns a JSON object containing the definition of the requested pipeline. The object consists of the name of the pipeline and the BPEL XML definition as a single string value. If the pipeline is predefined in the system configuration, the object will also contain a "readOnly": true flag. Otherwise, it will contain the timestamp of the latest add operation (see above) that created this version of the pipeline.

DELETE: Deletes the requested BPEL pipeline.

Supported operations:

  • GET: Get a BPEL pipeline definition.
  • DELETE: Delete the specified BPEL pipeline.

Usage:

  • URL: http://<hostname>:8080/smila/pipeline/<workflow-name>/
  • Allowed methods:
    • GET (no further URL parameters and no request body allowed)
    • DELETE (no further URL parameters and no request body allowed)
  • Response status codes:
    • 200 OK: Upon successful execution for GET and DELETE. If you try to delete a pipeline which does not exist, you will get 200 OK, too.
    • 404 NOT FOUND: Upon GET, if the specified BPEL pipeline does not exist.
    • 500 INTERNAL SERVER ERROR: Any other error.
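
A client usually wants to treat "pipeline not found" differently from other failures. The sketch below shows one possible way to do that in Python. The function names, the open_url parameter (added only so the function can be tried without a running server) and the assumption that the response body is UTF-8 encoded are not part of the SMILA documentation.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def pipeline_url(name, host="localhost"):
    """URL of a single pipeline resource; the name is percent-encoded defensively."""
    return "http://%s:8080/smila/pipeline/%s/" % (host, urllib.parse.quote(name, safe=""))


def get_definition(name, host="localhost", open_url=urllib.request.urlopen):
    """Return the pipeline object as a dict, or None if the server answers 404."""
    request = urllib.request.Request(pipeline_url(name, host), method="GET")
    try:
        with open_url(request) as response:
            # Assumption: the body is UTF-8 encoded JSON.
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise  # other errors, e.g. 500, are left to the caller


def delete_request(name, host="localhost"):
    """Build (but do not send) the DELETE request for a pipeline."""
    return urllib.request.Request(pipeline_url(name, host), method="DELETE")
```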

Invoke a Pipeline

Processes a record with the selected pipeline and returns the result record as a JSON object. Record attachments are supported by using multipart POST requests; see SMILA/Documentation/JettyHttpServer#Attachments for details and a code example.

Supported operations:

  • GET: Invokes a pipeline where the record to process is specified using URL parameters.
  • POST: Invokes a pipeline where the record to process is contained in the request body as JSON (recommended).

Usage:

  • URL: http://<hostname>:8080/smila/pipeline/<workflow-name>/process/
  • Allowed methods:
    • POST
    • GET
  • Response status codes:
    • 200 OK: Upon successful execution.
    • 400 BAD REQUEST: In case of empty input records or invalid JSON.
    • 405 METHOD NOT ALLOWED: In case of an invalid HTTP method (i.e. other than GET or POST).
    • 500 INTERNAL SERVER ERROR: Other errors, e.g. error during processing.
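
The recommended POST variant could be prepared from Python roughly as follows. This is only a sketch: process_request is a hypothetical helper, and the Content-Type header value is an assumption, since this page does not specify it. The early check for an empty record mirrors the 400 BAD REQUEST case listed above.

```python
import json
import urllib.request


def process_request(pipeline, record, host="localhost"):
    """Build (but do not send) a POST request that invokes a pipeline with a record."""
    if not record:
        # The server rejects empty records with 400 BAD REQUEST, so fail early.
        raise ValueError("record must not be empty")
    url = "http://%s:8080/smila/pipeline/%s/process/" % (host, pipeline)
    body = json.dumps(record).encode("utf-8")
    # Assumption: the Content-Type value is not specified on this page.
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```
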

Example Walkthrough

Note: Some tool recommendations

See SMILA/Documentation/Using_The_ReST_API for recommendations on how to use the ReST API manually.

For the sake of simplicity, the following examples were produced with Resty.


Preparation: Start SMILA and index some documents, for example by following the instructions in SMILA/Documentation for 5 Minutes to Success.

Accessing Pipelines

Let's start with the overview:

> resty http://localhost:8080
http://localhost:8080*
> GET /smila/pipeline/
{
  "pipelines" : [ {
    "name" : "AddFeedPipeline",
    "url" : "http://localhost:8080/smila/pipeline/AddFeedPipeline/"
  }, {
    "name" : "XmlSplitAndAddPipeline",
    "url" : "http://localhost:8080/smila/pipeline/XmlSplitAndAddPipeline/"
  }, {
    "name" : "DeletePipeline",
    "url" : "http://localhost:8080/smila/pipeline/DeletePipeline/"
  }, {
    "name" : "SearchPipeline",
    "url" : "http://localhost:8080/smila/pipeline/SearchPipeline/"
  }, {
    "name" : "AdaptFileCrawlerWorkerOutput",
    "url" : "http://localhost:8080/smila/pipeline/AdaptFileCrawlerWorkerOutput/"
  }, {
    "name" : "AdaptWebCrawlerWorkerOutput",
    "url" : "http://localhost:8080/smila/pipeline/AdaptWebCrawlerWorkerOutput/"
  }, {
    "name" : "AddPipeline",
    "url" : "http://localhost:8080/smila/pipeline/AddPipeline/"
  } ]
}

So we have seven pipelines deployed. Fine. Let's have a look at one of their definitions:

> GET /smila/pipeline/SearchPipeline/
{"name":"SearchPipeline","readOnly":true,"definition":"<?xml version=\"1.0\" encoding=\"utf-
8\" ?>\r\n<!--\r\n  * Copyright (c) 2009 empolis GmbH and brox IT Solutions GmbH.\r\n  * All
 rights reserved. This program and the accompanying materials\r\n  * are made available unde
r the terms of the Eclipse Public License v1.0\r\n  * which accompanies this distribution, a
nd is available at\r\n  * http://www.eclipse.org/legal/epl-v10.html\r\n  *\r\n  * Contributo
rs:\r\n  * Juergen Schumacher (empolis GmbH) - initial design\r\n-->\r\n<process name=\"Sear
chPipeline\" targetNamespace=\"http://www.eclipse.org/smila/processor\"\r\n  xmlns=\"http://
docs.oasis-open.org/wsbpel/2.0/process/executable\" xmlns:xsd=\"http://www.w3.org/2001/XMLSc
hema\"\r\n  xmlns:proc=\"http://www.eclipse.org/smila/processor\" xmlns:rec=\"http://www.ecl
ipse.org/smila/record\"\r\n  xmlns:bpel=\"http://docs.oasis-open.org/wsbpel/2.0/process/exec
utable\">\r\n\r\n  <import location=\"processor.wsdl\" namespace=\"http://www.eclipse.org/sm
ila/processor\"\r\n    importType=\"http://schemas.xmlsoap.org/wsdl/\" />\r\n\r\n  <partnerL
inks>\r\n    <partnerLink name=\"Pipeline\" partnerLinkType=\"proc:ProcessorPartnerLinkType\
" myRole=\"service\" />\r\n  </partnerLinks>\r\n\r\n  <extensions>\r\n    <extension namespa
ce=\"http://www.eclipse.org/smila/processor\" mustUnderstand=\"no\" />\r\n  </extensions>\r\
n\r\n  <variables>\r\n    <variable name=\"request\" messageType=\"proc:ProcessorMessage\" /
>\r\n  </variables>\r\n\r\n  <sequence name=\"SearchPipeline\">\r\n    <receive name=\"start
\" partnerLink=\"Pipeline\" portType=\"proc:ProcessorPortType\"\r\n      operation=\"process
\" variable=\"request\" createInstance=\"yes\" />\r\n\r\n    <extensionActivity>\r\n      <p
roc:invokePipelet name=\"invokeSolrSearchPipelet\">\r\n        <proc:pipelet class=
\"org.eclipse.smila.solr.search.SolrSearchPipelet\" />\r\n        <proc:variables input=\"re
quest\" output=\"request\" />\r\n        <proc:configuration>\r\n            <rec:Val key=\"
indexname\">DefaultCore</rec:Val>\r\n            <rec:Map key=\"_solr.query\">\r\n          
      <rec:Seq key=\"highlighting\">            \r\n                    <rec:Map>\r\n       
               <rec:Val key=\"attribute\">global.solr.params</rec:Val>\r\n                  
    <rec:Val key=\"hl\" type=\"boolean\">true</rec:Val>\r\n                      <rec:Val ke
y=\"hl.fl\">Content</rec:Val>\r\n                      <rec:Val key=\"hl.simple.pre\">&lt;b&
gt;</rec:Val>\r\n                      <rec:Val key=\"hl.simple.post\">&lt;/b&gt;</rec:Val> 
            \r\n                    </rec:Map>\r\n                </rec:Seq>                
 
    \r\n            </rec:Map>                \r\n        </proc:configuration>\r\n      </p
roc:invokePipelet>\r\n    </extensionActivity>\r\n    \r\n    <reply name=\"end\" partnerLin
k=\"Pipeline\" portType=\"proc:ProcessorPortType\" operation=\"process\"\r\n      variable=\
"request\" />\r\n  </sequence>\r\n</process>\r\n"}

Whoops, what's this? It's a bit awkward to read because newline and double-quote characters are printed in their JSON-escaped form. In a browser with a "JSONView" extension installed it looks quite readable, e.g. in Chrome:

[Screenshot: the SearchPipeline definition rendered by JSONView in a browser]

That's better. We see that the object has a readOnly flag set to true, because it is one of the predefined pipelines in the system configuration. If it were a custom pipeline defined via the API, there would be a timestamp attribute at the end of the object.
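
Without a browser extension, any JSON parser will undo the escaping as well. The following Python sketch (describe_definition is a made-up helper name) parses a pipeline object, returns the BPEL definition with real line breaks and quotes, and reports whether the pipeline is predefined or custom based on the readOnly flag and timestamp described above.

```python
import json


def describe_definition(response_text):
    """Split a pipeline object into (name, BPEL XML, origin description)."""
    # json.loads turns the escaped \r\n and \" sequences back into real characters.
    pipeline = json.loads(response_text)
    if pipeline.get("readOnly") is True:
        origin = "predefined (read-only)"
    else:
        origin = "custom, last added at %s" % pipeline.get("timestamp")
    return pipeline["name"], pipeline["definition"], origin
```

Printing the second element of the returned tuple shows the BPEL XML as ordinary multi-line text.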

Now we try to execute this pipeline:

> GET /smila/pipeline/SearchPipeline/process/
{
  "message" : "Cannot process an empty record."
}
> POST /smila/pipeline/SearchPipeline/process/
{
  "message" : "Cannot process an empty record."
}

So this did not work, as expected. In the response headers you can see the status code returned:

> POST /smila/pipeline/SearchPipeline/process/ -v
...
< HTTP/1.1 400 Bad Request
...
{
  "message" : "Cannot process an empty record."
}

So we add a query attribute:

> POST /smila/pipeline/SearchPipeline/process/ '{ "query": "SMILA" }'
...
< HTTP/1.1 500 Server Error
...
{
  "message" : "Error processing BPEL workflow SearchPipeline: Invocation of pipeline element SearchPipeline/search failed: Error processing message SearchPipeline-7afe423a-749c-4492-aa66-38ce37dba672\ncaused by: Invocation of pipeline element SearchPipeline/search failed: Error processing message SearchPipeline-7afe423a-749c-4492-aa66-38ce37dba672\ncaused by: no single value for required parameter QueryAttribute"
}

This time the search pipelet complains about a missing parameter. So let's add it:

> POST /smila/pipeline/SearchPipeline/process/ '{ 
    "query": "SMILA", 
    "QueryAttribute": "Content" }' 
{
  "query" : "SMILA",
  "QueryAttribute" : "Content",
  "_recordid" : "SearchPipeline-5c2d3f3f-1e56-4362-aa4c-74aa5fa9d6e8",
  "count" : 58,
  "indexSize" : 115,
  "records" : [ {
    "_recordid" : "feeds:<Uri=tag:search.twitter.com,2005:69739397733560320>",
    "_source" : "feeds",
    "_weight" : 0.84
  }, {
    "_recordid" : "feeds:<Uri=tag:search.twitter.com,2005:69960011966713856>",
    "_source" : "feeds",
    "_weight" : 0.73
  }, 
  ...
  ]
}

A successful search ... but we probably want to see a bit more information. So we add values for the parameter resultAttributes:

> POST /smila/pipeline/SearchPipeline/process/ '{ 
  "query": "SMILA", 
  "QueryAttribute": "Content", 
  "resultAttributes": [ "Title", "Author", "LastModifiedDate" ] }' 
{
  "query" : "SMILA",
  "QueryAttribute" : "Content",
  "resultAttributes" : [ "Title", "Author", "LastModifiedDate" ],
  "_recordid" : "SearchPipeline-2201c83a-a9b6-4cfe-a62e-6ec8ffe113ca",
  "count" : 58,
  "indexSize" : 115,
  "records" : [ {
    "_recordid" : "feeds:<Uri=tag:search.twitter.com,2005:69739397733560320>",
    "_source" : "feeds",
    "_weight" : 0.84,
    "Title" : "@AbrarAlAdwani smila 3alech. Latwaswiseen",
    "Author" : "dee_the_bee (Dalalee Boland)",
    "LastModifiedDate" : "2011-05-15T14:22:22.000+0200"
  }, {
    "_recordid" : "feeds:<Uri=tag:search.twitter.com,2005:69960011966713856>",
    "_source" : "feeds",
    "_weight" : 0.73,
    "Title" : "Finally found something to smila bout :)",
    "Author" : "MIGUELALMENDRAL (Miguel FG Almendral)",
    "LastModifiedDate" : "2011-05-16T04:59:01.000+0200"
  },
  ...
  ]
}

Awesome! In the same way you can add more parameters as you need:

> POST /smila/pipeline/SearchPipeline/process/ '{ 
    "query": "SMILA", 
    "QueryAttribute": "Content", 
    "resultAttributes": [ "Title", "Author", "LastModifiedDate" ], 
    "maxcount": 1, "offset": 42, "highlight": "Content" }' 
{
  "query" : "SMILA",
  "QueryAttribute" : "Content",
  "resultAttributes" : [ "Title", "Author", "LastModifiedDate" ],
  "maxcount" : 1,
  "offset" : 42,
  "highlight" : "Content",
  "_recordid" : "SearchPipeline-f2a53434-36b1-4e25-a716-f8c02fba5ecd",
  "count" : 58,
  "indexSize" : 115,
  "records" : [ {
    "_recordid" : "feeds:<Uri=http://www.eclipse.org/forums/index.php/mv/msg/206311/660699/#msg_660699>",
    "_source" : "feeds",
    "_weight" : 0.19,
    "Title" : "Re: New SMILA tryout",
    "Author" : "Andreas Weber",
    "_highlight" : {
      "Content" : {
        "text" : "<br />\n&#62;<br />\n&#62; I've done all standard steps as we do when creating a new<br />\n&#62; <b>SMILA</b>-workspace (and as described...-workspace (and as described on the <b>SMILA</b> website - dont know the<br... website - dont know the<br />\n&#62; exact name right now):<br />\n&#62;<br />\n&#62;<br />\n&#62; check out from eclipse SVN<br />\n&#62; add a new Target Platform with path to the <b>SMILA</b> bundles and to eclipse<br..."
      }
    }
  } ]
}
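
The maxcount and offset parameters can be combined to page through a result list. The sketch below generates one request record per page. It rests on two assumptions that the output above suggests but this page does not state explicitly: that "count" is the total number of hits, and that "offset" is zero-based. The function name page_records is hypothetical.

```python
def page_records(base_record, total, page_size):
    """Yield one query record per result page, using "maxcount" and "offset"."""
    for offset in range(0, total, page_size):
        record = dict(base_record)  # copy, so the caller's record stays untouched
        record["maxcount"] = page_size
        record["offset"] = offset
        yield record
```

Each yielded record can then be posted to /smila/pipeline/SearchPipeline/process/ in the same way as the requests shown above.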

You can also use GET and URL parameters to invoke the pipeline. Just enter something like this in the address bar of your favorite browser:

http://localhost:8080/smila/pipeline/SearchPipeline/process?query=SMILA&QueryAttribute=Content

However, this gets inconvenient when you want to add lots of parameters and attributes.
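
If you do need to build such a URL programmatically, it is safer to let a library handle the percent-encoding of parameter values. Here is a minimal Python sketch, limited to single-valued parameters; process_url is a made-up helper name, and how multi-valued attributes would be passed in the URL is not covered on this page.

```python
import urllib.parse


def process_url(pipeline, parameters, host="localhost"):
    """Build the GET URL that invokes a pipeline with single-valued parameters."""
    base = "http://%s:8080/smila/pipeline/%s/process" % (host, pipeline)
    # urlencode escapes reserved characters and keeps the given parameter order.
    return base + "?" + urllib.parse.urlencode(parameters)
```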

Defining Pipelines

To define a new pipeline (or update an existing one), you POST a JSON object containing the pipeline name and BPEL definition to /smila/pipeline/. For example, copy the output of the GET /smila/pipeline/SearchPipeline/ command and rename the pipeline to "SearchPipeline2" like this:

> POST /smila/pipeline/ '{
  "name" : "SearchPipeline2",
  "definition" : "<?xml version=\"1.0\" ?>\r\n<process name=\"SearchPipeline2\"  ..."
}'
{
  "name" : "SearchPipeline2",
  "timestamp" : "2011-08-26T13:54:34.451+0200",
  "url" : "http://localhost:8080/smila/pipeline/SearchPipeline2/"
}

If you want to push your own BPEL definition, take care to escape carriage return (\r), line feed (\n) and double-quote (\") characters, or the JSON code will not be valid.
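
Escaping by hand is error-prone, so it is usually easier to let a JSON library do it. A minimal Python sketch, with pipeline_body as a made-up helper name:

```python
import json


def pipeline_body(name, bpel_xml):
    """Serialize a pipeline name and BPEL XML into a JSON request body."""
    # json.dumps escapes carriage returns, line feeds and double quotes for us.
    return json.dumps({"name": name, "definition": bpel_xml})
```

The returned string can be used as the request body of POST /smila/pipeline/, with the BPEL XML read unchanged from a file.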

The response contains the name, a creation timestamp and a URL to read the pipeline definition again. If you request this URL, you will see that the response also contains the timestamp, but no readOnly flag:

> GET /smila/pipeline/SearchPipeline2/
{
  "name" : "SearchPipeline2",
  "definition" : "<?xml version=\"1.0\" ?>\r\n<process name=\"SearchPipeline2\" ...",
  "timestamp" : "2011-08-26T13:54:34.451+0200"
}

Updating the workflow would work just the same. The timestamp can be used in modelling tools to ensure that different users do not overwrite changes made by another user.
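
One way a modelling tool might use the timestamp is to compare the value it saw when loading the pipeline with the value currently reported by the server before it sends an update. The following Python sketch illustrates the comparison only. The function name and the overall check are assumptions for illustration, not a mechanism provided by SMILA; the timestamp format is derived from the example values on this page.

```python
from datetime import datetime

# Matches timestamps such as 2011-08-26T13:54:34.451+0200
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def changed_since(known_timestamp, current_timestamp):
    """True if the server-side timestamp is newer than the one the client last saw."""
    known = datetime.strptime(known_timestamp, TIMESTAMP_FORMAT)
    current = datetime.strptime(current_timestamp, TIMESTAMP_FORMAT)
    return current > known
```

If changed_since returns True, someone else has updated the pipeline in the meantime, and the tool could warn the user instead of overwriting the newer version.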

The new pipeline should now also appear in the list of pipelines:

> GET /smila/pipeline/
{
  "pipelines" : [
  ...,
  {
    "name" : "SearchPipeline",
    "url" : "http://localhost:8080/smila/pipeline/SearchPipeline/"
  }, {
    "name" : "SearchPipeline2",
    "url" : "http://localhost:8080/smila/pipeline/SearchPipeline2/"
  },
  ...]
}