SMILA/Documentation/JobManagerFirstExample

Latest revision as of 12:06, 26 January 2012

This is a simple walkthrough of index building using the new JSON ReST APIs and the job management.

See SMILA/Documentation/Processing/JSON REST API for BPEL pipelines on how to use the JSON ReST API.

Use a Workflow

You could create your own asynchronous workflow, but we use the "indexUpdate" workflow that is already provided with SMILA. It uses the BPEL pipelines from the standard configuration to add and delete index documents.

GET /smila/jobmanager/workflows/indexUpdate/
 
HTTP/1.x 200 OK
 
{
    "name": "indexUpdate",
    "parameters": 
    {
        "pipelineRunBulkSize": "20"
    },
    "startAction":
    {
        "worker": "bulkbuilder",
        "output":
        {
            "insertedRecords": "addBucket",
            "deletedRecords": "deleteBucket"
        }
    },
    "actions":
    [
        {
            "worker": "pipelineProcessor",
            "parameters": 
            {
                "pipelineName": "AddPipeline"
            },
            "input":
            {
                "input": "addBucket"
            }
        },
        {
            "worker": "pipelineProcessor",
            "parameters": 
            {
                "pipelineName": "DeletePipeline"
            },
            "input":
            {
                "input": "deleteBucket"
            }
        }
    ]
}
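
To make the routing in this definition easier to see, here is a minimal Python sketch that reads a trimmed copy of the workflow JSON above and reports which pipeline consumes each output slot of the start action. The helper name `bucket_routes` is hypothetical and not part of SMILA; the sketch only inspects the JSON structure shown in the response.

```python
import json

# Trimmed copy of the "indexUpdate" workflow definition shown above.
WORKFLOW_JSON = """
{
  "name": "indexUpdate",
  "startAction": {
    "worker": "bulkbuilder",
    "output": {"insertedRecords": "addBucket", "deletedRecords": "deleteBucket"}
  },
  "actions": [
    {"worker": "pipelineProcessor",
     "parameters": {"pipelineName": "AddPipeline"},
     "input": {"input": "addBucket"}},
    {"worker": "pipelineProcessor",
     "parameters": {"pipelineName": "DeletePipeline"},
     "input": {"input": "deleteBucket"}}
  ]
}
"""


def bucket_routes(workflow):
    """Map each output slot of the start action to the pipeline reading its bucket.

    Hypothetical helper for illustration; it is not a SMILA API.
    """
    # Collect bucket name -> pipeline name from the follow-up actions.
    consumers = {}
    for action in workflow.get("actions", []):
        pipeline = action.get("parameters", {}).get("pipelineName")
        for bucket in action.get("input", {}).values():
            consumers[bucket] = pipeline
    # Resolve each output slot of the start action through its bucket.
    outputs = workflow["startAction"]["output"]
    return {slot: consumers.get(bucket) for slot, bucket in outputs.items()}


routes = bucket_routes(json.loads(WORKFLOW_JSON))
print(routes)
```

Inserted records end up in AddPipeline and deleted records in DeletePipeline, which is why one job can handle both the add and the delete requests later in this walkthrough.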

Create a Job

Now we have to create a job that uses this workflow:

POST /smila/jobmanager/jobs/
{
    "name": "exampleIndexUpdate",
    "workflow": "indexUpdate",
    "parameters": 
    {
        "tempStore": "tempStore"
    }
}

You get a response:

{
    "name": "exampleIndexUpdate",
    "timestamp": "2011-08-15T16:20:34.337+0200",
    "url": "http://localhost:8080/smila/jobmanager/jobs/exampleIndexUpdate/"
}
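
The job creation request above could be prepared from Python roughly as follows. This is a sketch under assumptions: `BASE_URL` is taken from the host and port in the sample responses, the `Content-Type` header is an assumption that the walkthrough does not state, and `create_job_request` is a hypothetical helper name. The request is only built here; the commented lines show how it might be sent to a running instance.

```python
import json
import urllib.request

# Assumption: SMILA runs on the host and port seen in the sample responses.
BASE_URL = "http://localhost:8080"


def create_job_request(name, workflow, parameters):
    """Build (but do not send) the POST request that defines a job."""
    body = json.dumps(
        {"name": name, "workflow": workflow, "parameters": parameters}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/smila/jobmanager/jobs/",
        data=body,
        # Assumption: the server accepts a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = create_job_request(
    "exampleIndexUpdate", "indexUpdate", {"tempStore": "tempStore"}
)
print(request.get_method(), request.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```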

Start a Job Run

Now this job has to be started:

POST /smila/jobmanager/jobs/exampleIndexUpdate/

The response is:

{
    "jobId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/"
}

We will need the URL from this response later to finish the job run.
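
Since the URL from the start response is needed again later, it can help to keep it in a variable. The sketch below parses the sample response shown above; `parse_job_run` is a hypothetical helper name, and the response text is copied from this page rather than fetched from a server.

```python
import json
from urllib.parse import urlsplit

# Sample response of the start request, copied from above.
START_RESPONSE = """
{
    "jobId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/"
}
"""


def parse_job_run(response_text):
    """Return the job run id and the path of the job run resource."""
    data = json.loads(response_text)
    # Keep only the path so it can be combined with any host name.
    return data["jobId"], urlsplit(data["url"]).path


job_id, run_path = parse_job_run(START_RESPONSE)
print(job_id)
print(run_path)
```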

Add a Document

POST /smila/job/exampleIndexUpdate/record/
{
  "_recordid": "test.html",
  "_source": "handcrafted",
  "Title": "Hello Job World!",
  "Content": "This is the first document added to an SMILA index using the new job management",
  "MimeType": "text/plain",
  "Size": 42
}

Flush the bulk:

POST /smila/job/exampleIndexUpdate/record/

For both requests the response should be similar to:

{
    "workflowRunId": "1",
    "jobRunId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/workflowrun/1/"
}

After a while (about a minute), the document can be found on the sample search site at http://localhost:8080/SMILA/search. Hint: search for "first".
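
The two requests of this step, adding the record and flushing the bulk, differ only in whether a body is sent. A minimal Python sketch follows, assuming the base URL from the sample responses and a JSON `Content-Type` header (the walkthrough does not state which headers are needed). The function names are hypothetical, and the requests are only built, not sent.

```python
import json
import urllib.request

# Assumption: SMILA runs on the host and port seen in the sample responses.
BASE_URL = "http://localhost:8080"


def record_url(job_name):
    """URL of the record resource of a job, as used in this walkthrough."""
    return f"{BASE_URL}/smila/job/{job_name}/record/"


def add_record_request(job_name, record):
    """POST with a JSON body: submits one record to the job."""
    return urllib.request.Request(
        record_url(job_name),
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption
        method="POST",
    )


def flush_request(job_name):
    """POST without a body: flushes the current bulk."""
    return urllib.request.Request(record_url(job_name), method="POST")


record = {
    "_recordid": "test.html",
    "_source": "handcrafted",
    "Title": "Hello Job World!",
    "MimeType": "text/plain",
    "Size": 42,
}
add = add_record_request("exampleIndexUpdate", record)
flush = flush_request("exampleIndexUpdate")
print(add.get_method(), add.full_url, len(add.data), "bytes")
print(flush.get_method(), flush.full_url, "no body")
```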

Delete a Document

DELETE /smila/job/exampleIndexUpdate/record/?_recordid=test.html

Flush the bulk:

POST /smila/job/exampleIndexUpdate/record/

Again, you get a response like this for both requests:

{
    "workflowRunId": "2",
    "jobRunId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/workflowrun/2/"
}

After a while (about a minute), the search should not return any results anymore.
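
Record ids that contain spaces or other reserved characters need to be encoded before they go into the query string. The sketch below builds the DELETE request with percent-encoding; the base URL is assumed from the sample responses, `delete_record_request` is a hypothetical helper name, and how the server treats unusual ids is not covered by this walkthrough.

```python
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: SMILA runs on the host and port seen in the sample responses.
BASE_URL = "http://localhost:8080"


def delete_record_request(job_name, record_id):
    """Build (but do not send) the DELETE request for one record id."""
    # quote_via=quote encodes a space as %20 instead of "+".
    query = urlencode({"_recordid": record_id}, quote_via=quote)
    url = f"{BASE_URL}/smila/job/{job_name}/record/?{query}"
    return urllib.request.Request(url, method="DELETE")


delete = delete_record_request("exampleIndexUpdate", "test.html")
print(delete.get_method(), delete.full_url)
```

As with adding a record, the deletion is only processed after the bulk has been flushed with an empty POST to the same record URL.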

Finish the Job Run

Look up the URL from the response of the start-job request and add "finish" to get the path for this POST request:

POST /smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/finish/

The response will be empty, but you should get a response code of 202.
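
Deriving the finish request from the job run URL can be sketched as follows. The function names are hypothetical; the only facts taken from this page are the "finish" path segment and the expected status code 202.

```python
import urllib.request

# Job run URL as returned by the start request earlier in this walkthrough.
JOB_RUN_URL = (
    "http://localhost:8080/smila/jobmanager/jobs/"
    "exampleIndexUpdate/20110815-162046851752/"
)


def finish_request(job_run_url):
    """Build (but do not send) the POST request that finishes a job run."""
    # Normalize the trailing slash so "finish/" is appended exactly once.
    return urllib.request.Request(
        job_run_url.rstrip("/") + "/finish/", method="POST"
    )


def finish_accepted(status_code):
    """The walkthrough expects an empty response with status code 202."""
    return status_code == 202


finish = finish_request(JOB_RUN_URL)
print(finish.get_method(), finish.full_url)

# Against a running instance, the status could be checked like this:
# with urllib.request.urlopen(finish) as response:
#     print(finish_accepted(response.status))
```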

Finally you can request statistics about this job run:

GET /smila/jobmanager/jobs/exampleIndexUpdate/20110815-162046851752/

and get:

{
    "endTime": "2011-08-15T16:52:18.726+0200",
    "finishTime": "2011-08-15T16:52:18.714+0200",
    "jobId": "20110815-162046851752",
    "mode": "STANDARD",
    "startTime": "2011-08-15T16:20:46.920+0200",
    "state": "SUCCEEDED",
    "workflowRuns": {
        "activeWorkflowRunCount": 0,
        "canceledWorkflowRunCount": 0,
        "failedWorkflowRunCount": 0,
        "startedWorkflowRunCount": 2,
        "successfulWorkflowRunCount": 2
    },
    "tasks": {
        "canceledTaskCount": 0,
        "createdTaskCount": 4,
        "failedAfterRetryTaskCount": 0,
        "failedWithoutRetryTaskCount": 0,
        "obsoleteTaskCount": 0,
        "retriedAfterErrorTaskCount": 0,
        "retriedAfterTimeoutTaskCount": 0,
        "successfulTaskCount": 4
    },
    "worker": { ... },
    "jobDefinition": { ... }
}
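
A script that drives such a job run may want to verify the statistics instead of reading them by eye. The following sketch checks a trimmed copy of the response above; `summarize_job_run` is a hypothetical helper, and the checks (state, workflow run counts, task counts, run duration) are one possible reading of these fields, not an official success criterion.

```python
import json
from datetime import datetime

# Trimmed copy of the statistics response shown above.
STATS_JSON = """
{
    "endTime": "2011-08-15T16:52:18.726+0200",
    "startTime": "2011-08-15T16:20:46.920+0200",
    "state": "SUCCEEDED",
    "workflowRuns": {"startedWorkflowRunCount": 2, "successfulWorkflowRunCount": 2},
    "tasks": {"createdTaskCount": 4, "successfulTaskCount": 4}
}
"""

# Timestamp layout used in the responses on this page.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def summarize_job_run(stats):
    """Condense job run statistics into a few pass/fail style values."""
    runs = stats["workflowRuns"]
    tasks = stats["tasks"]
    start = datetime.strptime(stats["startTime"], TIME_FORMAT)
    end = datetime.strptime(stats["endTime"], TIME_FORMAT)
    return {
        "succeeded": stats["state"] == "SUCCEEDED",
        "all_workflow_runs_successful":
            runs["successfulWorkflowRunCount"] == runs["startedWorkflowRunCount"],
        "all_tasks_successful":
            tasks["successfulTaskCount"] == tasks["createdTaskCount"],
        "duration_seconds": (end - start).total_seconds(),
    }


summary = summarize_job_run(json.loads(STATS_JSON))
print(summary)
```

In this run, the two workflow runs correspond to the two flushed bulks (one add, one delete), and each of them produced two tasks.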
