SMILA/Documentation/JobManagerFirstExample

Revision as of 11:16, 15 August 2011

This is a simple walkthrough of index building using the new JSON ReST API and the job management.

Note: See SMILA/Documentation/Processing/JSON_REST_API for information on how to use the JSON ReST API.


Create a workflow

First you have to define an asynchronous workflow. This one uses the BPEL pipelines from the standard configuration to add and delete index documents.

POST /smila/jobmanager/workflows/
{
    "name": "indexUpdate",
    "parameters": 
    {
        "numberOfParallelRecords": "20"
    },
    "startAction":
    {
        "worker": "bulkbuilder",
        "output":
        {
            "insertedRecords": "addBucket",
            "deletedRecords": "deleteBucket"
        }
    },
    "actions":
    [
        {
            "worker": "pipelineProcessingWorker",
            "parameters": 
            {
                "pipelineName": "AddPipeline"
            },
            "input":
            {
                "input": "addBucket"
            }
        },
        {
            "worker": "pipelineProcessingWorker",
            "parameters": 
            {
                "pipelineName": "DeletePipeline"
            },
            "input":
            {
                "input": "deleteBucket"
            }
        }
    ]
}

The response will be something like:

{
    "name": "indexUpdate",
    "timestamp": "2011-08-15T16:18:27.269+0200",
    "url": "http://localhost:8080/smila/jobmanager/workflows/indexUpdate/"
}
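
In the workflow definition above, the bucket names in the start action's "output" appear to be what links it to the "input" of the following actions, so a misspelled bucket name would leave an action without data. The following Python sketch checks a definition for input buckets that no output slot produces. The helper name unmatched_input_buckets is hypothetical and not part of SMILA, and the dictionary repeats only the bucket-related parts of the definition.

```python
def unmatched_input_buckets(workflow):
    """Return input bucket names that no output slot produces (hypothetical helper)."""
    produced = set(workflow.get("startAction", {}).get("output", {}).values())
    consumed = set()
    for action in workflow.get("actions", []):
        produced.update(action.get("output", {}).values())
        consumed.update(action.get("input", {}).values())
    return sorted(consumed - produced)


# Only the parts of the definition above that name buckets.
workflow = {
    "startAction": {
        "worker": "bulkbuilder",
        "output": {"insertedRecords": "addBucket", "deletedRecords": "deleteBucket"},
    },
    "actions": [
        {"worker": "pipelineProcessingWorker", "input": {"input": "addBucket"}},
        {"worker": "pipelineProcessingWorker", "input": {"input": "deleteBucket"}},
    ],
}

print(unmatched_input_buckets(workflow))  # prints []
```

If the start action wrote to "deletesBucket" while the second action read from "deleteBucket", the helper would report "deleteBucket" as unmatched.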

Create a Job

Now we have to create a job that uses this workflow:

POST /smila/jobmanager/jobs/
{
    "name": "indexUpdate",
    "workflow": "indexUpdate",
    "parameters": 
    {
        "tempStore": "tempStore"
    }
}

You get a similar response:

{
    "name": "indexUpdate",
    "timestamp": "2011-08-15T16:20:34.337+0200",
    "url": "http://localhost:8080/smila/jobmanager/jobs/indexUpdate/"
}
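
The POST requests in this walkthrough can be sent with any HTTP client. As a minimal sketch, the Python code below builds the job-creation request with the standard library, without sending it. The base URL is taken from the sample responses, the Content-Type header is an assumption about what the server expects, and build_json_post is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: host and port as they appear in the sample responses.
BASE_URL = "http://localhost:8080"


def build_json_post(path, body=None):
    """Build (but do not send) a POST request with an optional JSON body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(BASE_URL + path, data=data, method="POST")
    if data is not None:
        # Assumption: the server accepts JSON bodies declared as application/json.
        request.add_header("Content-Type", "application/json")
    return request


job_request = build_json_post(
    "/smila/jobmanager/jobs/",
    {
        "name": "indexUpdate",
        "workflow": "indexUpdate",
        "parameters": {"tempStore": "tempStore"},
    },
)

# To send it against a running SMILA instance:
# with urllib.request.urlopen(job_request) as response:
#     print(json.load(response))
```

Calling build_json_post without a body yields a POST with no payload, which matches the shape of the requests in this walkthrough that carry no JSON body.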

Start a Job Run

Now this job has to be started:

POST /smila/jobmanager/jobs/indexUpdate/

The response is:

{
    "jobId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/indexUpdate/20110815-162046851752/"
}

We will need the URL from this response later to finish the job run.

Add a document

POST /smila/job/indexUpdate/record/
{
  "_recordid": "test.html",
  "_source": "handcrafted",
  "Title": "Hello Job World!",
  "Content": "This is the first document added to an SMILA index using the new job management",
  "MimeType": "text/plain",
  "Size": 42
}

and flush the bulk:

POST /smila/job/indexUpdate/record/

For both requests the response should be similar to:

{
    "workflowRunId": "1",
    "jobRunId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/indexUpdate/20110815-162046851752/workflowrun/1/"
}
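
In the sample responses, the "jobRunId" returned for a pushed record equals the "jobId" returned when the job run was started. A client could use this to confirm that a record went into the expected run. The Python sketch below shows such a check; belongs_to_run is a hypothetical helper, and the dictionaries repeat only the relevant fields of the sample responses.

```python
def belongs_to_run(push_response, start_response):
    """Check that a record push was assigned to the job run we started."""
    return push_response.get("jobRunId") == start_response.get("jobId")


# Relevant fields of the start-job response shown earlier.
start_response = {"jobId": "20110815-162046851752"}

# Relevant fields of the response to the record push.
push_response = {"workflowRunId": "1", "jobRunId": "20110815-162046851752"}

print(belongs_to_run(push_response, start_response))  # prints True
```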

After a short while the document can be found in the sample search site http://localhost:8080/SMILA/search. Hint: search for "first".

Delete a document

DELETE /smila/job/indexUpdate/record/?_recordid=test.html

and flush the bulk:

POST /smila/job/indexUpdate/record/

Again, you get a response like this for both requests:

{
    "workflowRunId": "2",
    "jobRunId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/indexUpdate/20110815-162046851752/workflowrun/2/"
}
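
The DELETE request passes the record ID as a query parameter, so IDs containing spaces, ampersands or slashes need to be percent-encoded. A minimal Python sketch of building the path follows; delete_record_path is a hypothetical helper, and how the server decodes unusual IDs is an assumption that should be verified against a running instance.

```python
from urllib.parse import quote, urlencode


def delete_record_path(job_name, record_id):
    """Build the DELETE path for one record, percent-encoding the record ID."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode({"_recordid": record_id}, quote_via=quote)
    return "/smila/job/{}/record/?{}".format(quote(job_name, safe=""), query)


print(delete_record_path("indexUpdate", "test.html"))
# prints /smila/job/indexUpdate/record/?_recordid=test.html
```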

After a short while, the search should not return any results anymore.

Finish the Job Run

Look up the URL from the response of the start-job request and add "finish" to get the path for this POST request:

POST /smila/jobmanager/jobs/indexUpdate/20110815-162046851752/finish/

The response will be empty, but you should get a response code of 202.
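
Deriving the finish path from the start-job response can be scripted. The Python sketch below takes the "url" field of that response, keeps only its path, and appends "finish/". The helper name finish_path is hypothetical.

```python
from urllib.parse import urlsplit


def finish_path(start_response):
    """Derive the finish path from a start-job response (hypothetical helper)."""
    path = urlsplit(start_response["url"]).path
    if not path.endswith("/"):
        path += "/"
    return path + "finish/"


# Start-job response as shown in "Start a Job Run".
start_response = {
    "jobId": "20110815-162046851752",
    "url": "http://localhost:8080/smila/jobmanager/jobs/indexUpdate/20110815-162046851752/",
}

print(finish_path(start_response))
# prints /smila/jobmanager/jobs/indexUpdate/20110815-162046851752/finish/
```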

Finally you can get statistics about this job run:

GET /smila/jobmanager/jobs/indexUpdate/20110815-162046851752/

and get:

{
    "endTime": "2011-08-15T16:52:18.726+0200",
    "finishTime": "2011-08-15T16:52:18.714+0200",
    "jobId": "20110815-162046851752",
    "runMode": "STANDARD",
    "startTime": "2011-08-15T16:20:46.920+0200",
    "state": "SUCCEEDED",
    "workflowRuns": {
        "activeWorkflowRunCount": 0,
        "canceledWorkflowRunCount": 0,
        "failedWorkflowRunCount": 0,
        "startedWorkflowRunCount": 2,
        "successfulWorkflowRunCount": 2
    },
    "tasks": {
        "canceledTaskCount": 0,
        "createdTaskCount": 4,
        "failedAfterRetryTaskCount": 0,
        "failedWithoutRetryTaskCount": 0,
        "obsoleteTaskCount": 0,
        "retriedAfterErrorTaskCount": 0,
        "retriedAfterTimeoutTaskCount": 0,
        "successfulTaskCount": 4
    },
    "worker": { ... },
    "jobDefinition": { ... }
}
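
A script can evaluate these statistics to decide whether the run went through cleanly. The Python sketch below shows one plausible check: the state is "SUCCEEDED" and none of the failure counters is above zero. Which counters matter is an assumption made for this example, run_succeeded is a hypothetical helper, and the dictionary repeats only the fields used.

```python
def run_succeeded(stats):
    """Return True if the run succeeded and reported no failed workflow runs or tasks."""
    workflow_runs = stats.get("workflowRuns", {})
    tasks = stats.get("tasks", {})
    failures = (
        workflow_runs.get("failedWorkflowRunCount", 0)
        + tasks.get("failedAfterRetryTaskCount", 0)
        + tasks.get("failedWithoutRetryTaskCount", 0)
    )
    return stats.get("state") == "SUCCEEDED" and failures == 0


# Fields taken from the sample statistics above.
stats = {
    "state": "SUCCEEDED",
    "workflowRuns": {"failedWorkflowRunCount": 0, "successfulWorkflowRunCount": 2},
    "tasks": {
        "createdTaskCount": 4,
        "failedAfterRetryTaskCount": 0,
        "failedWithoutRetryTaskCount": 0,
        "successfulTaskCount": 4,
    },
}

print(run_succeeded(stats))  # prints True
```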