Difference between revisions of "SMILA/Documentation/HowTo/How to access the REST API with the RestClient"

From Eclipsepedia

Revision as of 07:31, 8 February 2012

SMILA provides an extensive REST API to control SMILA, check the status, import or search data, attach workers to its job control etc.

This HowTo describes how to utilize the included RestClient to access SMILA's REST API from within a Java application.


Preconditions

For the sake of simplicity, we assume that you have checked out the complete SMILA development environment, although checking out just the relevant bundles would be sufficient to access SMILA. The latter would be the case, to give some examples, for an asynchronous SMILA worker running in a JRE other than SMILA's, or for a testing application accessing SMILA via the REST API.

Basics

Interfaces and default implementations

The RestClient interface encapsulates the REST access to SMILA. It provides methods for GET, POST, PUT and DELETE calls to the REST API and represents data using SMILA's Any interface and attachments using the Attachments interface. The latter allows working with binary data in SMILA.

The package org.eclipse.smila.http.client.impl provides a default implementation for the RestClient named DefaultRestClient.

There are two helper classes providing the resources as described in REST API Reference:

  • ResourceHelper for all resources beginning with /smila, except for those that are marked as deprecated in the REST API Reference.
  • TaskManagerClientHelper to provide workers that are not directly driven by the WorkerManager with resources for task handling (internal TaskManager REST API, i.e. the resources beginning with /taskmanager).
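
To give a rough idea of what the RestClient and the helper classes take off your hands, the following sketch builds comparable requests with the JDK's own java.net.http API (Java 11 or later) without sending them. It is an illustration only, not SMILA code: the class RawRequestSketch and its methods are hypothetical, the host and port are assumed to be localhost:8080, the jobs path is taken from the URL used at the end of this HowTo, and the per-job path (job name appended to the jobs path) is an assumption.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical stand-in for ResourceHelper plus raw HTTP; not part of SMILA. */
final class RawRequestSketch {
  // assumption: SMILA is reachable on localhost:8080
  static final String BASE_URL = "http://localhost:8080";

  /** Path of the job definitions resource, as used later in this HowTo. */
  static String jobsResource() {
    return "/smila/jobmanager/jobs/";
  }

  /** Assumed path of a single job: the job name appended to the jobs resource. */
  static String jobResource(final String jobName) {
    return jobsResource() + jobName + "/";
  }

  /** Builds (but does not send) a GET request for the given resource. */
  static HttpRequest get(final String resource) {
    return HttpRequest.newBuilder(URI.create(BASE_URL + resource)).GET().build();
  }

  /** Builds (but does not send) a POST request without a body. */
  static HttpRequest post(final String resource) {
    return HttpRequest.newBuilder(URI.create(BASE_URL + resource))
      .POST(HttpRequest.BodyPublishers.noBody())
      .build();
  }

  public static void main(final String[] args) {
    final HttpRequest request = post(jobResource("indexUpdate"));
    System.out.println(request.method() + " " + request.uri());
  }
}
```

With the RestClient, the URL handling, serialization and response parsing that would surround such requests are handled for you.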

Accessing SMILA

To access SMILA via its REST interface, instantiate the RestClient:

RestClient restClient = new DefaultRestClient();

The following code snippet creates a job definition, sends it to the JobManager and starts it if posting was successful:

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "crawlCData";
 
// create job description as an AnyMap
final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
jobDescription.put("name", jobName);
jobDescription.put("workflow", "fileCrawling");
final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
parameters.put("tempStore", "temp");
parameters.put("jobToPushTo", "importJob");
parameters.put("dataSource", "file_data");
parameters.put("rootFolder", "c:/data");
jobDescription.put("parameters", parameters);
 
 
// the ResourceHelper provides us with the resource of the jobs API;
// we send the (AnyMap) job description in the POST body
restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
// POST (here without a body) to start the Job,
// the ResourceHelper provides the resource to the named job
restClient.post(resourceHelper.getJobResource(jobName));
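
To visualize the job description that is POSTed, the following sketch rebuilds the same structure with plain Java maps and prints it as JSON. It assumes that the description travels as JSON in the request body (Jackson is among the JARs listed further below); the class JobDescriptionJsonSketch and its minimal JSON writer are hypothetical and handle only strings and nested maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: plain maps standing in for AnyMap, with a minimal JSON writer. */
final class JobDescriptionJsonSketch {

  /** Builds the same structure as the job description above, using plain maps. */
  static Map<String, Object> jobDescription(final String jobName) {
    final Map<String, Object> parameters = new LinkedHashMap<>();
    parameters.put("tempStore", "temp");
    parameters.put("jobToPushTo", "importJob");
    parameters.put("dataSource", "file_data");
    parameters.put("rootFolder", "c:/data");

    final Map<String, Object> job = new LinkedHashMap<>();
    job.put("name", jobName);
    job.put("workflow", "fileCrawling");
    job.put("parameters", parameters);
    return job;
  }

  /** Serializes strings and nested maps only, which is enough for this example. */
  static String toJson(final Object value) {
    if (value instanceof Map) {
      final StringBuilder json = new StringBuilder("{");
      boolean first = true;
      for (final Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
        if (!first) {
          json.append(',');
        }
        first = false;
        json.append(quote(String.valueOf(entry.getKey())))
          .append(':')
          .append(toJson(entry.getValue()));
      }
      return json.append('}').toString();
    }
    return quote(String.valueOf(value));
  }

  /** Wraps the text in double quotes, escaping backslashes and quotes. */
  static String quote(final String text) {
    return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  public static void main(final String[] args) {
    System.out.println(toJson(jobDescription("crawlCData")));
  }
}
```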

The following snippet checks whether the job with the given name is already running. If it is not, the job is started. A record with an attachment is then sent to it.

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "indexUpdate";
 
// check for a current run of this job
final AnyMap currentJobRun =
restClient.get(resourceHelper.getJobResource(jobName)).getMap("runs").getMap("current");
if (currentJobRun != null && !currentJobRun.isEmpty()) {
  // a current run exists, so we don't need to start one but it may not be running.
  if (!"RUNNING".equalsIgnoreCase(currentJobRun.getStringValue("state"))) {
    // well it's just an example...
    throw new IllegalStateException("Job '" + jobName + "' is not running but has status '"
      + currentJobRun.getStringValue("state") + "'.");
  }
} else {
  // no current job run, start another one.
  restClient.post(resourceHelper.getJobResource(jobName));
}
 
// create attachment with a file's content
final File file = new File("c:/data/notice.html");
final Attachments attachments = new AttachmentWrapper("file", file);
// put some sample metadata
final AnyMap metadata = DataFactory.DEFAULT.createAnyMap();
metadata.put("_recordid", "1");
metadata.put("fileName", file.getCanonicalPath());
// now post metadata with an attachment from a file.
// if we had a Record with attachments, we could POST that one...
// note: we could add more than one attachment using the AttachmentWrapper.
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), metadata, attachments);
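
The branching on the current job run can be tried out without a running SMILA instance. The following sketch isolates that decision in a small method, with a plain Map standing in for the AnyMap of the current run. The class JobRunCheckSketch and the method needsStart are hypothetical names introduced for this illustration.

```java
import java.util.Map;

/** Illustration only: the start-or-not decision from the snippet above, on plain maps. */
final class JobRunCheckSketch {

  /**
   * Returns true if no current run exists and one must be started,
   * false if the current run is RUNNING.
   * Throws IllegalStateException if a current run exists in any other state.
   */
  static boolean needsStart(final String jobName, final Map<String, String> currentRun) {
    if (currentRun == null || currentRun.isEmpty()) {
      return true;
    }
    final String state = currentRun.get("state");
    if (!"RUNNING".equalsIgnoreCase(state)) {
      throw new IllegalStateException(
        "Job '" + jobName + "' is not running but has status '" + state + "'.");
    }
    return false;
  }

  public static void main(final String[] args) {
    System.out.println(needsStart("indexUpdate", Map.of()));
    System.out.println(needsStart("indexUpdate", Map.of("state", "RUNNING")));
  }
}
```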

Using Attachments with the RestClient

As seen above, the RestClient bundle provides an Attachments interface allowing attachments to be POSTed. An attachment consists of a string key and binary data that will be POSTed as application/octet-stream in a multi-part message.
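
To make the multi-part structure more concrete, the following sketch assembles a generic MIME multipart body by hand, with one application/octet-stream part per attachment key. It follows the general multipart layout (boundary lines, part headers, blank line, data) and is not a description of the exact bytes the RestClient sends: the class MultipartSketch, the Content-Disposition header and the boundary value are assumptions made for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: a hand-built multipart body with one binary part per attachment. */
final class MultipartSketch {
  private static final String CRLF = "\r\n";

  /** Writes each attachment as its own part and closes the body with the final boundary. */
  static byte[] build(final String boundary, final Map<String, byte[]> attachments) {
    final ByteArrayOutputStream body = new ByteArrayOutputStream();
    for (final Map.Entry<String, byte[]> attachment : attachments.entrySet()) {
      write(body, "--" + boundary + CRLF);
      // assumption: the attachment key is carried in a Content-Disposition header
      write(body, "Content-Disposition: attachment; name=\"" + attachment.getKey() + "\"" + CRLF);
      write(body, "Content-Type: application/octet-stream" + CRLF + CRLF);
      body.write(attachment.getValue(), 0, attachment.getValue().length);
      write(body, CRLF);
    }
    write(body, "--" + boundary + "--" + CRLF);
    return body.toByteArray();
  }

  private static void write(final ByteArrayOutputStream out, final String text) {
    final byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
    out.write(bytes, 0, bytes.length);
  }

  public static void main(final String[] args) {
    final Map<String, byte[]> attachments = new LinkedHashMap<>();
    attachments.put("file", "<html></html>".getBytes(StandardCharsets.US_ASCII));
    System.out.print(new String(build("sketch-boundary", attachments), StandardCharsets.US_ASCII));
  }
}
```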

Handling attachments manually

To handle attachments manually, use the AttachmentWrapper. It accepts attachments from the following sources:

  • a byte[]
  • a String
  • a File
  • an InputStream

There are convenience constructors that take an attachment when constructing an AttachmentWrapper, but you can add more than one attachment and mix the types.

Example:

final RestClient restClient = new DefaultRestClient();
byte[] byteAttachment = new byte[1000];
String stringAttachment = "string attachment";
File fileAttachment = new File("c:/data/notice.html");
InputStream inputStreamAttachment = new FileInputStream(fileAttachment);
 
AttachmentWrapper attachments = new AttachmentWrapper("byte-data", byteAttachment);
attachments.add("string-data", stringAttachment);
attachments.add("file-data", fileAttachment);
attachments.add("stream-data", inputStreamAttachment);
 
// 'resource' and 'parameters' are assumed to be defined elsewhere, e.g. as in the snippets above
restClient.post(resource, parameters, attachments);
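
Whatever the source type, an attachment ends up as a key with binary data. The following sketch shows one way such sources could be reduced to a byte array. It does not describe how the AttachmentWrapper works internally: the class AttachmentBytesSketch, the method toBytes and the choice of UTF-8 for strings are assumptions made for this illustration (InputStream.readAllBytes requires Java 9 or later).

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Illustration only: reduces the four source types listed above to binary data. */
final class AttachmentBytesSketch {

  /** Converts a byte[], String, File or InputStream into a byte array. */
  static byte[] toBytes(final Object source) throws IOException {
    if (source instanceof byte[]) {
      return (byte[]) source;
    }
    if (source instanceof String) {
      // assumption: strings are encoded as UTF-8
      return ((String) source).getBytes(StandardCharsets.UTF_8);
    }
    if (source instanceof File) {
      return Files.readAllBytes(((File) source).toPath());
    }
    if (source instanceof InputStream) {
      return ((InputStream) source).readAllBytes();
    }
    throw new IllegalArgumentException("Unsupported attachment source: " + source);
  }

  public static void main(final String[] args) throws IOException {
    System.out.println(toBytes("string attachment").length);
  }
}
```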

Handling attachments with records

SMILA records can also include attachments, and since SMILA's target data units are records, it is natural that the RestClient also supports records (with attachments) directly.

That means the record's metadata will be sent together with the record's attachments as parts of a multi-part message.

Example:

final byte[] data1 = ...;
final byte[] data2 = ...;
record.setAttachment("data1", data1);
record.setAttachment("data2", data2);
 
// POST the record with the attachments
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), record);

Using the RestClient without the complete development environment

This section describes the steps to follow when using the RestClient from a Java application outside SMILA's JRE.

  1. Build or download the SMILA distribution.
  2. Create a new workspace.
  3. Create a Java project of your choice.
  4. Add the following JARs from your downloaded/built SMILA application to the Java Build Path of your new project (exact version numbers are omitted in this list and replaced by *; just use the latest version you find in your SMILA application):
    • from the plugins directory:
      • org.apache.commons.collections_*.jar
      • org.apache.commons.io_*.jar
      • org.apache.commons.lang_*.jar
      • org.apache.httpcomponents.httpclient_*.jar (>=4.1)
      • org.apache.httpcomponents.httpcore_*.jar (>=4.1)
      • org.apache.log4j_*.jar
      • org.codehaus.jackson.core_*.jar
      • org.eclipse.smila.datamodel_*.jar
      • org.eclipse.smila.http.client_*.jar
      • org.eclipse.smila.ipc_*.jar
      • org.eclipse.smila.utils_*.jar
    • from the plugins/org.apache.commons.logging_*/lib directory
      • commons-logging-*.jar

Now you have all means to access SMILA's REST API from another Java application.

E.g. you could now write a simple program that creates and starts up a crawl job and the indexUpdate-job:

Note: To keep the example simple, it does no real exception handling (it only prints stack traces) and does not check whether a job exists or is already running. Feel free to add these niceties to your own code as an exercise.

public class CrawlMyData {
 
	public static void main(String[] args) {
		final RestClient restClient = new DefaultRestClient();
		final ResourceHelper resourceHelper = new ResourceHelper();
		final String jobName = "crawlCData";
 
		// create job description as an AnyMap
		final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
		jobDescription.put("name", jobName);
		jobDescription.put("workflow", "fileCrawling");
		final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
		parameters.put("tempStore", "temp");
		parameters.put("jobToPushTo", "indexUpdate");
		parameters.put("dataSource", "file_data");
		parameters.put("rootFolder", "c:/data");
		jobDescription.put("parameters", parameters);
 
		try {
			// start the referenced job "indexUpdate", which indexes the data we send.
			// We should check whether it is already running, etc.
			restClient.post(resourceHelper.getJobResource("indexUpdate"));
		} catch (RestException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
 
		try {
			// create (or update) the job; we should check whether it exists or is running, etc.
			restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
			// POST with no body to start the Job in default mode
			restClient.post(resourceHelper.getJobResource(jobName));
		} catch (RestException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
 
	}
}

Putting it together

Now start your SMILA application and when it's up, run the application from above and watch the jobs using your preferred REST client (e.g. browser plugin, see Interactive REST tools) at http://localhost:8080/smila/jobmanager/jobs/.

You should see:

  • the newly created job crawlCData,
  • the job indexUpdate is RUNNING,
  • the job crawlCData is FINISHING (or has already finished, depending on the amount of data in your crawl directory).
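
Instead of watching the states manually, a client could poll for them. The following sketch shows a generic polling loop under stated assumptions: the class JobStatePollerSketch and the method waitForState are hypothetical, and the state is delivered by a Supplier, which in a real client might wrap a RestClient GET on the job resource that reads the state of the current run as in the earlier snippet.

```java
import java.util.function.Supplier;

/** Illustration only: polls a state supplier until an expected state shows up. */
final class JobStatePollerSketch {

  /**
   * Asks the supplier up to maxAttempts times, pausing after each miss.
   * Returns true as soon as the expected state is seen, false otherwise.
   */
  static boolean waitForState(final Supplier<String> stateSupplier, final String expected,
    final int maxAttempts, final long pauseMillis) throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (expected.equalsIgnoreCase(stateSupplier.get())) {
        return true;
      }
      Thread.sleep(pauseMillis);
    }
    return false;
  }

  public static void main(final String[] args) throws InterruptedException {
    // a fixed supplier stands in for a real status request
    System.out.println(waitForState(() -> "RUNNING", "RUNNING", 3, 100L));
  }
}
```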

Wait some time and you can then search your crawled data at http://localhost:8080/SMILA/search.

Links