
SMILA/Documentation/HowTo/How to access the REST API with the RestClient

SMILA provides an extensive REST API to control SMILA, check the status, import or search data, attach workers to its job control etc.

This HowTo describes how to utilize the SMILA RestClient to access SMILA's REST API from within a Java application.

Preconditions

For the sake of simplicity, we assume that you have checked out the complete SMILA development environment, although checking out only the relevant bundles would be sufficient to access SMILA. The latter would be the case, to give some examples, for an asynchronous SMILA worker running in a JRE other than SMILA's, or for a testing application accessing SMILA via the REST API.

Basics

Interfaces and default implementations

The RestClient interface encapsulates REST access to SMILA. It provides methods for POST, GET and DELETE calls to the REST API. Data is represented using SMILA's Any, and binary data is handled through the Attachments interface.

The package org.eclipse.smila.http.client.impl provides a default implementation of the RestClient named DefaultRestClient.

Two helper classes exist to provide the resources as described in REST API Reference:

  • ResourceHelper for all non-deprecated resources beginning with /smila.
  • TaskManagerClientHelper to provide workers (that are not directly driven by the Workermanager) with resources for task handling (internal taskmanager REST API, i.e. the resources beginning with /taskmanager).
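
The idea behind such a helper can be illustrated with a minimal sketch that builds resource paths from a job name. The class JobResourcePaths and its methods are hypothetical and are not part of SMILA; the path layout below /smila/jobmanager/jobs/ (including the trailing slash after the job name) is an assumption derived from the job manager URL used later in this HowTo.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in that mimics the idea behind ResourceHelper; not SMILA code. */
final class JobResourcePaths {
  // Assumption: job manager resources live below /smila/jobmanager/jobs/.
  private static final String JOBS = "/smila/jobmanager/jobs/";

  private JobResourcePaths() {
  }

  /** Resource of all jobs; new job definitions would be POSTed here. */
  static String jobs() {
    return JOBS;
  }

  /** Resource of a single named job; a POST here would start a job run. */
  static String job(String jobName) {
    return JOBS + encode(jobName) + "/";
  }

  private static String encode(String segment) {
    // URLEncoder produces form encoding (space becomes '+'), so spaces are patched for path use.
    return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
  }
}
```

Keeping the path construction in one place means that client code never concatenates URL fragments itself, which is the benefit the real helper classes provide.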

Accessing SMILA

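To access SMILA via its REST interface, you first have to instantiate the RestClient, for example with the default implementation DefaultRestClient, as done in the first line of the following snippets.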

The following code snippet shows how to post a job referring to the fileCrawling workflow to the job manager and how to start it afterwards:

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "crawlCData";
 
// create job description as an AnyMap
final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
jobDescription.put("name", jobName);
jobDescription.put("workflow", "fileCrawling");
final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
parameters.put("tempStore", "temp");
parameters.put("jobToPushTo", "importJob");
parameters.put("dataSource", "file_data");
parameters.put("rootFolder", "c:/data");
jobDescription.put("parameters", parameters);
 
 
// the resourcehelper provides us with the resource to the jobs API
// we send the (AnyMap) job description in the POST body
restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
// POST (here without a body) to start the Job,
// the ResourceHelper provides the resource to the named job
restClient.post(resourceHelper.getJobResource(jobName));
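
For comparison, the following sketch shows roughly what such a POST could look like on the HTTP level, using only the JDK's java.net.http classes (Java 11 or later). It is not how the RestClient is implemented. The class RawJobPost, the hand-written JSON serialization and the resource path /smila/jobmanager/jobs/ are assumptions made for illustration; the sketch only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical illustration of a raw job definition POST; not part of SMILA. */
final class RawJobPost {
  private RawJobPost() {
  }

  /** Serializes a (possibly nested) map of string values to JSON. */
  static String toJson(Map<String, ?> map) {
    return map.entrySet().stream()
        .map(e -> quote(e.getKey()) + ":" + valueToJson(e.getValue()))
        .collect(Collectors.joining(",", "{", "}"));
  }

  @SuppressWarnings("unchecked")
  private static String valueToJson(Object value) {
    if (value instanceof Map) {
      return toJson((Map<String, ?>) value);
    }
    return quote(String.valueOf(value));
  }

  private static String quote(String s) {
    // Escape backslashes and double quotes; enough for the sample values used here.
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /** Builds (but does not send) a POST request carrying the job description as JSON. */
  static HttpRequest createJobRequest(String baseUrl, Map<String, ?> jobDescription) {
    // Assumption: job definitions are POSTed to /smila/jobmanager/jobs/.
    return HttpRequest.newBuilder(URI.create(baseUrl + "/smila/jobmanager/jobs/"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(toJson(jobDescription)))
        .build();
  }
}
```

The RestClient hides the serialization of the AnyMap, the HTTP connection handling and the parsing of the response, so there is usually no reason to build such requests by hand.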

The following snippet checks whether the import job with a given name is already running, starts it if it is not, and then sends a record with an attachment to it.

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "indexUpdate";
 
// check for a current run of this job
final AnyMap currentJobRun =
restClient.get(resourceHelper.getJobResource(jobName)).getMap("runs").getMap("current");
if (currentJobRun != null && !currentJobRun.isEmpty()) {
  // a current run exists, so we don't need to start one but it may not be running.
  if (!"RUNNING".equalsIgnoreCase(currentJobRun.getStringValue("state"))) {
    // well it's just an example...
    throw new IllegalStateException("Job '" + jobName + "' is not running but has status '"
      + currentJobRun.getStringValue("state") + "'.");
  }
} else {
  // no current job run, start another one.
  restClient.post(resourceHelper.getJobResource(jobName));
}
 
// create attachment with a file's content
final File file = new File("c:/data/notice.html");
final Attachments attachments = new AttachmentWrapper("file", file);
// put some sample metadata
final AnyMap metadata = DataFactory.DEFAULT.createAnyMap();
metadata.put("_recordid", "1");
metadata.put("fileName", file.getCanonicalPath());
// now post metadata with an attachment from a file.
// if we had a Record with attachments, we could POST that one...
// note: we could add more than one attachment using the AttachmentWrapper.
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), metadata, attachments);
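
The decision logic of the snippet above (start a run, reuse the current run, or fail) can be isolated into a small function, which makes it easy to test without a running SMILA instance. The following sketch works on plain java.util.Map objects instead of AnyMap and also tolerates a missing "runs" entry. The class JobRunCheck and its enum are hypothetical, and whether the job resource can actually omit "runs" is an assumption.

```java
import java.util.Map;

/** Hypothetical helper that decides what to do based on a job's status document. */
final class JobRunCheck {
  enum Action { START_RUN, USE_CURRENT_RUN }

  private JobRunCheck() {
  }

  static Action decide(Map<String, ?> jobInfo) {
    // Navigate runs -> current without failing if an intermediate level is missing.
    Object runs = jobInfo.get("runs");
    Object current = runs instanceof Map ? ((Map<?, ?>) runs).get("current") : null;
    if (!(current instanceof Map) || ((Map<?, ?>) current).isEmpty()) {
      // No current job run, so one has to be started.
      return Action.START_RUN;
    }
    Object state = ((Map<?, ?>) current).get("state");
    if (!"RUNNING".equalsIgnoreCase(String.valueOf(state))) {
      // A run exists but cannot accept records in this state.
      throw new IllegalStateException("Current job run has state '" + state + "'.");
    }
    return Action.USE_CURRENT_RUN;
  }
}
```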

Using Attachments with the RestClient

As seen above, the rest client bundle provides an Attachments interface in order to allow attachments to be POSTed.

An attachment consists of a String key and binary data that will be POSTed as application/octet-stream in a multipart message.
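
To make the structure of such a multipart message more tangible, the following sketch assembles one by hand: a JSON part for the metadata followed by one application/octet-stream part per attachment. The class MultipartSketch is hypothetical, and the part headers chosen here (Content-Disposition with a name parameter, "metadata" as the name of the first part) are assumptions for illustration only; the wire format actually produced by the RestClient may differ.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical illustration of a multipart body; not the RestClient's actual wire format. */
final class MultipartSketch {
  private static final String CRLF = "\r\n";

  private MultipartSketch() {
  }

  static byte[] build(String boundary, String jsonMetadata, Map<String, byte[]> attachments) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // First part: the metadata, serialized as JSON.
    writePart(out, boundary, "metadata", "application/json",
        jsonMetadata.getBytes(StandardCharsets.UTF_8));
    // One further part per attachment: String key plus raw bytes.
    for (Map.Entry<String, byte[]> attachment : attachments.entrySet()) {
      writePart(out, boundary, attachment.getKey(), "application/octet-stream",
          attachment.getValue());
    }
    // The closing delimiter ends the multipart message.
    out.writeBytes(ascii("--" + boundary + "--" + CRLF));
    return out.toByteArray();
  }

  private static void writePart(ByteArrayOutputStream out, String boundary, String name,
      String contentType, byte[] content) {
    out.writeBytes(ascii("--" + boundary + CRLF));
    out.writeBytes(ascii("Content-Disposition: attachment; name=\"" + name + "\"" + CRLF));
    out.writeBytes(ascii("Content-Type: " + contentType + CRLF + CRLF));
    out.writeBytes(content);
    out.writeBytes(ascii(CRLF));
  }

  private static byte[] ascii(String s) {
    return s.getBytes(StandardCharsets.US_ASCII);
  }
}
```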

Handling attachments manually

You can use the AttachmentWrapper in order to add attachments from the following sources if you want to handle attachments manually:

  • a byte[]
  • a String
  • a File
  • an InputStream

Convenience constructors let you pass the first attachment when constructing an AttachmentWrapper; you can then add further attachments and mix the types.

Example:

final RestClient restClient = new DefaultRestClient();
byte[] byteAttachment = new byte[1000];
String stringAttachment = "string attachment";
File fileAttachment = new File("c:/data/notice.html");
InputStream inputStreamAttachment = new FileInputStream(fileAttachment);
 
AttachmentWrapper attachments = new AttachmentWrapper("byte-data", byteAttachment);
attachments.add("string-data", stringAttachment);
attachments.add("file-data", fileAttachment);
attachments.add("stream-data", inputStreamAttachment);
 
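// 'resource' (the target resource) and 'parameters' (the AnyMap to send)
// are assumed to be defined elsewhere
restClient.post(resource, parameters, attachments);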
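
Conceptually, all four source types end up as a key plus binary data. The following sketch illustrates this with a hypothetical class SimpleAttachments that uses only JDK classes. It reads files and streams eagerly and encodes Strings as UTF-8; both are assumptions made for this sketch, and the real AttachmentWrapper may behave differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of keyed binary attachments; not SMILA's AttachmentWrapper. */
final class SimpleAttachments {
  private final Map<String, byte[]> parts = new LinkedHashMap<>();

  SimpleAttachments add(String key, byte[] data) {
    parts.put(key, data.clone()); // defensive copy
    return this;
  }

  SimpleAttachments add(String key, String text) {
    // Assumption: Strings are encoded as UTF-8.
    return add(key, text.getBytes(StandardCharsets.UTF_8));
  }

  SimpleAttachments add(String key, Path file) {
    try {
      return add(key, Files.readAllBytes(file));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  SimpleAttachments add(String key, InputStream in) {
    // Reads the stream completely and closes it afterwards.
    try (in) {
      return add(key, in.readAllBytes());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  List<String> keys() {
    return List.copyOf(parts.keySet());
  }

  byte[] get(String key) {
    return parts.get(key);
  }
}
```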

Handling attachments with records

SMILA Records can also include attachments, and since SMILA's target data units are Records, it is natural that the RestClient also supports Records (with attachments) directly.

This means that the record's metadata is sent together with the record's attachments as parts of a multipart message.

Example:

final byte[] data1 = ...;
final byte[] data2 = ...;
record.setAttachment("data1", data1);
record.setAttachment("data2", data2);
 
// POST the record with the attachments
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), record);


Using the RestClient without the complete development environment

This section describes the steps to follow when using the RestClient from a Java application outside SMILA's JRE.

  • Build or download the SMILA distribution.
  • Set up a new workspace.
  • Create a Java project of your choice.
  • Add the following jars from your downloaded/built SMILA application to the Java Build Path of your new project (exact version numbers are omitted in this list and replaced by *; just use the latest version you find in your SMILA application):
    • from the plugins directory:
      • org.apache.commons.collections_*.jar
      • org.apache.commons.io_*.jar
      • org.apache.commons.lang_*.jar
      • org.apache.httpcomponents.httpclient_*.jar (>=4.1)
      • org.apache.httpcomponents.httpcore_*.jar (>=4.1)
      • org.apache.log4j_*.jar
      • org.codehaus.jackson.core_*.jar
      • org.eclipse.smila.datamodel_*.jar
      • org.eclipse.smila.http.client_*.jar
      • org.eclipse.smila.ipc_*.jar
      • org.eclipse.smila.utils_*.jar
    • from the plugins/org.apache.commons.logging_*/lib directory
      • commons-logging-*.jar

Now you have all means to access SMILA's REST API from another Java application.
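
If you want to verify that the build path is complete before writing any client code, a small check like the following can help. It tries to load one class per library without initializing it. The helper class ClasspathCheck is hypothetical; you would pass it class names such as org.eclipse.smila.http.client.impl.DefaultRestClient (the package is mentioned above), org.apache.http.client.HttpClient or org.codehaus.jackson.JsonFactory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reports which classes cannot be loaded from the classpath. */
final class ClasspathCheck {
  private ClasspathCheck() {
  }

  static boolean isOnClasspath(String className) {
    try {
      // 'false' avoids running static initializers of the checked class.
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Returns the names of all given classes that could not be loaded. */
  static List<String> missing(String... classNames) {
    List<String> result = new ArrayList<>();
    for (String className : classNames) {
      if (!isOnClasspath(className)) {
        result.add(className);
      }
    }
    return result;
  }
}
```

Calling ClasspathCheck.missing(...) with the class names above returns an empty list once all required jars are on the build path.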

E.g. you could now write a simple program that creates and starts up a crawl job and the indexUpdate-job:

Note: To keep the example small, the following code does no real exception handling (it only prints stack traces) and does not check whether jobs already exist or are currently running. Adding these checks would be a good exercise.

public class CrawlMyData {
 
	public static void main(String[] args) {
		final RestClient restClient = new DefaultRestClient();
		final ResourceHelper resourceHelper = new ResourceHelper();
		final String jobName = "crawlCData";
 
		// create job description as an AnyMap
		final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
		jobDescription.put("name", jobName);
		jobDescription.put("workflow", "fileCrawling");
		final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
		parameters.put("tempStore", "temp");
		parameters.put("jobToPushTo", "indexUpdate");
		parameters.put("dataSource", "file_data");
		parameters.put("rootFolder", "c:/data");
		jobDescription.put("parameters", parameters);
 
		try {
			// start the referenced job "indexUpdate" that indexes the data we send.
			// We should check whether it is already running, etc.
			restClient.post(resourceHelper.getJobResource("indexUpdate"));
		} catch (RestException | IOException e1) {
			// TODO Auto-generated catch block
			e1.printStackTrace();
		}
 
		try {
			// create (or update) the job; we should check whether it exists or is running, etc.
			restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
			// POST with no body to start the Job in default mode
			restClient.post(resourceHelper.getJobResource(jobName));
		} catch (RestException | IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
 
	}
}

Putting it together

Now start your SMILA application and when it's up, run the application from above and watch the jobs using your preferred REST client (e.g. browser plugin, see Interactive REST tools) at http://localhost:8080/smila/jobmanager/jobs/

You should see:

  • the newly created job crawlCData
  • the job indexUpdate is RUNNING
  • the job crawlCData is FINISHING (or has already finished, depending on the amount of data in your crawl directory).
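
If you prefer to check the job list from code instead of a browser plugin, a plain HTTP GET on the job manager URL is sufficient. The following sketch uses the JDK's java.net.http client (Java 11 or later); the class JobsOverview is hypothetical, and the Accept header and the ten-second timeout are assumptions. Of course, the RestClient's get method with the jobs resource from the ResourceHelper serves the same purpose.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper for looking at the job list with plain JDK classes. */
final class JobsOverview {
  private JobsOverview() {
  }

  /** Builds a GET request for the job list of a SMILA instance. */
  static HttpRequest jobsRequest(String baseUrl) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/smila/jobmanager/jobs/"))
        .header("Accept", "application/json")
        .timeout(Duration.ofSeconds(10))
        .GET()
        .build();
  }

  /** Sends the request and returns the response body; requires a running SMILA. */
  static String fetch(HttpRequest request) throws IOException, InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    if (response.statusCode() != 200) {
      throw new IOException("Unexpected HTTP status " + response.statusCode());
    }
    return response.body();
  }
}
```

With SMILA running locally, JobsOverview.fetch(JobsOverview.jobsRequest("http://localhost:8080")) would return the job list as a String.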

Wait some time and you can then search your crawled data at http://localhost:8080/SMILA/search

Links