
Difference between revisions of "SMILA/Documentation/HowTo/How to access the REST API with the RestClient"


Revision as of 02:50, 27 March 2012

SMILA provides an extensive REST API to control SMILA, check the status, import or search data, attach workers to its job control etc.

This HowTo describes how to utilize the included RestClient to access SMILA's REST API from within a Java application.

Preconditions

For the sake of simplicity, we assume that you have checked out the complete SMILA development environment, although checking out only the relevant bundles would be sufficient to access SMILA (see "Using the RestClient without the complete development environment"). The latter would be the case, for example, for an asynchronous SMILA worker running in a JRE different from SMILA's JRE, or for a testing application accessing SMILA via the REST API.

  • Set up your development environment, see How to set up the development environment.

Basics

The following examples and code snippets all apply when you are running SMILA out of the box on localhost.

If you are running SMILA on a different host or with a different port (or an altered root context), see "Using non-default configuration" for how to use the RestClient in these cases.

Interfaces and default implementations

The RestClient interface encapsulates the REST access to SMILA. It provides methods for GET, POST, PUT and DELETE calls to the REST API. Data is represented using SMILA's Any interface, and attachments using the Attachments interface. The latter allows working with binary data in SMILA.

The package org.eclipse.smila.http.client.impl provides a default implementation for the RestClient named DefaultRestClient.

Another implementation, named FailoverRestClient, is provided in the package org.eclipse.smila.http.client.impl.failover. It can be created with a list of several SMILA host addresses. Normally, it talks to the first of those hosts. If this node can no longer be reached (because SMILA has crashed or there is a network failure), the client retries the request on the next node until the request has been executed on one node or all nodes have been tried.
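
The retry behavior described above can be sketched in plain Java. This is only an illustration of the idea and not the code of FailoverRestClient: the class FailoverSketch, the interface HostRequest and the method executeWithFailover are hypothetical names, and treating every IOException as "host unreachable" is an assumption.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical illustration of the failover idea; NOT SMILA's FailoverRestClient. */
final class FailoverSketch {

  /** A request that can be attempted against a single host (hypothetical type). */
  interface HostRequest<T> {
    T execute(String hostAndPort) throws IOException;
  }

  /**
   * Tries the hosts in the given order and returns the first successful result.
   * Assumption: an IOException means the host could not be reached.
   */
  static <T> T executeWithFailover(List<String> hosts, HostRequest<T> request)
      throws IOException {
    IOException lastFailure = null;
    for (String host : hosts) {
      try {
        return request.execute(host);
      } catch (IOException e) {
        // this host failed, remember why and continue with the next one
        lastFailure = e;
      }
    }
    // all hosts have been tried without success
    throw new IOException("No host could execute the request: " + hosts, lastFailure);
  }
}
```

A caller would pass the host list once and supply the actual request as a lambda, so the retry loop stays independent of what is being sent.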

There are two helper classes providing the resources as described in REST API Reference:

  • ResourceHelper for all resources beginning with /smila, except for those that are marked as deprecated in the REST API Reference.
  • TaskManagerClientHelper to provide workers that are not directly driven by the WorkerManager with resources for task handling (internal TaskManager REST API, i.e. the resources beginning with /taskmanager).

Accessing SMILA

To access SMILA via its REST interface, instantiate a RestClient, for example:

RestClient restClient = new DefaultRestClient();

The following code snippet creates a job definition, sends it to the JobManager and starts it if posting was successful:

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "crawlCData";
 
// create job description as an AnyMap
final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
jobDescription.put("name", jobName);
jobDescription.put("workflow", "fileCrawling");
final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
parameters.put("tempStore", "temp");
parameters.put("jobToPushTo", "importJob");
parameters.put("dataSource", "file_data");
parameters.put("rootFolder", "c:/data");
jobDescription.put("parameters", parameters);
 
 
// the resourcehelper provides us with the resource to the jobs API
// we send the (AnyMap) job description in the POST body
restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
// POST (here without a body) to start the Job,
// the ResourceHelper provides the resource to the named job
restClient.post(resourceHelper.getJobResource(jobName));

The following snippet checks whether the job with the given name is already running. If there is no current job run, the job is started. A record with an attachment is then sent to it.

final RestClient restClient = new DefaultRestClient();
final ResourceHelper resourceHelper = new ResourceHelper();
final String jobName = "indexUpdate";
 
// check for a current run of this job
final AnyMap currentJobRun =
  restClient.get(resourceHelper.getJobResource(jobName)).getMap("runs").getMap("current");
if (currentJobRun != null && !currentJobRun.isEmpty()) {
  // a current run exists, so we don't need to start one but it may not be running.
  if (!"RUNNING".equalsIgnoreCase(currentJobRun.getStringValue("state"))) {
    // well it's just an example...
    throw new IllegalStateException("Job '" + jobName + "' is not running but has status '"
	  + currentJobRun.getStringValue("state") + "'.");
  }
} else {
  // no current job run, start another one.
  restClient.post(resourceHelper.getJobResource(jobName));
}
 
// create attachment with a file's content
final File file = new File("c:/data/notice.html");
final Attachments attachments = new AttachmentWrapper("file", file);
// put some sample metadata
final AnyMap metadata = DataFactory.DEFAULT.createAnyMap();
metadata.put("_recordid", "1");
metadata.put("fileName", file.getCanonicalPath());
// now post metadata with an attachment from a file.
// if we had a Record with attachments, we could POST that one...
// note: we could add more than one attachment using the AttachmentWrapper.
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), metadata, attachments);
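
The decision logic in the snippet above (start the job, use the running job, or give up) can be isolated and tested without a running SMILA. The following is a minimal sketch that uses plain java.util.Map instead of AnyMap. The class JobRunCheckSketch and the enum Action are hypothetical, and the nested "runs" / "current" / "state" layout is taken from the snippet above, not from the REST API reference. Unlike the snippet, the sketch also tolerates a missing "runs" entry.

```java
import java.util.Map;

/** Hypothetical helper mirroring the job run check above with plain maps. */
final class JobRunCheckSketch {

  enum Action { START_JOB, USE_RUNNING_JOB, FAIL }

  /** Decides what to do based on a job description shaped like the REST response. */
  static Action decide(Map<String, Object> jobInfo) {
    Object runs = jobInfo.get("runs");
    // a missing "runs" or "current" entry is treated like "no current run"
    Object current = runs instanceof Map ? ((Map<?, ?>) runs).get("current") : null;
    if (!(current instanceof Map) || ((Map<?, ?>) current).isEmpty()) {
      return Action.START_JOB;
    }
    Object state = ((Map<?, ?>) current).get("state");
    // a current run exists: it is usable only if it is in state RUNNING
    return "RUNNING".equalsIgnoreCase(String.valueOf(state))
        ? Action.USE_RUNNING_JOB
        : Action.FAIL;
  }
}
```

Keeping the decision separate from the REST calls makes it easy to replace the IllegalStateException of the example with whatever handling your application needs.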

Using Attachments with the RestClient

As seen above, the RestClient bundle provides an Attachments interface allowing attachments to be POSTed. An attachment consists of a string key and binary data that will be POSTed as application/octet-stream in a multi-part message.
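
To give an idea of what such a multi-part message looks like, the following sketch assembles a generic multipart body with a JSON metadata part followed by one application/octet-stream part per attachment. It is not SMILA's actual wire format: the class MultipartSketch is hypothetical, and carrying the attachment key in a Content-Disposition name parameter is an assumption made for illustration only. The RestClient does all of this for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical illustration of a multipart body; NOT SMILA's actual wire format. */
final class MultipartSketch {

  private static final String CRLF = "\r\n";

  /** Builds a body with one JSON part and one octet-stream part per attachment. */
  static byte[] build(String boundary, String jsonMetadata, Map<String, byte[]> attachments) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writePart(out, boundary, "Content-Type: application/json; charset=UTF-8",
        jsonMetadata.getBytes(StandardCharsets.UTF_8));
    for (Map.Entry<String, byte[]> attachment : attachments.entrySet()) {
      // assumption: the attachment key travels as the "name" parameter
      String headers = "Content-Disposition: attachment; name=\"" + attachment.getKey() + "\""
          + CRLF + "Content-Type: application/octet-stream";
      writePart(out, boundary, headers, attachment.getValue());
    }
    // the closing delimiter ends the message
    writeAscii(out, "--" + boundary + "--" + CRLF);
    return out.toByteArray();
  }

  private static void writePart(ByteArrayOutputStream out, String boundary,
      String headers, byte[] body) {
    writeAscii(out, "--" + boundary + CRLF + headers + CRLF + CRLF);
    out.write(body, 0, body.length);
    writeAscii(out, CRLF);
  }

  private static void writeAscii(ByteArrayOutputStream out, String text) {
    byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
    out.write(bytes, 0, bytes.length);
  }
}
```

The binary data is written unchanged between the part headers and the next boundary, which is why attachments can carry arbitrary bytes while the metadata stays readable JSON.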

Handling attachments manually

To handle attachments manually, use the AttachmentWrapper. It accepts attachments from the following sources:

  • a byte[]
  • a String
  • a File
  • an InputStream

Convenience constructors let you provide the first attachment when constructing an AttachmentWrapper. You can then add further attachments and mix the types.

Example:

final RestClient restClient = new DefaultRestClient();
byte[] byteAttachment = new byte[1000];
String stringAttachment = "string attachment";
File fileAttachment = new File("c:/data/notice.html");
InputStream inputStreamAttachment = new FileInputStream(fileAttachment);
 
AttachmentWrapper attachments = new AttachmentWrapper("byte-data", byteAttachment);
attachments.add("string-data", stringAttachment);
attachments.add("file-data", fileAttachment);
attachments.add("stream-data", inputStreamAttachment);
 
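// 'resource' (the target resource) and 'parameters' (the request body)
// are assumed to be defined elsewhere
restClient.post(resource, parameters, attachments);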

Handling attachments with records

SMILA records can also include attachments. Since records are SMILA's target data units, the RestClient also supports records (with attachments) directly.

This means that the record's metadata is sent together with the record's attachments as parts of a multi-part message.

Example:

final byte[] data1 = ...;
final byte[] data2 = ...;
record.setAttachment("data1", data1);
record.setAttachment("data2", data2);
 
// POST the record with the attachments
restClient.post(resourceHelper.getPushRecordToJobResource(jobName), record);

Using the RestClient without the complete development environment

This section describes the steps to follow when using the RestClient from a Java application outside SMILA's JRE.

  1. Build or download the SMILA distribution.
  2. Create a new workspace.
  3. Create a Java project of your choice.
  4. Add the following JARs from your downloaded/built SMILA application to the Java Build Path of your new project (exact version numbers are omitted in this list and replaced with *; use the latest version you find in your SMILA application):
    • from the plugins directory:
      • org.apache.commons.collections_*.jar
      • org.apache.commons.io_*.jar
      • org.apache.commons.lang_*.jar
      • org.apache.httpcomponents.httpclient_*.jar (>=4.1)
      • org.apache.httpcomponents.httpcore_*.jar (>=4.1)
      • org.apache.log4j_*.jar
      • org.codehaus.jackson.core_*.jar
      • org.eclipse.smila.datamodel_*.jar
      • org.eclipse.smila.http.client_*.jar
      • org.eclipse.smila.ipc_*.jar
      • org.eclipse.smila.utils_*.jar
    • from the plugins/org.apache.commons.logging_*/lib directory:
      • commons-logging-*.jar

You now have everything you need to access SMILA's REST API from another Java application.

For example, you could now write a simple program that starts the indexUpdate job and then creates and starts a crawl job:

Note: To keep the example simple, the code only prints exceptions and does not check whether a job already exists or is running. Feel free to add such checks to your own code as an exercise.

public class CrawlMyData {
 
	public static void main(String[] args) {
		final RestClient restClient = new DefaultRestClient();
		final ResourceHelper resourceHelper = new ResourceHelper();
		final String jobName = "crawlCData";
 
		// create job description as an AnyMap
		final AnyMap jobDescription = DataFactory.DEFAULT.createAnyMap();
		jobDescription.put("name", jobName);
		jobDescription.put("workflow", "fileCrawling");
		final AnyMap parameters = DataFactory.DEFAULT.createAnyMap();
		parameters.put("tempStore", "temp");
		parameters.put("jobToPushTo", "indexUpdate");
		parameters.put("dataSource", "file_data");
		parameters.put("rootFolder", "c:/data");
		jobDescription.put("parameters", parameters);
 
		try {
			// start the referenced job "indexUpdate", which indexes the data we send.
			// We should first check whether it is already running, etc.
			restClient.post(resourceHelper.getJobResource("indexUpdate"));
		} catch (RestException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
 
		try {
			// create (or update) the job; we should check whether it exists or is running, etc.
			restClient.post(resourceHelper.getJobsResource(), jobDescription);
 
			// POST with no body to start the Job in default mode
			restClient.post(resourceHelper.getJobResource(jobName));
		} catch (RestException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
 
	}
}

Putting it together

Now start your SMILA application and when it's up, run the application from above and watch the jobs using your preferred REST client (e.g. browser plugin, see Interactive REST tools) at http://localhost:8080/smila/jobmanager/jobs/.

You should see:

  • the newly created job "crawlCData",
  • the job "indexUpdate" in state RUNNING,
  • the job "crawlCData" in state FINISHING (or already finished, depending on the amount of data in your crawled directory).

Wait a bit and you can search your crawled data at http://localhost:8080/SMILA/search.
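
Instead of watching the job list by hand, a client could poll the job state until it reaches an expected value. The following is a minimal sketch in plain Java. The class JobPollingSketch and its method waitForState are hypothetical; the state supplier is assumed to wrap a GET on the job resource, similar to the job run check in "Accessing SMILA". Which state name marks a completed job is not stated in this HowTo, so it is left as a parameter.

```java
import java.util.function.Supplier;

/** Hypothetical polling helper; the supplier is assumed to query the job state via REST. */
final class JobPollingSketch {

  /**
   * Polls the current state until it equals the expected state or the attempts run out.
   * Returns true if the expected state was seen, false otherwise.
   */
  static boolean waitForState(Supplier<String> currentState, String expectedState,
      int maxAttempts, long pauseMillis) throws InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (expectedState.equalsIgnoreCase(currentState.get())) {
        return true;
      }
      if (attempt < maxAttempts) {
        // pause between polls, but not after the last attempt
        Thread.sleep(pauseMillis);
      }
    }
    return false;
  }
}
```

Bounding the number of attempts keeps the client from waiting forever if the job fails or never starts.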

Using non-default configuration

SMILA's RestClient and ResourceHelper have default constructors using the standard values for the SMILA application. These are:

  • Host: localhost
  • Port: 8080
  • Root context: /smila

If your bundle runs under a different root context path, create your ResourceHelper with the actual context path. If SMILA runs on a different host and/or uses a different port, pass this information to the constructor of the DefaultRestClient (you can omit the leading http://).

E.g. the following code snippet:

final RestClient restClient = new DefaultRestClient("host.domain.org:80");
final ResourceHelper resourceHelper = new ResourceHelper("/context");

creates a RestClient and a ResourceHelper connecting to a SMILA instance running on http://host.domain.org:80/context.
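
How the two values combine into the address above can be sketched in a few lines of plain Java. The class EndpointSketch and the method baseUrl are hypothetical and only illustrate the composition. How DefaultRestClient and ResourceHelper build their URLs internally is not shown here and may differ.

```java
/** Hypothetical illustration of how host, port and root context form the base URL. */
final class EndpointSketch {

  /** Joins "host:port" (with or without leading http://) and the root context. */
  static String baseUrl(String hostAndPort, String rootContext) {
    // the leading http:// is optional, as described above
    String host = hostAndPort.startsWith("http://") ? hostAndPort : "http://" + hostAndPort;
    // assumption: a context without leading slash is accepted and normalized
    String context = rootContext.startsWith("/") ? rootContext : "/" + rootContext;
    return host + context;
  }
}
```

With the default values, baseUrl("localhost:8080", "/smila") yields http://localhost:8080/smila, which is the prefix of the job list address used in "Putting it together".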

You can also use your own connection manager or limit the number of total connections and max connections per host by using the respective constructors of DefaultRestClient.

Links