
SMILA/Documentation/Indexing


Add data to a Solr search index

SMILA comes with predefined workflows and jobs to add data to and delete data from a Solr search index.

As described in the 5 Minutes Tutorial, there are separate predefined jobs for importing data (crawl jobs) and for indexing it.

The common predefined job for indexing is "indexUpdate". It uses the ScriptProcessorWorker, which executes JavaScript to insert records into (add.js) and delete records from (delete.js) the predefined Solr search index ("collection1").
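
A job run for "indexUpdate" is usually started via the job manager REST API before data is pushed to it. A minimal sketch, assuming the default job name and the standard local SMILA port:

POST http://localhost:8080/smila/jobmanager/jobs/indexUpdate/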

These scripts can also be called directly via the REST API to add or delete single records, e.g.:

Add a record to the Solr index:

POST http://localhost:8080/smila/script/add.process
{
  "_recordid": "id1",
  "Title": "Scripting rules!",
  "Content": "yet another SMILA document",
  "MimeType": "text/plain"
}
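
The same request can be sent with any HTTP client, for example as a curl invocation along these lines (using the record data from above):

curl -X POST http://localhost:8080/smila/script/add.process \
  -H "Content-Type: application/json" \
  -d '{"_recordid": "id1", "Title": "Scripting rules!", "Content": "yet another SMILA document", "MimeType": "text/plain"}'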

Delete a record from the Solr index:

POST http://localhost:8080/smila/script/delete.process
{
  "_recordid": "id1"
}
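
Again as a curl sketch:

curl -X POST http://localhost:8080/smila/script/delete.process \
  -H "Content-Type: application/json" \
  -d '{"_recordid": "id1"}'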


  • For more details about the "indexUpdate" workflow and job definitions, see workflows.json and jobs.json in SMILA/configuration/org.eclipse.smila.jobmanager (a sketch of a job definition follows below).
  • For more information about job management in general, please check the JobManager documentation.
  • For more information about script processing with JavaScript, check the Scripting documentation.
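
For orientation, a job definition in jobs.json typically looks roughly like the following sketch; the actual shipped "indexUpdate" definition may use additional or different parameters:

{
  "name": "indexUpdate",
  "workflow": "indexUpdate",
  "parameters": {
    "tempStore": "temp"
  }
}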

Latency vs. Throughput

The predefined add/delete scripts are tuned for low latency: the Solr commit interval is set to 1 second via the SolrUpdatePipelet's commitWithinMs parameter.

If you want to process a large amount of data, set commitWithinMs to a higher value. This will result in better throughput, because Solr then commits less frequently and can batch more documents per commit.
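
For illustration, the change amounts to raising the commitWithinMs value in the pipelet configuration inside add.js, roughly like this fragment (60000 ms is only an example value; the surrounding configuration keys of the shipped script are not shown here and may differ):

{
  "commitWithinMs": 60000
}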
