Difference between revisions of "SMILA/Documentation"

m (JobManager: Asynchronous Workflows)
m (Indexing & Searching)
 
(14 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
== Basics ==
 
== Basics ==
* [[SMILA/Documentation_for_5_Minutes_to_Success|Installing and Running]]
+
* [[SMILA/5_Minutes_Tutorial|Installing and Running (5 Minutes Tutorial)]]
 
* [[SMILA/Documentation/HowTo|HowTos]]
 
* [[SMILA/Documentation/HowTo|HowTos]]
 
* [[SMILA/Documentation/Architecture_Overview|Architecture]]
 
* [[SMILA/Documentation/Architecture_Overview|Architecture]]
Line 34: Line 34:
 
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.boilerpipe|Boilerpipe Pipelet (extract text from HTML content)]]
 
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.boilerpipe|Boilerpipe Pipelet (extract text from HTML content)]]
 
** [[SMILA/Documentation/TikaPipelet|TikaPipelet (extract text from binary content)]]
 
** [[SMILA/Documentation/TikaPipelet|TikaPipelet (extract text from binary content)]]
** [[SMILA/Documentation/JdbcLoggingPipelet|JdbcLoggingPipelet (log to a database)]]
+
** [[SMILA/Documentation/JdbcLoggingPipelet|JdbcLoggingPipelet, JdbcFetcherPipelet, JdbcSelectPipelet (write/read to/from a database)]]
 
** More special pipelets are provided by the components described below.
 
** More special pipelets are provided by the components described below.
 
* Developing new Pipelets  
 
* Developing new Pipelets  
Line 40: Line 40:
 
** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]]
 
** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]]
  
== Searching ==
+
== Scripting ==  
  
 +
{{note| Available since SMILA 1.3!}}
 +
 +
* [[SMILA/Documentation/Scripting|Scripting SMILA with JavaScript]]
 +
* [[SMILA/Documentation/Scripting/Debugging|Debugging JavaScripts running in SMILA]]
 +
 +
== Indexing & Searching ==
 +
 +
* [[SMILA/Documentation/Indexing|Build a Solr search index]]
 
* [[SMILA/Documentation/Search|Search Processing and APIs]]
 
* [[SMILA/Documentation/Search|Search Processing and APIs]]
* [[SMILA/Documentation/Solr 3.5|Solr Integration: Configuration and Pipelets]]
+
* [[SMILA/Documentation/Solr 4.x|Solr 4 Integration]]
  
 
== JobManager: Asynchronous Workflows ==
 
== JobManager: Asynchronous Workflows ==
Line 58: Line 66:
 
* Worker Reference
 
* Worker Reference
 
** [[SMILA/Documentation/Bulkbuilder|Bulkbuilder worker]]
 
** [[SMILA/Documentation/Bulkbuilder|Bulkbuilder worker]]
** [[SMILA/Documentation/Worker/PipelineProcessorWorker|PipelineProcesor Worker]]
+
** [[SMILA/Documentation/Worker/ScriptProcessorWorker|ScriptProcessor Worker]]
 +
** [[SMILA/Documentation/Worker/PipelineProcessorWorker|PipelineProcessor Worker]]
 
** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]]
 
** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]]
 
** (see [[SMILA/Manual#Importing|Importing]] section for more workers)
 
** (see [[SMILA/Manual#Importing|Importing]] section for more workers)
Line 72: Line 81:
 
**[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]]
 +
*** [[SMILA/Documentation/Importing/CrawlingMultipleStartURLs | Crawling multiple start URLs in one job run]]
 
**[[SMILA/Documentation/Importing/Crawler/JDBC | JdbcCrawler and JdbcFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/JDBC | JdbcCrawler and JdbcFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Feed | FeedCrawler Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Feed | FeedCrawler Worker]]
Line 82: Line 92:
 
* Additionally
 
* Additionally
 
** [[SMILA/Documentation/Importing/RemoteCrawling | Remote Crawling]]
 
** [[SMILA/Documentation/Importing/RemoteCrawling | Remote Crawling]]
** [[SMILA/Documentation/Importing/CrawlingMultipleStartURLs | Crawling multiple start URL in one job run]]
 
  
 
== Embedded HTTP Server ==
 
== Embedded HTTP Server ==
Line 106: Line 115:
 
* [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]]
 
* [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]]
 
* [[SMILA/Documentation/SMILA_Versioning|SMILA Version Information]]
 
* [[SMILA/Documentation/SMILA_Versioning|SMILA Version Information]]
 +
* [[SMILA/Documentation/Adding_JDBC_Drivers|Adding JDBC Drivers]]
  
 
== Deprecated Components ==
 
== Deprecated Components ==

Latest revision as of 03:25, 13 April 2015

Basics

Development Environment

Pipelines and Pipelets: Synchronous Workflows

Scripting

Note: Available since SMILA 1.3!


Indexing & Searching

JobManager: Asynchronous Workflows

Importing

Embedded HTTP Server

Common Services

Deprecated Components