Difference between revisions of "SMILA/Documentation"
m (→JobManager: Asynchronous Workflows) |
m (→Indexing & Searching) |
||
(14 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
== Basics == | == Basics == | ||
− | * [[SMILA/ | + | * [[SMILA/5_Minutes_Tutorial|Installing and Running (5 Minutes Tutorial)]] |
* [[SMILA/Documentation/HowTo|HowTos]] | * [[SMILA/Documentation/HowTo|HowTos]] | ||
* [[SMILA/Documentation/Architecture_Overview|Architecture]] | * [[SMILA/Documentation/Architecture_Overview|Architecture]] | ||
Line 34: | Line 34: | ||
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.boilerpipe|Boilerpipe Pipelet (extract text from HTML content)]] | ** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.boilerpipe|Boilerpipe Pipelet (extract text from HTML content)]] | ||
** [[SMILA/Documentation/TikaPipelet|TikaPipelet (extract text from binary content)]] | ** [[SMILA/Documentation/TikaPipelet|TikaPipelet (extract text from binary content)]] | ||
− | ** [[SMILA/Documentation/JdbcLoggingPipelet|JdbcLoggingPipelet ( | + | ** [[SMILA/Documentation/JdbcLoggingPipelet|JdbcLoggingPipelet, JdbcFetcherPipelet, JdbcSelectPipelet (write/read to/from a database)]] |
** More special pipelets are provided by the components described below. | ** More special pipelets are provided by the components described below. | ||
* Developing new Pipelets | * Developing new Pipelets | ||
Line 40: | Line 40: | ||
** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]] | ** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]] | ||
− | == | + | == Scripting == |
+ | {{note| Available since SMILA 1.3!}} | ||
+ | |||
+ | * [[SMILA/Documentation/Scripting|Scripting SMILA with JavaScript]] | ||
+ | * [[SMILA/Documentation/Scripting/Debugging|Debugging JavaScripts running in SMILA]] | ||
+ | |||
+ | == Indexing & Searching == | ||
+ | |||
+ | * [[SMILA/Documentation/Indexing|Build a Solr search index]] | ||
* [[SMILA/Documentation/Search|Search Processing and APIs]] | * [[SMILA/Documentation/Search|Search Processing and APIs]] | ||
− | * [[SMILA/Documentation/Solr | + | * [[SMILA/Documentation/Solr 4.x|Solr 4 Integration]] |
== JobManager: Asynchronous Workflows == | == JobManager: Asynchronous Workflows == | ||
Line 58: | Line 66: | ||
* Worker Reference | * Worker Reference | ||
** [[SMILA/Documentation/Bulkbuilder|Bulkbuilder worker]] | ** [[SMILA/Documentation/Bulkbuilder|Bulkbuilder worker]] | ||
− | ** [[SMILA/Documentation/Worker/PipelineProcessorWorker| | + | ** [[SMILA/Documentation/Worker/ScriptProcessorWorker|ScriptProcessor Worker]] |
+ | ** [[SMILA/Documentation/Worker/PipelineProcessorWorker|PipelineProcessor Worker]] | ||
** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]] | ** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]] | ||
** (see [[SMILA/Manual#Importing|Importing]] section for more workers) | ** (see [[SMILA/Manual#Importing|Importing]] section for more workers) | ||
Line 72: | Line 81: | ||
**[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]] | **[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]] | ||
**[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]] | **[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]] | ||
+ | *** [[SMILA/Documentation/Importing/CrawlingMultipleStartURLs | Crawling multiple start URLs in one job run]] | ||
**[[SMILA/Documentation/Importing/Crawler/JDBC | JdbcCrawler and JdbcFetcher Worker]] | **[[SMILA/Documentation/Importing/Crawler/JDBC | JdbcCrawler and JdbcFetcher Worker]] | ||
**[[SMILA/Documentation/Importing/Crawler/Feed | FeedCrawler Worker]] | **[[SMILA/Documentation/Importing/Crawler/Feed | FeedCrawler Worker]] | ||
Line 82: | Line 92: | ||
* Additionally | * Additionally | ||
** [[SMILA/Documentation/Importing/RemoteCrawling | Remote Crawling]] | ** [[SMILA/Documentation/Importing/RemoteCrawling | Remote Crawling]] | ||
− | |||
== Embedded HTTP Server == | == Embedded HTTP Server == | ||
Line 106: | Line 115: | ||
* [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]] | * [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]] | ||
* [[SMILA/Documentation/SMILA_Versioning|SMILA Version Information]] | * [[SMILA/Documentation/SMILA_Versioning|SMILA Version Information]] | ||
+ | * [[SMILA/Documentation/Adding_JDBC_Drivers|Adding JDBC Drivers]] | ||
== Deprecated Components == | == Deprecated Components == |
Latest revision as of 03:25, 13 April 2015
Contents
Basics
- Installing and Running (5 Minutes Tutorial)
- HowTos
- Architecture
- Overview of Default Configuration
- Data Model, XML, JSON, BON
- Using the REST API, REST Client
- REST API Reference
- Enabling Remote Access to SMILA
Development Environment
- Setting up your Eclipse IDE for SMILA
- Building SMILA
- Creating new Components
- Testing new Components
- Adding Third Party Libraries to SMILA
- Using OSGi Declarative Services
- Extending the build process:
Pipelines and Pipelets: Synchronous Workflows
- What are Pipelines? What are Pipelets?
- Configuring and Creating BPEL Pipelines
- Using the SMILA BPEL Designer
- REST APIs
- Basic Pipelets
- Common Pipelets
- XML Processing Pipelets
- Boilerpipe Pipelet (extract text from HTML content)
- TikaPipelet (extract text from binary content)
- JdbcLoggingPipelet, JdbcFetcherPipelet, JdbcSelectPipelet (write/read to/from a database)
- More special pipelets are provided by the components described below.
- Developing new Pipelets
Scripting
Indexing & Searching
JobManager: Asynchronous Workflows
- What are Jobs and Tasks?
- Creating Workflows and Jobs
- Running and Monitoring Jobs
- Configuring the Job Manager
- TaskManager: Asynchronous Scheduling of Tasks
- Worker Reference
- Bulkbuilder worker
- ScriptProcessor Worker
- PipelineProcessor Worker
- PipeletProcessor Worker
- (see Importing section for more workers)
- Developing new Workers
Importing
- Concepts, Workflow and Components
- Reference of Import Workers
- Developing new Import Workers
- Additionally
Embedded HTTP Server
Common Services
- Configuration Helper
- Workspace Helper
- ClusterConfig Service
- Zookeeper Service
- ObjectStore
- BinaryStorage
- Processing Security Information
- Ontology Processing with Sesame: Configuration and Pipelets
- MimeTypeIdentifier
- Description of Worker and Pipelet Parameters
- Publishing Web Services
- General JPA Configuration in SMILA
- SMILA Version Information
- Adding JDBC Drivers