Difference between revisions of "SMILA/Documentation"

m (Pipelines and Pipelets: Synchronous Workflows of Java components)
m (JobManager: Asynchronous Workflows)
(34 intermediate revisions by 6 users not shown)
Line 1: Line 1:
 
== Basics ==
 
== Basics ==
 +
* [[SMILA/Documentation_for_5_Minutes_to_Success|Installing and Running]]
 +
* [[SMILA/Documentation/HowTo|HowTos]]
 
* [[SMILA/Documentation/Architecture_Overview|Architecture]]
 
* [[SMILA/Documentation/Architecture_Overview|Architecture]]
 
* [[SMILA/Documentation/Default_configuration_workflow_overview|Overview of Default Configuration]]
 
* [[SMILA/Documentation/Default_configuration_workflow_overview|Overview of Default Configuration]]
 
* [[SMILA/Documentation/Data_Model_and_Serialization_Formats|Data Model, XML, JSON, BON]]
 
* [[SMILA/Documentation/Data_Model_and_Serialization_Formats|Data Model, XML, JSON, BON]]
* [[SMILA/Documentation/Using_The_ReST_API|Using the ReST API]]
+
* [[SMILA/Documentation/Using_The_ReST_API|Using the REST API, REST Client]]
 
* [[SMILA/Documentation/REST_API_Reference|REST API Reference]]
 
* [[SMILA/Documentation/REST_API_Reference|REST API Reference]]
 
* [[SMILA/Documentation/Enable Remote Access|Enabling Remote Access to SMILA]]
 
* [[SMILA/Documentation/Enable Remote Access|Enabling Remote Access to SMILA]]
  
== The SMILA Development Environment ==
+
== Development Environment ==
 
* [[SMILA/Documentation/HowTo/Howto_set_up_dev_environment|Setting up your Eclipse IDE for SMILA]]
 
* [[SMILA/Documentation/HowTo/Howto_set_up_dev_environment|Setting up your Eclipse IDE for SMILA]]
 
* [[SMILA/Documentation/HowTo/Howto_build_a_SMILA-Distribution|Building SMILA]]
 
* [[SMILA/Documentation/HowTo/Howto_build_a_SMILA-Distribution|Building SMILA]]
Line 18: Line 20:
 
** [[SMILA/Documentation/HowTo/How_to_integrate_test_bundle_into_build_process|Adding a new Test Bundle to the Build]]
 
** [[SMILA/Documentation/HowTo/How_to_integrate_test_bundle_into_build_process|Adding a new Test Bundle to the Build]]
  
== Pipelines and Pipelets: Synchronous Workflows of Java Components ==
+
== Pipelines and Pipelets: Synchronous Workflows ==
 
* [[SMILA/Documentation/Pipelets|What are Pipelines? What are Pipelets?]]
 
* [[SMILA/Documentation/Pipelets|What are Pipelines? What are Pipelets?]]
 
* [[SMILA/Documentation/BPEL_Workflow_Processor|Configuring and Creating BPEL Pipelines]]
 
* [[SMILA/Documentation/BPEL_Workflow_Processor|Configuring and Creating BPEL Pipelines]]
Line 28: Line 30:
 
** [[SMILA/Documentation/Processing/JSON_REST_API_for_BPEL_pipelines|Creating, Editing, and Executing Pipelines]]
 
** [[SMILA/Documentation/Processing/JSON_REST_API_for_BPEL_pipelines|Creating, Editing, and Executing Pipelines]]
 
* Basic Pipelets
 
* Basic Pipelets
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets|Common Pipelets in Bundle org.eclipse.smila.processing.pipelets]]  
+
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets|Common Pipelets]]
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.xmlprocessing|XML Processing Pipelets in Bundle org.eclipse.smila.processing.pipelets.xmlprocessing]]
+
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.xmlprocessing|XML Processing Pipelets]]
 +
** [[SMILA/Documentation/Bundle org.eclipse.smila.processing.pipelets.boilerpipe|Boilerpipe Pipelet (extract text from HTML content)]]
 +
** [[SMILA/Documentation/TikaPipelet|TikaPipelet (extract text from binary content)]]
 +
** [[SMILA/Documentation/JdbcLoggingPipelet|JdbcLoggingPipelet (log to a database)]]
 
** More special pipelets are provided by the components described below.
 
** More special pipelets are provided by the components described below.
 
* Developing new Pipelets  
 
* Developing new Pipelets  
Line 35: Line 40:
 
** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]]
 
** [[SMILA/Documentation/Usage_of_Blackboard_Service|Using the Blackboard Service]]
  
== Using SMILA for Search ==
+
== Searching ==
  
 
* [[SMILA/Documentation/Search|Search Processing and APIs]]
 
* [[SMILA/Documentation/Search|Search Processing and APIs]]
* [[SMILA/Documentation/Solr|Solr Integration: Configuration and Pipelets]]
+
* [[SMILA/Documentation/Solr 3.5|Solr Integration: Configuration and Pipelets]]
  
 
== JobManager: Asynchronous Workflows ==
 
== JobManager: Asynchronous Workflows ==
 
* [[SMILA/Documentation/JobManager|What are Jobs and Tasks?]]
 
* [[SMILA/Documentation/JobManager|What are Jobs and Tasks?]]
** [[SMILA/Documentation/JobManagerFirstExample|A Simple Example]]
+
** [[SMILA/Documentation/JobManagerFirstExample|JobManager Walk-Through]]
 
* Creating Workflows and Jobs
 
* Creating Workflows and Jobs
** [[SMILA/Documentation/DataObjectTypesAndBuckets|Defining Buckets]]
+
** [[SMILA/Documentation/DataObjectTypesAndBuckets|Data Object Types and Buckets]]
** [[SMILA/Documentation/WorkerAndWorkflows|Modeling Workflows]]
+
** [[SMILA/Documentation/WorkerAndWorkflows|Workers and Workflows]]
** [[SMILA/Documentation/JobDefinitions|Creating Jobs]]
+
** [[SMILA/Documentation/JobDefinitions|Jobs]]
** [[SMILA/Documentation/JobParameters|Evaluating of Job Parameters]]
+
** [[SMILA/Documentation/JobParameters|Job Parameters]]
 
* [[SMILA/Documentation/JobRuns|Running and Monitoring Jobs]]
 
* [[SMILA/Documentation/JobRuns|Running and Monitoring Jobs]]
 
* [[SMILA/Documentation/JobManagerConfiguration|Configuring the Job Manager]]
 
* [[SMILA/Documentation/JobManagerConfiguration|Configuring the Job Manager]]
Line 55: Line 60:
 
** [[SMILA/Documentation/Worker/PipelineProcessorWorker|PipelineProcessor Worker]]
 
** [[SMILA/Documentation/Worker/PipelineProcessorWorker|PipelineProcessor Worker]]
 
** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]]
 
** [[SMILA/Documentation/Worker/PipeletProcessorWorker|PipeletProcessor Worker]]
** See [[SMILA/Manual#Importing|Importing]] below for more workers
+
** (see [[SMILA/Manual#Importing|Importing]] section for more workers)
* Developing Workers
+
* Developing new Workers
** [[SMILA/Documentation/WorkerManager|WorkerManager: Workers Made Easy]]
+
** [[SMILA/Documentation/WorkerManager|WorkerManager: Workers Made Easily]]
 
** [[SMILA/Documentation/HowTo/How_to_write_a_Worker|How to Write a Worker]]
 
** [[SMILA/Documentation/HowTo/How_to_write_a_Worker|How to Write a Worker]]
 
** [[SMILA/Documentation/TaskGenerators|Task Generators]]
 
** [[SMILA/Documentation/TaskGenerators|Task Generators]]
  
 
== Importing ==
 
== Importing ==
* [[SMILA/Documentation/Importing/Concept|Import Concepts]]
+
* [[SMILA/Documentation/Importing/Concept|Concepts, Workflow and Components]]
 +
** [[SMILA/Documentation/Importing/CompoundExtractorService|Compound Extractor Service]]
 
* Reference of Import Workers
 
* Reference of Import Workers
 
**[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/File | FileCrawler and FileFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]]
 
**[[SMILA/Documentation/Importing/Crawler/Web | WebCrawler and WebFetcher Worker]]
 +
**[[SMILA/Documentation/Importing/Crawler/JDBC | JdbcCrawler and JdbcFetcher Worker]]
 +
**[[SMILA/Documentation/Importing/Crawler/Feed | FeedCrawler Worker]]
 
**[[SMILA/Documentation/Importing/DeltaCheck | DeltaChecker Worker]]
 
**[[SMILA/Documentation/Importing/DeltaCheck | DeltaChecker Worker]]
 
**[[SMILA/Documentation/Importing/UpdatePusher | UpdatePusher Worker]]
 
**[[SMILA/Documentation/Importing/UpdatePusher | UpdatePusher Worker]]
 
* Developing new Import Workers
 
* Developing new Import Workers
** [[SMILA/Documentation/Importing/VisitedLinks | VisitedLinks service]]
+
** [[SMILA/Documentation/Importing/VisitedLinks | Using the VisitedLinks service]]
 
** [[SMILA/Documentation/Importing/Crawler/Web#Internal_structure|Extending the WebCrawler worker]]
 
** [[SMILA/Documentation/Importing/Crawler/Web#Internal_structure|Extending the WebCrawler worker]]
 
** [[SMILA/Documentation/HowTo/How to add a new Data Source to the importing framework|Adding a Data Source to the SMILA Import Framework]]
 
** [[SMILA/Documentation/HowTo/How to add a new Data Source to the importing framework|Adding a Data Source to the SMILA Import Framework]]
 +
* Additionally
 +
** [[SMILA/Documentation/Importing/RemoteCrawling | Remote Crawling]]
 +
** [[SMILA/Documentation/Importing/CrawlingMultipleStartURLs | Crawling multiple start URLs in one job run]]
  
== The SMILA HTTP Server ==
+
== Embedded HTTP Server ==
 
* [[SMILA/Documentation/JettyHttpServer|Configuring Jetty]]
 
* [[SMILA/Documentation/JettyHttpServer|Configuring Jetty]]
 
* [[SMILA/Documentation/JettyHttpServer#JSON_Handlers|Developing JSON ReST Handlers for SMILA]]
 
* [[SMILA/Documentation/JettyHttpServer#JSON_Handlers|Developing JSON ReST Handlers for SMILA]]
Line 82: Line 93:
 
* [[SMILA/Documentation/Bundle_org.eclipse.smila.clusterconfig|ClusterConfig Service]]
 
* [[SMILA/Documentation/Bundle_org.eclipse.smila.clusterconfig|ClusterConfig Service]]
 
** [[SMILA/Documentation/Bundle_org.eclipse.smila.clusterconfig.simple|Simple Implementation]]
 
** [[SMILA/Documentation/Bundle_org.eclipse.smila.clusterconfig.simple|Simple Implementation]]
 +
* [[SMILA/Documentation/Bundle_org.eclipse.smila.zookeeper|Zookeeper Service]]
 
* [[SMILA/Documentation/ObjectStore/Bundle_org.eclipse.smila.objectstore|ObjectStore]]
 
* [[SMILA/Documentation/ObjectStore/Bundle_org.eclipse.smila.objectstore|ObjectStore]]
 
** [[SMILA/Documentation/ObjectStore/Bundle_org.eclipse.smila.objectstore.filesystem|Filesystem Objectstore Implementation]]
 
** [[SMILA/Documentation/ObjectStore/Bundle_org.eclipse.smila.objectstore.filesystem|Filesystem Objectstore Implementation]]
Line 90: Line 102:
 
* [[SMILA/Documentation/SesameOntologyManager|Ontology Processing with Sesame: Configuration and Pipelets]]
 
* [[SMILA/Documentation/SesameOntologyManager|Ontology Processing with Sesame: Configuration and Pipelets]]
 
* [[SMILA/Documentation/MimeTypeIdentifier|MimeTypeIdentifier]]
 
* [[SMILA/Documentation/MimeTypeIdentifier|MimeTypeIdentifier]]
 +
* [[SMILA/Documentation/ParameterDefinition|Description of Worker and Pipelet Parameters]]
 
* [[SMILA/Documentation/PublishingJAXWSWebservices|Publishing Web Services]]
 
* [[SMILA/Documentation/PublishingJAXWSWebservices|Publishing Web Services]]
 
* [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]]
 
* [[SMILA/Documentation/General JPA Configuration in SMILA|General JPA Configuration in SMILA]]
 +
* [[SMILA/Documentation/SMILA_Versioning|SMILA Version Information]]
  
 
== Deprecated Components ==
 
== Deprecated Components ==
* [[SMILA/Documentation/ConnectivityFramework|Connectivity Framework]]
+
 
**[[SMILA/Documentation/ConnectivityManager|ConnectivityManager]]
+
**[[SMILA/Documentation/DeltaIndexingManager|DeltaIndexingManager]]
+
***[[SMILA/Documentation/CrawlerController|CrawlerController]]
+
***[[SMILA/Documentation/Crawler|Crawler]]
+
****[[SMILA/Documentation/Filesystem Crawler|Filesystem Crawler]]
+
****[[SMILA/Documentation/Web Crawler|Web Crawler]]
+
****[[SMILA/Documentation/JDBC Crawler|JDBC Crawler]]
+
***[[SMILA/Documentation/AgentController|AgentController ]]
+
***[[SMILA/Documentation/Agent|Agent]]
+
****[[SMILA/Documentation/Mock Agent|Mock Agent]]
+
****[[SMILA/Documentation/Feed Agent|Feed Agent]]
+
****[[SMILA/Documentation/JobFile Agent|JobFile Agent]]
+
***[[SMILA/Documentation/CompoundManagement|CompoundManagement]]
+
** Development
+
*** [[SMILA/Documentation/HowTo/How_to_implement_a_crawler|How to implement a crawler]]
+
*** [[SMILA/Documentation/HowTo/How_to_implement_an_agent|How to implement an agent]]
+
 
** [[SMILA/Documentation/Management|JMX Management]]
 
** [[SMILA/Documentation/Management|JMX Management]]
 
*** [[SMILA/Documentation/Management#JMX_Client|JMX Clients]]
 
*** [[SMILA/Documentation/Management#JMX_Client|JMX Clients]]
 
** [[SMILA/Documentation/Record_Storage|RecordStorage]]
 
** [[SMILA/Documentation/Record_Storage|RecordStorage]]
 +
  
 
[[Category:SMILA]]
 
[[Category:SMILA]]

Revision as of 06:30, 19 July 2013

Basics

Development Environment

Pipelines and Pipelets: Synchronous Workflows

Searching

JobManager: Asynchronous Workflows

Importing

Embedded HTTP Server

Common Services

Deprecated Components