
Difference between revisions of "SMILA/Glossary"

(more job management terms)
Line 1: Line 1:
 +
 
__NOTOC__
 
__NOTOC__
  
Line 45: Line 46:
 
* '''[http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel BPEL]''' - BPEL is an XML-based language defining several constructs to write business processes. It defines a set of basic control structures like conditions or loops as well as elements to invoke web services and receive messages from services. It relies on [[#W|WSDL]] to express web services interfaces. Message structures can be manipulated, assigning parts or the whole of them to variables that can in turn be used to send other messages.
 
* '''[http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel BPEL]''' - BPEL is an XML-based language defining several constructs to write business processes. It defines a set of basic control structures like conditions or loops as well as elements to invoke web services and receive messages from services. It relies on [[#W|WSDL]] to express web services interfaces. Message structures can be manipulated, assigning parts or the whole of them to variables that can in turn be used to send other messages.
  
* '''Bucket''' - Data container in an asynchronous workflow, containing logically grouped data objects. Can be ''transient'' for interim data, which means that data is not persisted and removal of data is under job manager control, or ''persistent'', which means that removal of data is not under job manager control.
+
* '''Bucket''' - Data container in an asynchronous workflow, containing logically grouped [[#D|data objects]]. Can be ''transient'' for interim data, which means that data is not persisted and removal of data is under job management control, or ''persistent'', which means that removal of data is not under job management control.
  
* '''Bulk''' - a Data Object containing a sequence of [[#R|Records]]
+
* '''Bulk''' - a [[#D|Data Object]] containing a sequence of [[#R|Records]]
  
 
== C ==
 
== C ==
Line 57: Line 58:
 
== D ==
 
== D ==
  
* '''Data Object''' - The smallest unit of data handled by a workflow (e.g. a bulk).
+
* '''Data Object''' - The smallest unit of data handled by an asynchronous workflow (e.g. a [[#B|bulk]]).
  
 
* '''Delta indexing''' - Delta indexing is also known as incremental or generation based indexing.
 
* '''Delta indexing''' - Delta indexing is also known as incremental or generation based indexing.
Line 95: Line 96:
 
* '''[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/ JMX]''' - Java Management Extension is a specification to administrating and monitoring java applications.
 
* '''[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/ JMX]''' - Java Management Extension is a specification to administrating and monitoring java applications.
  
* '''Job''' - A Job combines several Worker operations in one logical block by referencing an asynchronous workflow.
+
* '''Job''' - A Job combines several [[#W|Worker]] operations in one logical block by referencing an asynchronous workflow.
  
 
* '''Job Run''' - A Job Run is an "instance" of a Job, for example one run of an import of a data source to an index.
 
* '''Job Run''' - A Job Run is an "instance" of a Job, for example one run of an import of a data source to an index.
Line 129: Line 130:
  
 
* '''[http://www.osoa.org/display/Main/Service+Data+Objects+Home SDO]''' - Service Data Objects are designed to simplify and unify the way in which applications handle data. Using SDO, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, web services, and enterprise information systems. The SDO programming model is language neutral.
 
* '''[http://www.osoa.org/display/Main/Service+Data+Objects+Home SDO]''' - Service Data Objects are designed to simplify and unify the way in which applications handle data. Using SDO, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, web services, and enterprise information systems. The SDO programming model is language neutral.
 +
 +
* '''Slot''' - A slot is a placeholder for input/output [[#B|buckets]] of a [[#W|worker]].
  
 
* '''SNMP''' - Simple Network Management Protocol is a network protocol which controls the communication between supervised devices and the monitoring application (e.g. [[#J|JMX]]).
 
* '''SNMP''' - Simple Network Management Protocol is a network protocol which controls the communication between supervised devices and the monitoring application (e.g. [[#J|JMX]]).
Line 139: Line 142:
  
 
== T ==
 
== T ==
 +
 +
* '''Task''' - Description of a single unit of work to be processed by a [[#W|Worker]]. A task can contain worker specific properties.
  
 
* '''[http://incubator.apache.org/tuscany/sca-overview.html Tuscany]''' - Apache Tuscany is an implementation of the [[#S|SCA]] specification 1.0. It is available for Java and C++. It also supports [[#S|SDO]] specification 2.1 for both Java and C++. Go to [[SMILA/Project Related Technologies/SCA and Tuscany|SCA and Tuscany]]  for discussing.
 
* '''[http://incubator.apache.org/tuscany/sca-overview.html Tuscany]''' - Apache Tuscany is an implementation of the [[#S|SCA]] specification 1.0. It is available for Java and C++. It also supports [[#S|SDO]] specification 2.1 for both Java and C++. Go to [[SMILA/Project Related Technologies/SCA and Tuscany|SCA and Tuscany]]  for discussing.
Line 149: Line 154:
 
== W ==
 
== W ==
  
* '''Workflow''' - see [[#P|pipeline]]  
+
* '''Worker''' - Single processing component in an asynchronous workflow. Pulls Tasks to process. Defines input/output data in a worker description.
 +
 
 +
* '''Worker Description''' - Description of a [[#W|worker's]] input/output behaviour. Contains needed input data and parameters, as well as generated output data.
 +
 
 +
* '''Workflow (asynchronous)''' - Combines several [[#W|Worker]] steps by defining preconditions for generating a Worker task, e.g. needed input data. Also defines generated output data.
 +
 
 +
* '''Workflow (synchronous/BPEL)''' - see [[#P|pipeline]]  
 +
 
 +
* '''Workflow run''' - Single traversal of a workflow.
  
 
* '''[http://www.w3.org/TR/wsdl WSDL]''' - WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.
 
* '''[http://www.w3.org/TR/wsdl WSDL]''' - WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.

Revision as of 10:31, 11 July 2011


A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z


A

  • Agent - An agent is a component of the connectivity framework that monitors a data source for changes (or reacts to events). If a change occurs (e.g. objects are created, deleted, or changed) it immediately creates a record out of the object and sends it to SMILA. Agents are used to watch data sources for modifications, not for bulk import.
  • Aperture - Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (e.g. file systems, web sites, mail boxes) and the file formats (e.g. documents, images) occurring in these systems.
  • Attachment - Attachments are parts of records used to store large binary data such as document content.
  • Attribute - Attributes are parts of records and contain simple data objects that are easily represented in XML, such as String, Integer, Float, and Date.
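The relationship between records, attributes, and attachments described above can be sketched as a small data structure. This is an illustrative Python sketch, not SMILA's Java API; the class and field names (`Record`, `record_id`, `attributes`, `attachments`) and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Union

# Simple values that are easy to represent in XML (see the Attribute entry).
AttributeValue = Union[str, int, float, date]


@dataclass
class Record:
    """Hypothetical record: small typed attributes plus large binary attachments."""
    record_id: str
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)
    attachments: Dict[str, bytes] = field(default_factory=dict)


record = Record(record_id="file:/docs/report.pdf")
record.attributes["title"] = "Quarterly report"   # simple value -> attribute
record.attributes["pages"] = 3
record.attachments["content"] = b"%PDF"           # binary content -> attachment
```

Keeping the binary content in a separate mapping mirrors the glossary's split between small, XML-friendly values and large binary data.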

B

  • Blackboard or blackboard service - The blackboard service manages SMILA records during processing in a SMILA component (connectivity, workflow processor). In addition it hides the handling of record persistence from these components. For a complete description see Usage of Blackboard Service.
  • BPEL - BPEL is an XML-based language defining several constructs to write business processes. It defines a set of basic control structures like conditions or loops as well as elements to invoke web services and receive messages from services. It relies on WSDL to express web services interfaces. Message structures can be manipulated, assigning parts or the whole of them to variables that can in turn be used to send other messages.
  • Bucket - Data container in an asynchronous workflow, containing logically grouped data objects. Can be transient for interim data, which means that data is not persisted and removal of data is under job management control, or persistent, which means that removal of data is not under job management control.
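The difference between transient and persistent buckets can be illustrated with a minimal Python sketch. The `Bucket` class, the `cleanup_after_job_run` function, and the bucket names are hypothetical; the glossary only states that removal of transient data is under job management control, so the moment of cleanup shown here is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bucket:
    """Hypothetical bucket: a named container of logically grouped data objects."""
    name: str
    persistent: bool  # False means transient (interim data)
    data_objects: List[str] = field(default_factory=list)


def cleanup_after_job_run(buckets: List[Bucket]) -> None:
    """Assumed job-management step: only transient buckets are emptied."""
    for bucket in buckets:
        if not bucket.persistent:
            bucket.data_objects.clear()


interim = Bucket("crawledRecords", persistent=False, data_objects=["bulk-1", "bulk-2"])
final = Bucket("indexedRecords", persistent=True, data_objects=["bulk-3"])
cleanup_after_job_run([interim, final])
```

After the cleanup, the transient bucket is empty while the persistent bucket still holds its data object.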

C

  • Connectivity framework
  • Crawler - A crawler is a component of the connectivity framework that iterates over the elements (e.g. files) of a data source, creates records for all elements and sends them to SMILA (e.g. FileSystemCrawler or WebCrawler). In general crawlers are used for initial (bulk) import of data sources.
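A crawler's behaviour (iterate over all elements of a data source and create one record per element) might look roughly like the following Python sketch. It uses an in-memory dictionary in place of a real file system; the `crawl` function and the record layout are assumptions for illustration and do not reflect the actual FileSystemCrawler implementation.

```python
from typing import Dict, Iterator


def crawl(data_source: Dict[str, bytes], source_id: str) -> Iterator[dict]:
    """Hypothetical crawler: visit every element once and emit one record each."""
    for path, content in sorted(data_source.items()):
        yield {
            "id": f"{source_id}:{path}",
            "attributes": {"path": path, "size": len(content)},
            "attachments": {"content": content},
        }


files = {"a.txt": b"hello", "b.txt": b"hi"}
records = list(crawl(files, "demo-fs"))
```

An agent, by contrast, would not iterate over the whole source but would emit a record only when it observes a change.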

D

  • Data Object - The smallest unit of data handled by an asynchronous workflow (e.g. a bulk).
  • Delta indexing - Delta indexing is also known as incremental or generation based indexing.
  • DFP - The Data Flow Process is a set of processing steps. These steps are described in the Data Flow Process Description (DFPD) and cover the following aspects:
    • Storage descriptions
    • Extraction of messages from the queue
    • Process based information handling (e.g. splitting, routing, ...)
    • Data annotation through BPEL
  • DFPD - The Data Flow Process Description is a set of process related configuration files. Files in this set are optional. The following components are contained in the DFPD:
    • Source/target references (e.g. queue)
    • References to different storages or collections
    • BPEL (change and delete process in several files organized in system/data processes)

E

  • Eclipse - Eclipse is an open source community, whose projects are focused on building an open development platform comprised of extensible frameworks, tools and runtimes for building, deploying and managing software across the lifecycle.
  • EILF - EILF (Enterprise Information Logistics Framework) was the originally proposed name of SMILA. Because the abbreviation was difficult to pronounce, the community did not accept it, and the name was changed to SMILA.
  • Equinox - Equinox is a base technology from Eclipse implementing the OSGi specification. Besides delivering a high-performance class loading mechanism, Equinox also provides an environment for managing component dependencies.

F

G

H

I

  • ID - An ID identifies a record in SMILA and is part of a record. IDs are complex objects, aggregated of various keys (data source IDs, object IDs within the data source, element and/or fragment IDs).
  • IRM - Abbreviation of Information Reference Model
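The idea of an ID as a complex object aggregated from several keys can be sketched as an immutable value type. This Python sketch is illustrative only; the class name `RecordId` and its field names are assumptions, not SMILA's actual ID class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordId:
    """Hypothetical composite ID built from several keys."""
    source: str                     # data source ID
    key: str                        # object ID within the data source
    element: Optional[str] = None   # optional element inside the object
    fragment: Optional[str] = None  # optional fragment of the element


mail = RecordId(source="mailbox", key="msg-17")
attachment = RecordId(source="mailbox", key="msg-17", element="attachment-1")
same_mail = RecordId(source="mailbox", key="msg-17")
```

Because the dataclass is frozen, two IDs with the same keys compare equal and hash identically, so they can serve as dictionary keys or set members.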

J

  • JMS - Java Message Service is the Sun API for exchanging messages between two or more clients. Using JMS requires a JMS provider (such as Apache ActiveMQ).
  • JMX - Java Management Extensions is a specification for administering and monitoring Java applications.
  • Job - A Job combines several Worker operations in one logical block by referencing an asynchronous workflow.
  • Job Run - A Job Run is an "instance" of a Job, for example one run of an import of a data source to an index.
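The distinction between a Job (a definition that references a workflow) and a Job Run (one instance of that definition) can be sketched as follows. This is a Python illustration under stated assumptions: the class names, the `parameters` field, the `state` values, and the counter-based run IDs are all hypothetical and not taken from SMILA's job management API.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Job:
    """Hypothetical job definition: a name, a referenced workflow, parameters."""
    name: str
    workflow: str
    parameters: Dict[str, str] = field(default_factory=dict)


_run_counter = itertools.count(1)  # assumed source of unique run IDs


@dataclass
class JobRun:
    """Hypothetical job run: one execution of a job definition."""
    job: Job
    run_id: int = field(default_factory=lambda: next(_run_counter))
    state: str = "RUNNING"


import_job = Job("importDocs", workflow="crawlAndIndex", parameters={"index": "docs"})
first_run = JobRun(import_job)
second_run = JobRun(import_job)
```

Both runs share the same job definition but carry their own run ID and state, which matches the glossary's example of repeated imports of one data source into an index.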

K

L

M

N

O

  • ODE - Apache ODE (Orchestration Director Engine) executes business processes following the BPEL/WS-BPEL standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.
  • OSGi - The OSGi specification is about managing a component based software system. It defines an in-VM Service Oriented Architecture (SOA) for networked systems. An OSGi Service Platform provides a standardized, component-oriented computing environment for cooperating networked services. This architecture significantly reduces the overall complexity of building, maintaining, and deploying applications.

P

  • Pipelet - A pipelet is a reusable component (POJO) in a BPEL workflow used to process data contained in records. See Pipelets for details.
  • Pipeline - A pipeline is the definition of a BPEL process (or workflow) that orchestrates pipelets and other BPEL services (e.g. web services).
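As a rough illustration of a pipeline orchestrating pipelets, the Python sketch below chains two record-processing functions in sequence. It is a deliberate simplification: a real SMILA pipeline is a BPEL process and pipelets are Java components, whereas here pipelets are plain functions and `run_pipeline`, `lowercase_title`, and `add_language` are hypothetical names.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Pipelet = Callable[[Record], Record]


def lowercase_title(record: Record) -> Record:
    """Hypothetical pipelet: normalise the title attribute."""
    return {**record, "title": record["title"].lower()}


def add_language(record: Record) -> Record:
    """Hypothetical pipelet: annotate the record with a fixed language."""
    return {**record, "language": "en"}


def run_pipeline(pipelets: List[Pipelet], record: Record) -> Record:
    """Invoke each pipelet in order, passing the record along."""
    for pipelet in pipelets:
        record = pipelet(record)
    return record


result = run_pipeline([lowercase_title, add_language], {"title": "SMILA Glossary"})
```

Each pipelet returns a new record rather than mutating its input, which keeps the individual steps reusable in other pipelines.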

Q

R

S

  • SCA - Service Component Architecture is a set of specifications which describe a model for building applications and systems using a service-oriented architecture. SCA extends and complements prior approaches to implementing services, and SCA builds on open standards such as web services. The SCA programming model is highly extensible and is language-neutral. See SCA and Tuscany for discussion.
  • SDO - Service Data Objects are designed to simplify and unify the way in which applications handle data. Using SDO, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, web services, and enterprise information systems. The SDO programming model is language neutral.
  • Slot - A slot is a placeholder for input/output buckets of a worker.
  • SNMP - Simple Network Management Protocol is a network protocol which controls the communication between supervised devices and the monitoring application (e.g. JMX).
  • SOA - Service-Oriented Architecture is a computer systems architectural style for creating and using business processes, packaged as services, throughout their lifecycle. SOA also defines and provisions the IT infrastructure to allow different applications to exchange data and participate in business processes. These functions are loosely coupled with the operating systems and programming languages underlying the applications.
  • Surrogate process - A surrogate process is a process that embeds several components. Additionally this process adds further functionality to these components (e.g. runtime functionality, error prevention, transactions, manageability ...). In the SMILA application surrogate processes also add business processes and further features (e.g. callability from external processes or applications...).
  • STP - SOA Tools Platform is an Eclipse open source project that builds frameworks and exemplary extensible tools that enable the design, configuration, assembly, deployment, monitoring, and management of software designed around a Service-Oriented Architecture (SOA). An interesting subproject is the SCA Composite Designer.

T

  • Task - Description of a single unit of work to be processed by a Worker. A task can contain worker specific properties.
  • Tuscany - Apache Tuscany is an implementation of the SCA specification 1.0. It is available for Java and C++. It also supports SDO specification 2.1 for both Java and C++. See SCA and Tuscany for discussion.

U

V

W

  • Worker - Single processing component in an asynchronous workflow. Pulls Tasks to process. Defines input/output data in a worker description.
  • Worker Description - Description of a worker's input/output behaviour. Contains needed input data and parameters, as well as generated output data.
  • Workflow (asynchronous) - Combines several Worker steps by defining preconditions for generating a Worker task, e.g. needed input data. Also defines generated output data.
  • Workflow (synchronous/BPEL) - see pipeline
  • Workflow run - Single traversal of a workflow.
  • WSDL - WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.
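The interplay of workers, tasks, and buckets in an asynchronous workflow can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Task` and `Worker` classes, the per-worker task queue, and the rule of generating one task per data object in the input bucket are not taken from SMILA's implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class Task:
    """Hypothetical task: one unit of work for a named worker."""
    worker: str
    input_object: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Worker:
    """Hypothetical worker that pulls tasks instead of having them pushed."""
    name: str
    process: Callable[[str], str]

    def pull_and_process(self, queue: Deque[Task], output_bucket: List[str]) -> bool:
        """Pull the next task, if any, and write the result to the output bucket."""
        if not queue:
            return False
        task = queue.popleft()
        output_bucket.append(self.process(task.input_object))
        return True


input_bucket = ["bulk-1", "bulk-2"]
output_bucket: List[str] = []

# Assumed workflow precondition: one task per data object in the input bucket.
tasks: Deque[Task] = deque(Task("indexer", obj) for obj in input_bucket)

indexer = Worker("indexer", process=lambda obj: f"indexed({obj})")
while indexer.pull_and_process(tasks, output_bucket):
    pass
```

The two lists stand in for the buckets bound to the worker's input and output slots; the worker itself only knows how to process one data object per task.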

X

Y

Z