Difference between revisions of "SMILA/Documentation/ConnectivityManager"

Revision as of 06:45, 9 April 2009

Overview

The Connectivity Manager is the single point of entry for data in SMILA. Its functionality is divided into several Sub-Components for better modularization. The Connectivity Manager and its Sub-Components are all implemented as Java OSGi services.

API

public interface ConnectivityManager {
  /**
   * Put the given records into the ADD Queue for further processing.
   * 
   * @param records
   *          a list of Record objects
   * 
   * @return the number of records successfully added to the ADD Queue
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  int add(Record[] records) throws ConnectivityException;
 
  /**
   * Put the given ids into the DELETE Queue for deletion from the system.
   * 
   * @param ids
   *          a list of IDs to delete
   * 
   * @return the number of ids successfully added to the DELETE Queue
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  int delete(Id[] ids) throws ConnectivityException;
 
  /**
   * Initializes a DeltaIndexing run for the given dataSourceId.
   * 
   * @param dataSourceId
   *          the ID of the data source
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void initDeltaIndexingByDataSourceId(String dataSourceId) throws ConnectivityException;
 
  /**
   * Initializes a DeltaIndexing run for the given compoundId. This method is used by Agents only.
   * 
   * @param compoundId
   *          the Id of the compound object
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void initDeltaIndexingByCompoundId(Id compoundId) throws ConnectivityException;
 
  /**
   * Finishes a DeltaIndexing run for the given dataSourceId.
   * 
   * @param dataSourceId
   *          the ID of the data source
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void finishDeltaIndexingByDataSourceId(String dataSourceId) throws ConnectivityException;
 
  /**
   * Finishes a DeltaIndexing run for the given compoundId. This method is used by Agents only.
   * 
   * @param compoundId
   *          the Id of the compound object
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void finishDeltaIndexingByCompoundId(Id compoundId) throws ConnectivityException;
 
  /**
   * Checks, for each DeltaIndexing record, whether it needs to be updated in the system, i.e. whether the data is new
   * or has changed. The positions in the returned array of boolean values match those of the incoming Records.
   * 
   * @param diData
   *          a list of Records containing DeltaIndexing data to check
   * 
   * @return a list of boolean values
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  boolean[] checkForUpdate(Record[] diData) throws ConnectivityException;
 
  /**
   * Initiates a delta delete for the given dataSourceID. All DeltaIndexing records belonging to the dataSourceID, that
   * were not marked as visited during this DeltaIndexing run are removed.
   * 
   * @param dataSourceId
   *          the ID of the data source
   * 
   * @return the number of records marked for delta-delete and successfully added to the delete Queue
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  int deleteDeltaByDataSourceId(String dataSourceId) throws ConnectivityException;
 
  /**
   * Initiates a delta delete for the given compound object ID and all its elements. All DeltaIndexing records
   * belonging to this compound object, that were not marked as visited are removed. This method is used by Agents only.
   * 
   * @param compoundId
   *          the ID of the compound object
   * 
   * @return the number of records marked for delta-delete and successfully added to the delete Queue
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  int deleteDeltaByCompoundId(Id compoundId) throws ConnectivityException;
 
  /**
   * Administrative reset method. It removes all DeltaIndexing information.
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void clearDeltaIndexing() throws ConnectivityException;
 
  /**
   * Administrative reset method. It removes all DeltaIndexing information for the given dataSourceID.
   * 
   * @param dataSourceId
   *          the ID of the data source
   * 
   * @throws ConnectivityException
   *           if any error occurs
   */
  void clearDeltaIndexingByDataSourceId(String dataSourceId) throws ConnectivityException;
}
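
The positional contract of checkForUpdate(Record[]) (the flag at index i belongs to the record at index i) can be illustrated with a small helper. This is only a sketch: UpdateFilter and selectFlagged are hypothetical names that do not exist in SMILA, and the helper is generic so that any element type can stand in for Record.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of SMILA: pairs each item with the flag at the same index. */
final class UpdateFilter {
  private UpdateFilter() {
  }

  /**
   * Returns the items whose flag is true, preserving input order.
   * The flags array is assumed to come from a call such as checkForUpdate(...).
   */
  static <T> List<T> selectFlagged(T[] items, boolean[] flags) {
    if (items.length != flags.length) {
      throw new IllegalArgumentException("items and flags must have the same length");
    }
    List<T> selected = new ArrayList<>();
    for (int i = 0; i < items.length; i++) {
      if (flags[i]) {
        selected.add(items[i]);
      }
    }
    return selected;
  }
}
```

A caller could pass the result of checkForUpdate as flags and then hand only the selected records to add.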

Implementations

It is possible to provide different implementations for the ConnectivityManager interface. At the moment there is one implementation available.

org.eclipse.smila.connectivity.impl

This bundle contains the default implementation of the ConnectivityManager interface. The following methods are not implemented yet, as neither Agents nor CompoundManagement is implemented:

  • initDeltaIndexingByCompoundId(final Id compoundId)
  • finishDeltaIndexingByCompoundId(final Id compoundId)
  • deleteDeltaByCompoundId(final Id compoundId)

The ConnectivityManagerImpl contains the core execution logic as it does the actual processing of the incoming requests. Incoming Record objects are split into different parts:

  • metadata (record attributes) is stored via the BlackboardService in the RecordStorage
  • attachments are stored via the BlackboardService in the BinaryStorage
  • a message object containing the record Id and, optionally, additional metadata is added to a Queue
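
The three-way split described above can be sketched as follows. This is an illustration under strong assumptions, not the actual ConnectivityManagerImpl: RecordSplitSketch and SimpleRecord are hypothetical types, plain in-memory maps stand in for the RecordStorage and BinaryStorage, and a queue message is reduced to the record id.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-ins for illustration only; none of these types are SMILA classes. */
final class RecordSplitSketch {
  /** Simplified record: an id, metadata attributes and binary attachments. */
  static final class SimpleRecord {
    final String id;
    final Map<String, String> attributes;
    final Map<String, byte[]> attachments;

    SimpleRecord(String id, Map<String, String> attributes, Map<String, byte[]> attachments) {
      this.id = id;
      this.attributes = attributes;
      this.attachments = attachments;
    }
  }

  // In-memory maps standing in for the RecordStorage and the BinaryStorage.
  final Map<String, Map<String, String>> recordStorage = new HashMap<>();
  final Map<String, Map<String, byte[]>> binaryStorage = new HashMap<>();
  // Queue of lightweight messages; here a message is just the record id.
  final Queue<String> addQueue = new ArrayDeque<>();

  /** Splits each record into its three parts and returns the number of queued messages. */
  int add(SimpleRecord... records) {
    int queued = 0;
    for (SimpleRecord record : records) {
      recordStorage.put(record.id, new HashMap<>(record.attributes));
      binaryStorage.put(record.id, new HashMap<>(record.attachments));
      addQueue.add(record.id);
      queued++;
    }
    return queued;
  }
}
```

Keeping the queue message small means that consumers of the Queue fetch attributes and attachments from the storages by id when they need them.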

This chart shows the Connectivity Manager implementation, its Sub-Components and their relationship to other components: ConnectivityManager.png


Sub-Components

Router

The Router routes messages to the Queue(s) according to its configuration. See SMILA/Documentation/QueueWorker for more information.

Delta Indexing Manager

The Delta Indexing Manager stores information about the last modification of each record and can determine whether a record has changed since it was last processed. See SMILA/Documentation/DeltaIndexingManager for more information.
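
The change detection can be pictured with a minimal in-memory sketch. It assumes that "last modification" is tracked as a hash token per record id, as in the workflow example on this page. DeltaIndexSketch and its method names are hypothetical and do not reflect the real Delta Indexing Manager API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory sketch; the real Delta Indexing Manager is documented separately. */
final class DeltaIndexSketch {
  // Last known hash token per record id.
  private final Map<String, String> hashes = new HashMap<>();
  // Ids seen during the current run.
  private final Set<String> visited = new HashSet<>();

  /** Starts a run by clearing all visited flags. */
  void init() {
    visited.clear();
  }

  /** Returns true if the record is new or its hash changed; unchanged records are marked visited. */
  boolean checkForUpdate(String id, String hash) {
    if (hash.equals(hashes.get(id))) {
      visited.add(id);
      return false;
    }
    return true;
  }

  /** Stores the new hash and marks the record visited, as would happen after a successful add. */
  void visit(String id, String hash) {
    hashes.put(id, hash);
    visited.add(id);
  }

  /** Removes and returns every id that was not visited during the current run. */
  List<String> obsoleteIds() {
    List<String> obsolete = new ArrayList<>();
    for (String id : hashes.keySet()) {
      if (!visited.contains(id)) {
        obsolete.add(id);
      }
    }
    hashes.keySet().removeAll(obsolete);
    return obsolete;
  }
}
```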

Buffer

Not yet implemented.


Configuration

There are no configuration options available for this bundle.


Workflow Example

As an example, here is the complete workflow of how the ConnectivityManager's API is used by the CrawlerController:

  • the CrawlerController initializes an import by calling initDeltaIndexingByDataSourceId(String). Internally, the Delta Indexing Manager resets all visited flags for the specified data source and locks that data source, so that no other process is allowed to initialize it again
  • for each Record (containing only ID and hash token) that the CrawlerController receives from the Crawler, it asks the ConnectivityManager (which internally forwards the request to the Delta Indexing Manager) whether the Record needs to be added or updated, using the method checkForUpdate(Record[])
    • false: the Delta Indexing Manager marks the Record as visited (this is done during the checkForUpdate(Record[]) call)
    • true: the Record (now containing all data, including attributes and attachments) is added via the method add(Record[])
      • attributes and attachments are stored in the Storages via the Blackboard
      • Buffer logic is executed (t.b.d.)
      • the Router creates an "add" message and sends it to the appropriate Queue
      • the record is marked as visited by the Delta Indexing Manager
  • after the iteration has finished, the CrawlerController executes deleteDeltaByDataSourceId(String), which is forwarded to the Delta Indexing Manager. It determines all records that are not marked as visited and returns them to the ConnectivityManager.
    • For each record to be deleted, the ConnectivityManager calls the method delete(Id[]) on itself.
      • Buffer logic is executed (t.b.d.)
      • the Router creates a "delete" message and sends it to the appropriate Queue
      • the record is removed from the Delta Indexing Manager store
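
The call sequence above, seen from the controller's side, can be sketched as follows. The sketch makes several simplifying assumptions: ConnectivitySketch is a hypothetical, reduced stand-in for the ConnectivityManager interface, record ids are plain strings instead of Record and Id objects, exceptions are omitted, and RecordingConnectivity is a test double that only logs calls. Because the steps above do not mention finishDeltaIndexingByDataSourceId(String), the sketch leaves it out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified view of the ConnectivityManager calls used in the workflow. */
interface ConnectivitySketch {
  void initDeltaIndexingByDataSourceId(String dataSourceId);

  boolean[] checkForUpdate(String[] recordIds);

  int add(String[] recordIds);

  int deleteDeltaByDataSourceId(String dataSourceId);
}

/** Hypothetical controller loop following the steps listed above. */
final class CrawlRunSketch {
  private CrawlRunSketch() {
  }

  /** Runs one import and returns the number of records queued for deletion. */
  static int runImport(ConnectivitySketch cm, String dataSourceId, String[] crawledIds) {
    cm.initDeltaIndexingByDataSourceId(dataSourceId);
    for (String id : crawledIds) {
      boolean[] needsUpdate = cm.checkForUpdate(new String[] {id});
      if (needsUpdate[0]) {
        // A real crawler would load attributes and attachments before this call.
        cm.add(new String[] {id});
      }
    }
    return cm.deleteDeltaByDataSourceId(dataSourceId);
  }
}

/** Recording fake: answers from a fixed set of "changed" ids and logs every call. */
final class RecordingConnectivity implements ConnectivitySketch {
  final List<String> calls = new ArrayList<>();
  private final Set<String> changedIds;

  RecordingConnectivity(Set<String> changedIds) {
    this.changedIds = changedIds;
  }

  @Override
  public void initDeltaIndexingByDataSourceId(String dataSourceId) {
    calls.add("init:" + dataSourceId);
  }

  @Override
  public boolean[] checkForUpdate(String[] recordIds) {
    boolean[] result = new boolean[recordIds.length];
    for (int i = 0; i < recordIds.length; i++) {
      calls.add("check:" + recordIds[i]);
      result[i] = changedIds.contains(recordIds[i]);
    }
    return result;
  }

  @Override
  public int add(String[] recordIds) {
    for (String id : recordIds) {
      calls.add("add:" + id);
    }
    return recordIds.length;
  }

  @Override
  public int deleteDeltaByDataSourceId(String dataSourceId) {
    calls.add("deleteDelta:" + dataSourceId);
    return 0; // the fake never has stale records
  }
}
```

Running runImport against the fake with the crawled ids "a" and "b", of which only "b" is reported as changed, logs one init call, two check calls, one add call for "b" and a final delta delete.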
