COSMOS Design 197521

Design Discussion for 197521: component implementation - separating framework vs. extension code

Separation of Concerns for COSMOS DC framework

This design document addresses COSMOS Bugzilla enhancement request 197521.

Change History:

Joel Hawkins 9/05/2007 Initial version


Currently, the DC framework uses the Java stack, and threading issues related to query response management are delegated to response components. The proposed approach constructs call-graph trees during the loading/compilation phase, and the framework chooses the proper tree for execution. Root components are responsible only for dispatching to the framework. They need no knowledge of other components (no more wiring within the component).

Implementation Stages and Corporate Use Cases

Initial implementation and conversion of existing components/assemblies for iteration 6.


The terms and acronyms below are used throughout this document:

ASSEMBLY: A collection of components that are used by the framework to either route data for storage or perform queries on stored data.

ASSEMBLY DEFINITION: An XML document that describes an ASSEMBLY.

ASSEMBLY DESCRIPTOR: The in-memory representation of an assembly definition.

ASSEMBLY COMPILER: A utility class that converts an assembly descriptor into an executable entity.

CONTEXT: The runtime representation of an assembly.

COMPONENT: A discrete element in an assembly. Components are external to the framework. Components come in five flavors - Sources, Sinks, Transforms, Filters, and Responses.

RESPONSE COMPONENT: A tagging component that enables the framework to manage query response sets that have been filtered and/or transformed by the framework. The response component has very little processing content in terms of implementation, but plays a crucial role in enabling the framework to process query requests.

RUNTIME: The host environment for assemblies.

OPTIMIZATION PROXY: A dynamic component inserted by the framework to allow components to be bypassed in certain scenarios.

VECTOR PROXY: A dynamic component inserted by the framework to allow components that match in type but not in cardinality to interact. See LINK to 197525 for a discussion of how buffering is supported in the framework.

DYNAMIC PROGRAMMING: An optimization technique useful for computing optimal paths through acyclic directed graphs based on a 'cost' concept.

Use Cases

Direct component access.

Branched query

Branched collection

External Interfaces

Framework Interface

public abstract interface IDataCollectionContext {
	public static final String NAMESPACE = "http://www.eclipse.org/xmlns/cosmos/1.0/Context";

	public static final String COSMOS_NAMESPACE = "http://www.eclipse.org/xmlns/cosmos/1.0";
	public static QName CONTEXT_QNAME = new QName(COSMOS_NAMESPACE,"context");
	public static QName NAME_QNAME = new QName(COSMOS_NAMESPACE,"name");
	public static QName DIRECTION_QNAME = new QName(COSMOS_NAMESPACE,"direction");
	public static QName OPTIMIZABLE_QNAME = new QName(COSMOS_NAMESPACE,"optimizable");
	String getName();
	void materialize() throws Exception;

	boolean isQueryContext();

	boolean isCollectionContext();
	void flush();

	public IDataSourceService getDataSource();
	public IDataQueryService getDataQuery();
	void dispatch(Object obj, IDataSourceService dispatcher) throws RuntimeException;
	void dispatch(Object obj, String response, IDataQueryService dispatcher) throws RuntimeException;

	IDataQueryResult getQueryResult();

	boolean isSupportedQueryResponse(String response);
	void addQueryResponseType(String type);
	String[] getQueryResponseTypes();
	void addQueryResult(Object obj);
}

Component Interfaces

Intermediate component interfaces reduced to tagging interfaces.

public interface IDataFilterService {
	static QName FILTER_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"filter");
}

public interface IDataTransformService {
	public static QName TRANSFORM_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"transform");
}

Source, Query and Sink interfaces reduced to capability declarations and context access methods.

public interface IDataQueryService {
	public static QName QUERY_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"query");
	public static String PREFIX = "query";
	public static String NAMESPACE_URI = "http://cosmos.eclipse.org/capabilities/query";
	public static String SUPPORTED_DIALECTS_URI = NAMESPACE_URI+"/getSupportedDialects";
	public static String SUPPORTED_RESPONSES_URI = NAMESPACE_URI+"/getSupportedResponses";
	public static String SUPPORTED_QUERY_URI = NAMESPACE_URI+"/supportedQuery";
	public static String QUERY_URI = NAMESPACE_URI+"/query";
	public static String PAGE_QUERY_URI = NAMESPACE_URI+"/pageQuery";
    public static final QName DIALECTS_QNAME = 
        new QName(NAMESPACE_URI, "getSupportedDialects", PREFIX);

	String[] getSupportedDialects();
	String[] getSupportedResponses();
	boolean supportedQuery(String dialect, String response);	
	IDataQueryResult query(String dialect, String response, String queryString, String dataSource) throws Exception;
	IDataQueryResult pageQuery(String dialect, String response, String queryString, String dataSource, int max, int start) throws Exception;

	void setContext(IDataCollectionContext context);
}

public interface IDataSinkService {
	public static QName SINK_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"sink");
	static QName DATA_FLOW_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"dataflow");
	static QName DATA_SOURCE_TYPE_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"type");
	static QName DATA_KEY_SET_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"keyset");
	static QName DATA_KEY_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"key");
	public DimensionSet getDimensionSet(); //Needs revisiting based on DataBroker interaction requirements
	public void setDataSet(DataSet ds); //Needs revisiting based on DataBroker interaction requirements
}

public interface IDataSourceService {
	static QName SOURCE_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"source");
	static QName FACTORY_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"factory");
	static QName DATA_SOURCE_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"datasource");
	/*
	 * The following 6 methods are exposed as management capabilities 
	 * by the abstract implementation of this interface. See 
	 * org.eclipse.cosmos.dc.common.api.impl.AbstractDataSource
	 */
	boolean connect() throws Exception;
	boolean run() throws Exception;
	boolean disconnect() throws Exception;
	boolean cancel() throws Exception;
	boolean pause() throws Exception;
	boolean resume() throws Exception;
	DataSource getDataSource(); //Needs revisiting based on DataBroker interaction requirements
	void setContext(IDataCollectionContext context);
}

The response service provides static information to the framework so that the framework can manage response collections, removing that responsibility from the component level.

public interface IDataResponseService {
	public static QName RESPONSE_QNAME = new QName(IDataCollectionContext.COSMOS_NAMESPACE,"response");
	public String[] getResponseTypes();
	public Class getClassForType(String type);
	void setContext(IDataCollectionContext context);
}
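
To illustrate how static type information from a response service could let the framework, rather than the component, manage response collections, here is a minimal sketch. The ResponseService interface is a trimmed, hypothetical copy of IDataResponseService (the QName constant and setContext are omitted so the example compiles on its own), and SimpleResponse, its "text" and "count" type names, and the accepts helper are all assumptions made for illustration, not part of the COSMOS code base.

```java
import java.util.HashMap;
import java.util.Map;

final class ResponseServiceSketch {
    // Trimmed, hypothetical stand-in for IDataResponseService.
    interface ResponseService {
        String[] getResponseTypes();
        Class<?> getClassForType(String type);
    }

    // Hypothetical response component: it only declares types and tagging methods.
    static final class SimpleResponse implements ResponseService {
        private final Map<String, Class<?>> types = new HashMap<>();

        SimpleResponse() {
            types.put("text", String.class);    // assumed response type name
            types.put("count", Integer.class);  // assumed response type name
        }

        public String[] getResponseTypes() {
            return types.keySet().toArray(new String[0]);
        }

        public Class<?> getClassForType(String type) {
            return types.get(type);
        }

        // "response*" methods follow the naming convention; bodies stay empty
        // because collecting results is left to the framework in this sketch.
        public void responseText(String value) { }
        public void responseCount(Integer value) { }
    }

    // Framework-side check: does a result object match the declared class
    // for the requested response type?
    static boolean accepts(ResponseService service, String type, Object result) {
        Class<?> declared = service.getClassForType(type);
        return declared != null && declared.isInstance(result);
    }

    public static void main(String[] args) {
        ResponseService service = new SimpleResponse();
        System.out.println(accepts(service, "text", "hello"));
        System.out.println(accepts(service, "count", "hello"));
    }
}
```

In this sketch the component holds no result state at all, which is the property the design is after: the framework can decide per request where results of a given type are collected.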


Naming conventions used by reflection during compilation.

"public void dispatch*(type)" for root components (query and source). These methods are invoked by the component to dispatch data into the assembly, and are introspected during assembly compilation to construct call graphs.

"public type filter*(type)" for filter components

"public type transform*(type)" for transform components

"public void store*(type)" for sink components

"public void response*(type)" for response components

Methods must be declared by component class.
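
The conventions above lend themselves to a simple reflective scan. The following is a minimal sketch of one way such a scan could work, not the actual compiler code: UpperCaseFilter and findConventionMethods are hypothetical names, and the scan uses getDeclaredMethods() so that only methods declared by the component class itself are considered.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class ConventionScanSketch {
    // Hypothetical filter component following "public type filter*(type)".
    static class UpperCaseFilter {
        public String filterName(String in) { return in.toUpperCase(); }
        public String helper(String in) { return in; }   // ignored: wrong prefix
        String filterHidden(String in) { return in; }    // ignored: not public
    }

    // Returns public, single-argument methods declared directly on the class
    // whose names start with the given convention prefix.
    static List<Method> findConventionMethods(Class<?> component, String prefix) {
        List<Method> found = new ArrayList<Method>();
        for (Method m : component.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers())
                    && m.getName().startsWith(prefix)
                    && m.getParameterTypes().length == 1) {
                found.add(m);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (Method m : findConventionMethods(UpperCaseFilter.class, "filter")) {
            System.out.println(m.getName() + "(" + m.getParameterTypes()[0].getSimpleName() + ")");
        }
    }
}
```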

For array types and collections, the types are matched by the base type. Collections must be declared using generics in order to support type-safe handling. See LINK TO 197525 for a discussion of how collections and arrays are handled by the framework.
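
As an illustration of why generics matter here, the sketch below shows how a base type could be recovered by reflection from a scalar, array, or generic collection parameter. It is an assumption about how such matching might be done, not the framework's implementation; Sample, baseType, and baseTypeOf are hypothetical names. A raw (non-generic) collection parameter would fall through to the plain-class case and report the collection class itself, which is why the element type cannot be matched without generics.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Collection;
import java.util.List;

final class BaseTypeSketch {
    // Hypothetical sink component with scalar, array, and collection methods.
    static class Sample {
        public void storeOne(String s) { }
        public void storeArray(String[] s) { }
        public void storeList(List<String> s) { }
    }

    // Resolves the base type of a method's first parameter:
    // String -> String, String[] -> String, List<String> -> String.
    static Class<?> baseType(Method m) {
        Type t = m.getGenericParameterTypes()[0];
        if (t instanceof Class<?>) {
            Class<?> c = (Class<?>) t;
            return c.isArray() ? c.getComponentType() : c;
        }
        if (t instanceof ParameterizedType) {
            ParameterizedType p = (ParameterizedType) t;
            Type raw = p.getRawType();
            Type arg = p.getActualTypeArguments()[0];
            if (raw instanceof Class<?>
                    && Collection.class.isAssignableFrom((Class<?>) raw)
                    && arg instanceof Class<?>) {
                return (Class<?>) arg;
            }
        }
        throw new IllegalArgumentException("unsupported parameter type: " + t);
    }

    // Convenience lookup on the sample component.
    static Class<?> baseTypeOf(String methodName, Class<?> paramType) {
        try {
            return baseType(Sample.class.getMethod(methodName, paramType));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(baseTypeOf("storeList", List.class).getSimpleName());
    }
}
```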

Framework Implementation Details


  • Assemblies are singletons.
  • Components are thread safe.

Discussion of compilation steps:

  • Runtime loads assembly: parses assembly definition.
  • Runtime verifies that assembly with same identifier is not active
  • Runtime materializes assembly
    • creates context instance for assembly
    • compiles context using ContextCompiler
  • Compiler creates component tree
  • Compiler extracts dispatch methods from root component
  • Compiler computes possible call graphs through assembly based on method signatures and conventions. Computation includes insertion of proxies where appropriate (either for vector considerations or optimization possibilities).
  • If root is query, compiler duplicates root call graphs for each supported response type, adjusting costs of responses to allow optimization step to compute best path for each supported response type.
  • Compiler computes optimum path for each root dispatch method, culling suboptimal paths using a top-down dynamic programming algorithm. Culling is performed across all child components, choosing only the best path.
  • If root is query, remove all duplicated paths that did not fathom.
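
The path selection step above can be sketched as a memoized, top-down cost computation over an acyclic call graph. This is a minimal illustration under stated assumptions: the Node class, the integer costs, and the example graph (a source that can reach a sink either through a filter or through a more expensive vector proxy) are all hypothetical, and the real compiler's cost model is not described in this document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CallGraphCostSketch {
    // Hypothetical call-graph node: a component (or proxy) with a cost.
    static final class Node {
        final String name;
        final int cost;
        final List<Node> children = new ArrayList<Node>();

        Node(String name, int cost) { this.name = name; this.cost = cost; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Top-down dynamic programming: cheapest cost from this node to a leaf,
    // memoized so shared sub-graphs are evaluated once.
    static int bestCost(Node node, Map<Node, Integer> memo) {
        Integer cached = memo.get(node);
        if (cached != null) return cached;
        int bestChild = 0;
        if (!node.children.isEmpty()) {
            bestChild = Integer.MAX_VALUE;
            for (Node c : node.children) {
                bestChild = Math.min(bestChild, bestCost(c, memo));
            }
        }
        int total = node.cost + bestChild;
        memo.put(node, total);
        return total;
    }

    // Follows the cheapest child at every level, culling the other branches.
    static List<String> bestPath(Node root) {
        Map<Node, Integer> memo = new HashMap<Node, Integer>();
        bestCost(root, memo);
        List<String> path = new ArrayList<String>();
        Node cur = root;
        while (true) {
            path.add(cur.name);
            if (cur.children.isEmpty()) break;
            Node best = null;
            for (Node c : cur.children) {
                if (best == null || memo.get(c) < memo.get(best)) best = c;
            }
            cur = best;
        }
        return path;
    }

    // Assumed example: the proxy branch costs more, so it is culled.
    static Node example() {
        Node sink = new Node("sink", 1);
        Node filter = new Node("filter", 2).add(sink);
        Node proxy = new Node("vectorProxy", 5).add(sink);
        return new Node("source", 0).add(filter).add(proxy);
    }

    public static void main(String[] args) {
        System.out.println(bestPath(example()));
    }
}
```

For a query root, the same computation would be repeated once per supported response type with adjusted response costs, keeping one best path per type.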

Discussion of runtime steps



  • ThreadLocal usage
    • localMethod
    • localQueryResult
    • localQueryCollection
    • localQueryCollectionClass
  • Response management
    • Native response type
    • Assembled response types
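
The ThreadLocal usage listed above can be illustrated with a small sketch in which each thread accumulates its own query results, so that thread-safe components need no per-request state. The field name localQueryCollection is taken from the list; the class, the helper methods, and the take-and-clear behavior are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreadLocalResultSketch {
    // One result collection per thread; created lazily on first access.
    private static final ThreadLocal<List<Object>> localQueryCollection =
            ThreadLocal.withInitial(ArrayList::new);

    // Called while a query is dispatched on the current thread.
    static void addQueryResult(Object obj) {
        localQueryCollection.get().add(obj);
    }

    // Hands back the current thread's results and clears the slot
    // so the next query on this thread starts empty.
    static List<Object> takeQueryResult() {
        List<Object> result = localQueryCollection.get();
        localQueryCollection.remove();
        return result;
    }

    // Runs a one-item query on a separate thread and reports how many
    // results that thread saw; results from other threads are not visible.
    static int countOnOtherThread(final Object item) {
        final int[] size = new int[1];
        Thread worker = new Thread(() -> {
            addQueryResult(item);
            size[0] = takeQueryResult().size();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return size[0];
    }

    public static void main(String[] args) {
        addQueryResult("a");
        System.out.println(countOnOtherThread("b"));
        System.out.println(takeQueryResult());
    }
}
```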


  • ThreadLocal usage
    • localMethod
  • Sink management
    • Native sink
    • Assembled sink

General

  • Depth-first traversal of call graph
  • Dispatch to framework using context
  • Abstract class wrappers
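
A depth-first traversal of a compiled call graph might look like the following sketch. It is illustrative only: Step, dispatch, and the example graph are hypothetical, the steps are modeled as plain functions rather than reflective method calls, and treating a null return value as "item dropped" is an assumption, not documented framework behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class DepthFirstDispatchSketch {
    // Hypothetical call-graph step: a component method plus its downstream steps.
    static final class Step {
        final String name;
        final Function<Object, Object> call;
        final List<Step> children = new ArrayList<>();

        Step(String name, Function<Object, Object> call, Step... children) {
            this.name = name;
            this.call = call;
            this.children.addAll(Arrays.asList(children));
        }
    }

    // Depth-first: run this step, then pass its output to each child in turn.
    // A null output (assumed convention) stops traversal of this branch.
    static void dispatch(Step step, Object data, List<String> trace) {
        trace.add(step.name);
        Object out = step.call.apply(data);
        if (out == null) return;
        for (Step child : step.children) {
            dispatch(child, out, trace);
        }
    }

    // Assumed example graph: filter -> transform -> store.
    static Step exampleGraph(List<Object> stored) {
        Step sink = new Step("store", o -> { stored.add(o); return o; });
        Step transform = new Step("transform", o -> o.toString().toUpperCase(), sink);
        return new Step("filter", o -> o.toString().isEmpty() ? null : o, transform);
    }

    // Dispatches one item and returns the names of the steps visited.
    static List<String> run(Object data, List<Object> stored) {
        List<String> trace = new ArrayList<>();
        dispatch(exampleGraph(stored), data, trace);
        return trace;
    }

    public static void main(String[] args) {
        List<Object> stored = new ArrayList<>();
        System.out.println(run("abc", stored));
        System.out.println(stored);
    }
}
```

The root component in this sketch never references the filter, transform, or sink directly; it only hands data to dispatch, which mirrors the goal of removing wiring from components.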


  • Declarative support for culling within a component vs. global culling across all components for the non-query case.
  • Management control over buffer size?
  • Configurable optimization controls? (Vary costing of vectorization/optimization proxy/component types.)
  • Debugging of the compilation process.
  • Error reporting: operational status based on 'wiring'.

  • Sink methods - support for dataset registration... need to flesh out databroker interaction here (back to instance data again)
  • Source methods - support for datasource registration... need to flesh out databroker interaction here (back to instance data again)