Data Collection and Normalization

Scope and Mission

The Data Collection and Normalization component will implement a data collection framework as described on pages 20 and 21 of the COSMOS project creation review PDF.

The initial scope of the Data Collection and Normalization component supports the Use Cases that the COSMOS project is defining to demonstrate the value of SML throughout the application lifecycle. It will accept management data from the instrumentation produced by the COSMOS Build to Manage component, normalize and persist that data, and provide a query API so that the COSMOS Data Reporting component can consume it.

Tactical Plan

Much of the initial work of the data collection component will be done in cooperation with the TPTP project. To facilitate the timely creation of a demonstrable implementation, the data collection component will leverage key TPTP technologies such as the Common Base Event (CBE) format, TPTP agents such as JMX and the Generic Log Adapter (GLA), and the Agent Controller.

See also the prototype proposal in the Resources section below.

Architectural Vision

Persistence API

The framework will provide an extension point for data persistence. Each supported data type will be consumed by a persistor that persists that data format into its own database table or other data store.

Initially, persistors will be provided for the EMF models that TPTP currently supports.
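
As a minimal sketch of how such an extension point might look in Java: the `Persistor` interface, `InMemoryPersistor`, and `PersistorRegistry` below are hypothetical names, not COSMOS or TPTP APIs, and an in-memory list stands in for a database table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extension-point contract: one persistor per data type.
interface Persistor {
    String dataType();            // for example "CBE"
    void persist(String record);  // store one normalized record
}

// Illustrative persistor that keeps records in memory instead of a database table.
class InMemoryPersistor implements Persistor {
    private final String dataType;
    private final List<String> rows = new ArrayList<>();

    InMemoryPersistor(String dataType) { this.dataType = dataType; }

    public String dataType() { return dataType; }
    public void persist(String record) { rows.add(record); }
    List<String> rows() { return rows; }
}

// Hypothetical registry standing in for the extension-point lookup.
class PersistorRegistry {
    private final Map<String, Persistor> byType = new HashMap<>();

    void register(Persistor p) { byType.put(p.dataType(), p); }

    Persistor lookup(String dataType) {
        Persistor p = byType.get(dataType);
        if (p == null) {
            throw new IllegalArgumentException("No persistor for " + dataType);
        }
        return p;
    }
}
```

In a real deployment the registry would presumably be populated from contributed extensions rather than by direct calls to `register`.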

Data Collection Control API

A WSDM interface will be provided for the purpose of configuring the data collection agents.

In the longer term, this interface will be usable to manage the monitored infrastructure components.

Adapters for Data Collection agents into Persistence API

The framework will provide an extension point for data collection agent adapters. One possible approach is to provide a service that connects each agent adapter to an appropriate data persistor based on the data type the adapter supports, such as CBE or WEF. A provision to specify these connections declaratively is required, but they could also be constructed dynamically via a WSDM interface.

The connection service should specifically support the ability to inject interceptors between the agent adapters and the data persistors.
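
One way to picture the connection service and its interceptors is the sketch below. `ConnectionService` and its methods are assumptions made for illustration; records are plain strings, and the persistor is modeled as a `Consumer<String>` rather than any real COSMOS type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Hypothetical connection service: routes records from agent adapters to the
// persistor connected for their data type, passing them through interceptors first.
class ConnectionService {
    private final Map<String, Consumer<String>> persistors = new HashMap<>();
    private final Map<String, List<UnaryOperator<String>>> interceptors = new HashMap<>();

    // Declares which persistor receives records of the given data type.
    void connect(String dataType, Consumer<String> persistor) {
        persistors.put(dataType, persistor);
    }

    // Injects an interceptor between the adapters and the persistor for a data type.
    void addInterceptor(String dataType, UnaryOperator<String> interceptor) {
        interceptors.computeIfAbsent(dataType, k -> new ArrayList<>()).add(interceptor);
    }

    // Called by an adapter; interceptors run in registration order.
    void deliver(String dataType, String record) {
        Consumer<String> persistor = persistors.get(dataType);
        if (persistor == null) {
            throw new IllegalStateException("No persistor connected for " + dataType);
        }
        List<UnaryOperator<String>> chain =
                interceptors.getOrDefault(dataType, Collections.emptyList());
        String current = record;
        for (UnaryOperator<String> interceptor : chain) {
            current = interceptor.apply(current);
        }
        persistor.accept(current);
    }
}
```

Here the connections are made programmatically; a declarative form could populate the same maps from a configuration file.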

Initial adapters will include log, statistical, and perhaps trace adapters, but the framework will be generalized so that it can be extended to support additional models as required.

Query API

The query API will provide a web service interface to the data store(s).

Its binding will be constructed in a manner analogous to the Data Collection adapters: extensions can be created to implement any desired query mechanism without depending on the type or location of the underlying data store.

Multiple web services will be provided to allow the consumer to select appropriate query semantics.

The initial targets are SQL and/or XPath.
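
To illustrate the idea of a pluggable query mechanism, the sketch below defines a hypothetical `QueryAdapter` interface and one XPath implementation built on the standard `javax.xml.xpath` API. The interface, the class name, and the use of an XML string as the data store are all assumptions; the web service layer is omitted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical contract: each query adapter handles one query language.
interface QueryAdapter {
    String language();
    List<String> execute(String query);
}

// Illustrative XPath adapter over an XML string that stands in for the data store.
class XPathQueryAdapter implements QueryAdapter {
    private final Document store;

    XPathQueryAdapter(String xml) {
        try {
            store = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse data store XML", e);
        }
    }

    public String language() { return "XPath"; }

    // Evaluates the expression and returns the text content of each matching node.
    public List<String> execute(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(query, store, XPathConstants.NODESET);
            List<String> results = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                results.add(nodes.item(i).getTextContent());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XPath query: " + query, e);
        }
    }
}
```

A SQL adapter could implement the same interface, so callers would select query semantics by choosing an adapter rather than by knowing where the data lives.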



Database Schema

Each data collection adapter will define the schema for its datastore. A reasonable minimum expectation is that each database will be keyed by timestamp and the unique ID of the managed resource.

The initial target is Derby, with support for MySQL and proprietary databases later. An attempt will be made to avoid proprietary SQL to permit a wide choice of database engines.
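
As a rough illustration of the minimum keying expectation, the hypothetical helper below builds a `CREATE TABLE` statement whose primary key combines the managed resource ID and the timestamp. The class name, column names, and column types are assumptions chosen to stay within commonly supported SQL; the statement has not been checked against any particular engine.

```java
// Hypothetical helper: builds a CREATE TABLE statement keyed by
// (resource_id, event_time). Column names and types are illustrative only.
class SchemaSketch {
    static String createTableSql(String tableName) {
        return "CREATE TABLE " + tableName + " ("
                + "resource_id VARCHAR(255) NOT NULL, "
                + "event_time TIMESTAMP NOT NULL, "
                + "payload VARCHAR(4000), "
                + "PRIMARY KEY (resource_id, event_time))";
    }
}
```

Each adapter would supply its own table name and payload columns while keeping the same two key columns.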

Proposed Technology Stack

This diagram shows one possible implementation technology stack for COSMOS data collection.

The data collector will be extended with bundle fragments that implement data collection adapters, persistors and query interfaces.

Data Collection Adapters adapt an information source into a common format supported by a persistor.

Persistors accept information of a particular type, such as CBE, and provide a persistence service for that data type.

Query Adapters adapt queries in a language such as SQL or XPath to the persistent data store.

Layers shown in the diagram:

  • WSDM (Muse)
  • SOAP (Axis)
  • Data Collection Bundle (DC Adaptor, Persistor, Query)
  • OSGi (Equinox)

Common Agent Infrastructure

Discussion on Common Agent Infrastructure

  • No formal agent infrastructure currently in scope
    • Root level question: Do we want this in COSMOS?
  • COSMOS needs to have some of this to be self-hosting
  • Tie in Build to Manage (BtM)

Open Issues

  • Which components require alternative implementations in support of embedded devices?
  • Which components must support Java 1.4?

Resources


TPTP model and data persistence (TPTP DMS CVS link)

Initial prototype proposal

DataCollector prototype

Persistence Frameworks Discussion

How to set up your Eclipse for working on DataCollection

Setting up a Tomcat instance to work with the COSMOS End to End example

Running the Data Collection with CBEs

Data Collection Q&A

Instructions for running the end-to-end sample


Design Documents & Discussions In Progress

DC Design Improvements

Discussions around the Data Manager Design

193420: Create simple stand alone example of the data collection framework

197867: Need to design and implement the COSMOS DC Data Broker

197868: Need to design and implement the COSMOS DC Management Domain

197869: Need to design and implement the COSMOS DC Service Broker

197870: Need to design and implement the COSMOS DC Client APIs

197521: Component implementation - separating framework vs. extension code

197525: Buffering data in the data assembly pipeline 197521

197833: Need an XML schema for the assembly XML

205658: Security Framework/Muse Security Integration
