- 1 Data Collection and Normalization
- 1.1 Scope and Mission
- 1.2 Tactical Plan
- 1.3 Architectural Vision
- 1.4 Common Agent Infrastructure
- 1.5 Open Issues
- 1.6 Resources
- 1.7 Design Documents & Discussions In Progress
Data Collection and Normalization
Scope and Mission
The Data Collection and Normalization component will implement a data collection framework as described on pages 20 and 21 of the COSMOS project creation review PDF.
The initial scope of the Data Collection and Normalization component supports the Use Cases that the COSMOS project is defining to demonstrate the value of SML throughout the application lifecycle. It will accept management data from the instrumentation produced by the COSMOS Build to Manage component, normalize and persist that data, and provide a query API so that the COSMOS Data Reporting component can consume it.
Much of the initial work of the data collection component will be done in cooperation with the TPTP project. To facilitate the timely creation of a demonstrable implementation, the data collection component will leverage key TPTP technologies such as the Common Base Event (CBE) format, TPTP agents such as JMX and the Generic Log Adapter (GLA), and the Agent Controller.
See also the prototype proposal in the Resources section below.
The framework will provide an extension point for data persistence. Each supported data type will be consumed by a persistor that stores that particular data format in its own database table or other data store.
Initially, persistors will be provided for the EMF models that TPTP currently supports.
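The persistor extension point described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names Persistor, InMemoryPersistor and PersistorRegistry are hypothetical and are not COSMOS or TPTP APIs, records are plain strings, and a list stands in for a database table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are COSMOS or TPTP APIs. */
final class PersistorSketch {

    /** A persistor stores records of exactly one data type, for example "CBE". */
    interface Persistor {
        String dataType();
        void persist(String record);
    }

    /** Stand-in for a database table: keeps the records in a list. */
    static final class InMemoryPersistor implements Persistor {
        private final String dataType;
        final List<String> rows = new ArrayList<>();

        InMemoryPersistor(String dataType) { this.dataType = dataType; }

        @Override public String dataType() { return dataType; }
        @Override public void persist(String record) { rows.add(record); }
    }

    /** Plays the role of the extension point: persistors register by data type. */
    static final class PersistorRegistry {
        private final Map<String, Persistor> byType = new HashMap<>();

        void register(Persistor persistor) { byType.put(persistor.dataType(), persistor); }

        /** Routes a record to the persistor for its type; returns false if none is registered. */
        boolean persist(String dataType, String record) {
            Persistor persistor = byType.get(dataType);
            if (persistor == null) return false;
            persistor.persist(record);
            return true;
        }
    }

    public static void main(String[] args) {
        PersistorRegistry registry = new PersistorRegistry();
        InMemoryPersistor cbeStore = new InMemoryPersistor("CBE");
        registry.register(cbeStore);
        registry.persist("CBE", "<CommonBaseEvent/>");
        System.out.println(cbeStore.rows.size() + " CBE record(s) stored");
    }
}
```

In an Eclipse setting the registry would presumably be populated from extension point contributions rather than by direct calls to register.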
Data Collection Control API
A WSDM interface will be provided for the purpose of configuring the data collection agents.
In the longer term, this interface will be usable to manage the monitored infrastructure components.
Adapters for Data Collection agents into Persistence API
The framework will provide an extension point for data collection agent adapters. One possible approach is to provide a service that connects each agent adapter to an appropriate data persistor based on the data type the adapter supports, such as CBE or WEF. It must be possible to specify these connections declaratively, but they could also be constructed dynamically via a WSDM interface.
The connection service should specifically support the ability to inject interceptors between the agent adapters and the data persistors.
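One way to picture a connection with injected interceptors is a small chain that every record passes through on its way from the adapter to the persistor. The sketch below is an assumption-laden illustration, not the COSMOS design: the Connection class and its methods are hypothetical, records are plain strings, and an interceptor that returns null is taken to mean "drop this record".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

/** Hypothetical sketch; none of these names are COSMOS or TPTP APIs. */
final class ConnectionSketch {

    /** Connects one adapter to one persistor, with interceptors applied in the order added. */
    static final class Connection {
        private final List<UnaryOperator<String>> interceptors = new ArrayList<>();
        private final Consumer<String> persistor;

        Connection(Consumer<String> persistor) { this.persistor = persistor; }

        /** Injects an interceptor at the end of the chain. */
        Connection intercept(UnaryOperator<String> interceptor) {
            interceptors.add(interceptor);
            return this;
        }

        /** Called by the adapter for each record; a null result from an interceptor drops the record. */
        void deliver(String record) {
            String current = record;
            for (UnaryOperator<String> interceptor : interceptors) {
                current = interceptor.apply(current);
                if (current == null) return;
            }
            persistor.accept(current);
        }
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();      // stand-in for the persistor's data store
        Connection connection = new Connection(table::add)
                .intercept(String::trim)                       // normalizing interceptor
                .intercept(r -> r.isEmpty() ? null : r);       // filtering interceptor
        connection.deliver("  event-1  ");
        connection.deliver("   ");
        System.out.println(table);
    }
}
```

A declarative connection specification would then amount to data that names the adapter, the persistor and the ordered interceptors, from which such a chain is built.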
Initial adapters will include log, statistical, and perhaps trace adapters, but the framework will be generalized so that it can be extended to support additional models as required.
The query API will provide a web service interface to the data store(s).
Its binding will be constructed in a manner analogous to the Data Collection adapters where extensions can be created to implement any desired query mechanism without requiring dependence on the type or location of the underlying data store.
Multiple web services will be provided to allow the consumer to select appropriate query semantics.
The initial targets are SQL and/or XPath.
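The idea of query extensions that hide the underlying store can be sketched as one adapter per query language behind a common dispatching service. Everything named here (QueryAdapter, XPathOverXmlStore, QueryService) is hypothetical, the "store" is just an XML string, and the web service binding is omitted; only the javax.xml.xpath calls are real JDK APIs.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical sketch; none of these names are COSMOS or TPTP APIs. */
final class QuerySketch {

    /** One adapter per query language; callers never see the store behind it. */
    interface QueryAdapter {
        String language();
        String query(String expression);
    }

    /** Answers XPath queries against an XML document held in memory. */
    static final class XPathOverXmlStore implements QueryAdapter {
        private final String xmlStore;

        XPathOverXmlStore(String xmlStore) { this.xmlStore = xmlStore; }

        @Override public String language() { return "XPath"; }

        @Override public String query(String expression) {
            XPath xpath = XPathFactory.newInstance().newXPath();
            try {
                // Returns the string value of the expression's result.
                return xpath.evaluate(expression, new InputSource(new StringReader(xmlStore)));
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Invalid XPath: " + expression, e);
            }
        }
    }

    /** Lets the consumer choose query semantics by language name. */
    static final class QueryService {
        private final Map<String, QueryAdapter> adapters = new HashMap<>();

        void register(QueryAdapter adapter) { adapters.put(adapter.language(), adapter); }

        String query(String language, String expression) {
            QueryAdapter adapter = adapters.get(language);
            if (adapter == null) {
                throw new IllegalArgumentException("No query adapter for " + language);
            }
            return adapter.query(expression);
        }
    }

    public static void main(String[] args) {
        String xml = "<events><event id=\"e1\" severity=\"10\"/>"
                   + "<event id=\"e2\" severity=\"50\"/></events>";
        QueryService service = new QueryService();
        service.register(new XPathOverXmlStore(xml));
        System.out.println(service.query("XPath", "/events/event[@severity > 20]/@id"));
    }
}
```

An SQL adapter would implement the same interface over a relational store, so a consumer changes only the language name and the expression.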
Each data collection adapter will define the schema for its datastore. A reasonable minimum expectation is that each database will be keyed by timestamp and the unique ID of the managed resource.
The initial target is Derby, with support for MySQL and proprietary databases later. An attempt will be made to avoid proprietary SQL to permit a wide choice of database engines.
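The minimum keying expectation (timestamp plus the unique ID of the managed resource) can be illustrated with an in-memory stand-in for a table. This is a sketch only: Key and Store are hypothetical names, a sorted map replaces the database, and a real adapter would express the same composite key in its own schema.

```java
import java.util.NavigableMap;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names are COSMOS or TPTP APIs. */
final class StoreKeySketch {

    /** Composite key: ordered by timestamp first, then by managed resource ID. */
    static final class Key implements Comparable<Key> {
        final long timestampMillis;
        final String resourceId;

        Key(long timestampMillis, String resourceId) {
            this.timestampMillis = timestampMillis;
            this.resourceId = resourceId;
        }

        @Override public int compareTo(Key other) {
            int byTime = Long.compare(timestampMillis, other.timestampMillis);
            return byTime != 0 ? byTime : resourceId.compareTo(other.resourceId);
        }

        @Override public boolean equals(Object o) {
            return o instanceof Key && compareTo((Key) o) == 0;
        }

        @Override public int hashCode() { return Objects.hash(timestampMillis, resourceId); }
    }

    /** Stand-in for one adapter's table. */
    static final class Store {
        private final NavigableMap<Key, String> rows = new TreeMap<>();

        void put(long timestampMillis, String resourceId, String payload) {
            rows.put(new Key(timestampMillis, resourceId), payload);
        }

        /** Counts rows for one resource with from <= timestamp < to. */
        int count(String resourceId, long from, long to) {
            int matches = 0;
            // The empty string sorts before every resource ID, so these bounds cover whole timestamps.
            for (Key key : rows.subMap(new Key(from, ""), true, new Key(to, ""), false).keySet()) {
                if (key.resourceId.equals(resourceId)) matches++;
            }
            return matches;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.put(100L, "server-1", "cpu=40");
        store.put(200L, "server-1", "cpu=55");
        System.out.println(store.count("server-1", 100L, 300L) + " row(s) in range");
    }
}
```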
Proposed Technology Stack
This diagram shows one possible implementation technology stack for COSMOS data collection.
The data collector will be extended with bundle fragments that implement data collection adapters, persistors and query interfaces.
Data Collection Adapters adapt an information source into a common format supported by a persistor.
Persistors accept information of a particular type, such as CBE, and provide a persistence service for that data type.
Query Adapters adapt queries in a language such as SQL or XPath to the persistent data store.
|Data Collection Bundle|
Common Agent Infrastructure
Discussion on Common Agent Infrastructure
- No formal agent infrastructure currently in scope
- Root level question: Do we want this in COSMOS?
- COSMOS needs to have some of this to be self hosting
- Tie in BtM
Open Issues
- Which components require alternative implementations in support of embedded devices
- Which components must support Java 1.4
Resources
TPTP model and data persistence (TPTP DMS CVS link)
Initial prototype proposal
Persistence Frameworks Discussion
How to set up your Eclipse for working on DataCollection
Setting up a Tomcat instance to work with the COSMOS End to End example
Running the Data Collection w/CBEs
Instructions for running the end-to-end sample
Design Documents & Discussions In Progress
Discussions around the Data Manager Design
193420: Create simple stand alone example of the data collection framework
197867: Need to design and implement the COSMOS DC Data Broker
197868: Need to design and implement the COSMOS DC Management Domain
197869: Need to design and implement the COSMOS DC Service Broker
197870: Need to design and implement the COSMOS DC Client APIs
197521: Component implementation - separating framework vs. extension code
197525: Buffering data in the data assembly pipeline 197521
197833: Need an XML schema for the assembly XML