SDMX Example


An example of using SDMX

 <dataSet start="00:00:00" end="01:30:00">

  <dataFlow id="statistical data for intranet web server">

    <keyFamily id="statisticalData">
      <dimension id="CurrentBusyThreads"/>
      <dimension id="CurrentThreadCount"/>
      <dimension id="HeapMemoryUsed"/>
      <dimension id="NonHeapMemoryUsed"/>
    </keyFamily>

    <dataSource id="intranet web server">
      <dataSourceType id="Tomcat Server"/>
    </dataSource>

  </dataFlow>
  
  <observations keyFamily="statisticalData">
    <observation captureTime="00:00:10">
      <key dimension="CurrentBusyThreads" value="1"/>
      <key dimension="CurrentThreadCount" value="25"/>
      <key dimension="HeapMemoryUsed" value="19711768"/>
      <key dimension="NonHeapMemoryUsed" value="30862992"/>
    </observation>
    <observation captureTime="00:00:15">
      <key dimension="CurrentBusyThreads" value="1"/>
      <key dimension="CurrentThreadCount" value="25"/>
      <key dimension="HeapMemoryUsed" value="19711768"/>
      <key dimension="NonHeapMemoryUsed" value="30862992"/>
    </observation>
  </observations>
  
 </dataSet>


Metadata that can be part of the broker registry:

  <dataFlow id="statistical data for intranet web server">

    <keyFamily id="statisticalData">
      <dimension id="CurrentBusyThreads"/>
      <dimension id="CurrentThreadCount"/>
      <dimension id="HeapMemoryUsed"/>
      <dimension id="NonHeapMemoryUsed"/>
    </keyFamily>

    <dataSource id="intranet web server">
      <dataSourceType id="Tomcat Server"/>
    </dataSource>

  </dataFlow>

We can associate an endpoint reference (EPR) with a dataFlow:

  <dataFlow id="statistical data for intranet web server">
    <epr>...</epr>
    <keyFamily>...</keyFamily>
    <dataSource>...</dataSource>
  </dataFlow>

Data sets, and the observations within each data set, can be retrieved from data managers via the query API.

Jimmy's comments on the SDMX example

In your SDMX example, you refer to <keyFamily id="statisticalData">.

Comment 1: This keyFamily should live in the Data Manager. I am therefore questioning your follow-on section, which states "Metadata that can be part of the broker registry". Why do you want to keep the metadata in the Broker? Will this not DRASTICALLY increase the traffic between the Broker and the plethora of Data Managers, which may come and go often? What value do we derive from keeping the metadata in the Broker?

Comment 2: Hubert, above you state that "The SDMX concepts can be hidden behind some easy-to-use APIs". Given that we may have such APIs, this reduces the need even FURTHER for the metadata to live in the Broker.

Comment 3: The observations, i.e. the data, are obviously ALSO NOT stored in the Broker...

SML & SDMX

Valentina's comments

This is a set of conclusions I came to while investigating the usage of the SDMX specifications for registering data types with the COSMOS Data Broker. As part of my initial investigation, I also looked into how SML could be used for registering data sets with the Data Broker; I will touch on this in the current section.

SDMX usage with the Data Broker

  • After reading the SDMX specification and investigating the SDMX samples and best practices available on the web, I conclude that SDMX is meant for exchanging statistical data; any other usage is accidental and not intended by the specification.
  • If the only purpose of using SDMX with the Data Broker is to make use of the key set support and data-type separation, then we unnecessarily complicate the entire use case. There are other, much cleaner and easier ways to get the same support (such as pure XML).
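
To make the "pure XML" alternative in the last bullet concrete, the first observation from the example at the top of this page could be written without any SDMX vocabulary. This is only a sketch: the element and attribute names (statisticalData, source, sourceType, captureTime) are hypothetical and are not taken from any COSMOS or SDMX schema.

```xml
<!-- Hypothetical plain-XML form of one observation; every name here is illustrative -->
<statisticalData source="intranet web server" sourceType="Tomcat Server" captureTime="00:00:10">
  <CurrentBusyThreads>1</CurrentBusyThreads>
  <CurrentThreadCount>25</CurrentThreadCount>
  <HeapMemoryUsed>19711768</HeapMemoryUsed>
  <NonHeapMemoryUsed>30862992</NonHeapMemoryUsed>
</statisticalData>
```

In such a layout, the separation between data types and instances that the keyFamily provides would presumably come from an ordinary XML Schema for statisticalData rather than from SDMX structures.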

SML and SDMX

COSMOS builds a system management scenario aligned around the SML Data Center repository. This SML repository contains data type definitions for resources used in a system management solution, along with a set of instances built using these types. The data definitions are clearly separated from the actual instances. During the last face-to-face meeting, we decided to build a generic set of SML types that can be used to build any resource (an asset or any other type of system management entity). The repository offers a predefined, finite subset of the resource types that the COSMOS framework will support in data collection and web visualization; the supported types include machines, application servers, and operating systems (see the COSMOS Data Center plugin for the entire content). The COSMOS SML repository types (the Data Center repository types) are meant to be a precursor of the CML library; once that library is available, the Data Center will move to the CML types and structures.

I am inclined to use the types defined in the Data Center repository for the Data Broker type registration support. Data sets can be defined using XML, or SML if inter-document references are required. Data type registration will use the SML Data Center types. The benefits:

  • In the Data Center repository, data types (or definitions) are separated from the actual instances, so we can use the types to define the data to be interchanged with the Data Broker.
  • It aligns with the COSMOS scope of supporting SML- and CML-based resources.
  • The Data Center repository types are SML enabled and will iteratively become CML types; the Data Broker will eventually be based on CML and, in doing so, align with the rest of the COSMOS components (CML will be the lingua franca for describing system resources).

Notes from conversation on 20-Aug-07

  • The amount of metadata the broker maintains should be minimal
  • Two concerns w/SDMX format
    • Changes over time
    • Amount of data (lots of it)
  • Concerned about storing the "dimension" stuff inside the broker
  • Can we start with simply the classification of the data and the type of the resource?
    • This would translate to the data source id and the data family id
    • Can we align this with SML concepts?
  • Need to walk through query examples
    • Martin's subnet query
    • Statistical data & config data
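
The "start simple" proposal in these notes can be sketched by trimming the registry metadata from the example down to the two identifiers. This assumes that the dataFlow, keyFamily, and dataSource elements from the example are reused unchanged, which the notes do not confirm; the dimension list would stay with the Data Manager, as Comment 1 suggests.

```xml
<!-- Sketch only: broker-side entry reduced to the data family id and the data source id -->
<dataFlow id="statistical data for intranet web server">
  <!-- data family id: the classification of the data -->
  <keyFamily id="statisticalData"/>
  <!-- data source id: identifies the resource that produces the data -->
  <dataSource id="intranet web server"/>
</dataFlow>
```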
