Jump to: navigation, search

EclipseSCADA/Plan/Haystack

This is the planning document for Eclipse SCADA Haystack, a big data value archive.

This document will change during the planning and evaluation phase. We are happy to receive your input and thoughts.

Draf proposal on Google Drive

Background

The Eclipse SCADA system can store analog values in a value data archive. As a storage is used a plain and simple file system structure named HDS. With this it is possible to store analog values in raw format (including sub-second timestamp resolution) for several years. It provides quick access for standard chart queries and only has a modest storage use (1 item, 1 month, one value every 250ms -> ~120MB). Due to its simple file structure, the HDS archive data can easily be replicated using simple rsync jobs (or other remove file copy tools), to some sort of slave instance which aggregates the data of multiple other value archives. This includes buffering on the local node as long as the connection to the global node is not available.

However the system will not scale beyond one node for an aggregated archive.

Purpose

The purpose of Eclipse SCADA Haystack is to provide a value data archive for analog values which can scale with the numbers of nodes added. The archive should be an aggregated archive, which gathers data from a distributed set of nodes and stores it persistently in a cluster.

Requirements

Technical

  • Scale with the size of the cluster
  • Provide data redundancy inside the cluster

Figures

  • The archive should be able to handle 10.000.000 items with an insertion rate of 1.000.000 insertions per second.

Configuration

  • The collector nodes and items should be configured on the server, not on the collector node. The collector node only contacts the cluster and fetches the newest configuration, providing the required values.

Data

  • Store analog floating point values (Java "double")
  • The naming of the data stores (items) should not be restricted
  • Store values with sub-second resolution
  • Provide a way to detect stale value sources (timeout)
  • Insertion operations should be idempotent

Interface

  • Provide a simple HTTP/HTTPS based interface for provided data
  • Check out other systems and provide adapters for importing their data
    • collectd?
    • others?
  • Provide an OpenTSDB collector interface

Limitations

Append only

The archive is designed to receive a stream of values in chronological order. This means that for one value source all items must be provided by the collector node with ascending timestamps. Unless state otherwise, data cannot be inserted at a random point in time.

Architecture & Concepts

The idea at the moment is to use Hadoop and HBase for creating the storage cluster.

The collector nodes push the data to import nodes using HTTP. Collector nodes can be located anywhere. Import nodes must be accessible for the collector nodes using HTTP, but can be behind some load balancer. Each import node should be able to import data for any collector node.

Once the import node has completed importing the data, it returns the result with the HTTP reply.

Query functionality will process query requests aside from the import nodes by directly accessing the HBase cluster. At the moment there is no plan for specific server applications beside the HTTP daemon on the import node. This reduces the need for a specific query node <-> client protocol. Eclipse SCADA has already such a protocol for its use case. However this is not a limitation, a specific query server with remoting can be added when needed.

Row key schema

Internal Storage

  • The data of all sources is stored in the same table
  • Each value is inserted independently at first
  • Then a second table will store compacted data frames
  • Compaction is triggered periodically

Compaction

  • Data is compacted by a time range of 2^16 ms -> about 65seconds
  • The compaction has a "high watermark" timestamp which indicates up to with time frame the data is compacted
  • Queries need to load from the compacted table first, when the "high watermark" is reached the remaining data must be read from the raw table
  • Compaction writes the data first and then the "high watermark", including the latest value
  • Queries read the "hight watermark" first, then read the data
  • Uncompacted data needs to the kept for a specific amount of time, for queries that still need to read raw, uncompacted data.

See also

The following projects provide a similar functionality.

OpenTSDB

This one comes very close to what we actually want. But it has some problems with our requirements.

  • No sub-second time resolution
  • Naming of the data sources does not seem to be free
  • No configuration way for the server
  • License is LGPL, which makes is problematic for Eclipse
  • It seems to require GNU plot?
  • Collector does not buffer when network goes down

The following features are interesting to us:

  • Cluster approach
  • Query cache
  • Server console for manual interaction

Kairosdb

https://code.google.com/p/kairosdb/

This one re-implemented the idea of OpenTSDB as well, due to the same issues issues we had. But they are using Cassandra as a back-end for sub-second time resolution. And our first tests with performance seem to be better than using kairos and Cassandra.

Glossary

node
A single computer, physical or virtual
collector, collector node
A node or service collecting data
item, source
A source that provides data exactly for one measure point