The Machine to Machine (M2M) space combines systems from the embedded space and the enterprise/cloud space. This combination introduces problems that an M2M system provider must address and that do not traditionally exist in either the embedded space or the enterprise/cloud space alone. This page catalogs these M2M-specific issues. The problems are intended to reference M2M Scenario Descriptions, which provide the system context. The catalog is meant to serve as an educational tool for developers and architects who are new to M2M and as a reference during development of M2M technologies at Eclipse.
M2M Problem Description Template
Please refer to this template documentation.
M2M Problem Descriptions
M2M-Problem-0001 - Sharing of information for properly interpreting data values
Consider a sensor that provides the refrigerator temperature as a floating point number. In an M2M system there will be many potential consumers of this sensor reading. Some consumers may be on the embedded platform itself; others may execute in the enterprise or cloud. There is also a producer that transforms this number from its protocol wire format into a logical format (3.7 degrees C). Additional information is required in order to make proper use of the floating point number that represents the refrigerator temperature. Examples of this additional information from the producer include the data type (floating point), the units of the number (degrees Celsius), the minimum and maximum values of the data range, and the values used to represent sensor errors or missing sensors. Consumers may have their own additional information about this particular sensor reading, such as warning and error thresholds for alarm notification, and text to display the units and sensor name in various human-readable languages.
In order for producers and consumers to collaborate effectively, this additional information about sensors and actuators needs to be shared between the interested systems.
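The split between producer-side and consumer-side information described above can be sketched as follows. This is only an illustration: the class names, field names, range limits, sentinel value, and thresholds are all assumptions made for the example, not part of any M2M specification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ProducerMetadata:
    """What the producer knows about a reading (hypothetical structure)."""
    name: str
    data_type: str                         # e.g. "float"
    units: str                             # e.g. "degC"
    range_min: float
    range_max: float
    error_values: frozenset = frozenset()  # sentinels for sensor error / missing sensor


@dataclass(frozen=True)
class ConsumerMetadata:
    """What one particular consumer adds on top (hypothetical structure)."""
    warning_above: float
    error_above: float
    display_names: dict = field(default_factory=dict)  # language code -> label


def interpret(raw: float, meta: ProducerMetadata) -> Optional[float]:
    """Return the reading, or None if the raw value is an error sentinel."""
    if raw in meta.error_values:
        return None
    if not (meta.range_min <= raw <= meta.range_max):
        raise ValueError(f"{meta.name}: {raw} is outside the declared range")
    return raw


def classify(value: float, consumer: ConsumerMetadata) -> str:
    """Apply one consumer's alarm thresholds to an interpreted reading."""
    if value > consumer.error_above:
        return "error"
    if value > consumer.warning_above:
        return "warning"
    return "ok"


# Assumed example values; a real deployment would obtain these from the producer.
FRIDGE = ProducerMetadata(
    name="refrigerator_temperature", data_type="float", units="degC",
    range_min=-40.0, range_max=60.0, error_values=frozenset({-9999.0}),
)
STORE_DASHBOARD = ConsumerMetadata(
    warning_above=5.0, error_above=8.0,
    display_names={"en": "Refrigerator temperature", "de": "Kühlschranktemperatur"},
)
```

Without `FRIDGE`, a consumer that receives `-9999.0` cannot tell a missing sensor from an implausibly cold refrigerator, which is the sharing problem in miniature.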
Fielded solution patterns
This problem has been solved in many different ways. Some solutions rely on the existence of a discovery service, others rely on a priori knowledge, and others on always sending everything they know.
Several fielded solutions avoid the need for a discovery mechanism:
- Modbus - Distributed nodes communicate by reading and writing registers across a Modbus network. The information about each register must be known at application design time. No mechanisms are provided for discovering, at runtime, the information required to interpret the bits in a register.
- Continua - The Continua Alliance was very aware of the issue when designing their protocols for healthcare data interchange of sensor data. Continua compliant systems send all of the additional information that they have about a sensor reading along with the data value or set of values. This sidesteps the need for discovery of the additional information, as it is always available. This constant availability comes at the cost of significantly increased data transmission sizes.
- SAE J1939 - The Society of Automotive Engineers created a standard named J1939. This standard describes many of the additional bits of information for a data value, such as units and data ranges. Producers and consumers can be written against this standard and understand how to interpret the data from these sources without explicit communication of additional information. In this way, SAE J1939 improves on Modbus, but it is not sufficient if you wish to share properties that are not covered in the specification, such as translations for sensor names or application-specific error notification limits.
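The trade-off between a priori knowledge and always sending everything can be illustrated with a small sketch. It assumes a hypothetical register map in which register 0 holds the refrigerator temperature as a signed 16-bit value in tenths of a degree; the JSON message is only a stand-in for a self-describing payload and is not the actual Continua wire format.

```python
import json
import struct
from typing import NamedTuple


class RegisterInfo(NamedTuple):
    """Design-time knowledge about one register (hypothetical layout)."""
    name: str
    units: str
    divisor: int
    signed: bool


# Assumed register map, fixed at application design time.
REGISTER_MAP = {
    0: RegisterInfo("refrigerator_temperature", "degC", 10, True),
}


def decode_register(address: int, raw: int) -> float:
    """Interpret a raw 16-bit register value using the a priori map."""
    info = REGISTER_MAP[address]
    if info.signed and raw >= 0x8000:
        raw -= 0x10000  # reinterpret as two's-complement
    return raw / info.divisor


def compact_payload(raw: int) -> bytes:
    """Value only: the receiver must already know how to interpret it."""
    return struct.pack(">H", raw)


def self_describing_payload(address: int, raw: int) -> bytes:
    """Value plus its description, sent together on every reading."""
    info = REGISTER_MAP[address]
    message = {
        "name": info.name,
        "units": info.units,
        "value": decode_register(address, raw),
    }
    return json.dumps(message).encode("utf-8")
```

Comparing `len(compact_payload(37))` with `len(self_describing_payload(0, 37))` shows the size cost of removing the need for discovery: the compact form is two bytes, while the self-describing form repeats the name and units with every value.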
Referenced by problems
M2M-Problem-0002 - Sensor and Actuator Identity and Addressability
In order to make something addressable in any system, we need at least one unambiguous identifier, or set of identifiers, that uniquely identifies that thing. In many systems the notions of identity and addressability are tightly coupled. In traditional enterprise systems the responsibility for determining unique identity is placed with the database. Sometimes this takes the form of auto-incrementing keys in a table; other times it is handled in a service that fronts the database storage. Consider the example of a work item system. The database may be configured to assign incrementing numbers for each new work item. When other programs need to address the work item, they can use the shorthand of this work item number. Having enterprise-side M2M code use enterprise identifiers for addressability makes sense. However, it is often not possible to make the embedded systems aware of the enterprise identifiers for the sensors and actuators managed by an embedded system. This requires mapping operations between the embedded side and the enterprise side, so that sensor readings consistently flow to the right location in the enterprise and actions initiated from the enterprise side are executed on the proper actuator.
Let's consider the example of a store with multiple refrigeration controllers, each controlling multiple freezer units. In this case, we need to map the enterprise assigned identifier to the store id, the store id to the associated refrigeration controller unit id and then on to the specific freezer number. Let's assume an enterprise assigned id of 66.
[enterprise assigned identifier] -> [store_id, refrigeration_unit_controller_id, freezer_id]
66 -> [Store_67890, Controller_12346, Freezer_1]
There are many ways that the mapping operation can be performed. A couple of concrete examples:
- Enterprise side - If you want to perform the mapping operation on the enterprise side, then all of the routing information needs to be available to the enterprise when processing incoming sensor values and when executing outgoing commands. This information needs to stay in sync with changes in device configurations, for example when a replacement refrigeration unit or refrigeration unit controller is installed in a store for maintenance reasons.
- Closer to the control bus - If you want to perform the mapping closer to the control bus, then you need to make one of the embedded computers aware of the enterprise identifiers, and possibly give it the ability to ask for the creation of new enterprise identifiers (in the case of replacing or upgrading sensors).
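A minimal sketch of the mapping, wherever it is hosted, might look like the following. The `IdentityMap` class and its method names are hypothetical; the point is only that the enterprise identifier stays stable while the device-side address is rebound when equipment is replaced.

```python
from typing import Dict, NamedTuple


class DeviceAddress(NamedTuple):
    """Device-side address of one freezer (field names follow the mapping above)."""
    store_id: str
    controller_id: str
    freezer_id: str


class IdentityMap:
    """Two-way map between enterprise identifiers and device addresses."""

    def __init__(self) -> None:
        self._by_enterprise_id: Dict[int, DeviceAddress] = {}
        self._by_address: Dict[DeviceAddress, int] = {}

    def bind(self, enterprise_id: int, address: DeviceAddress) -> None:
        """Bind or rebind an enterprise identifier to a device address."""
        owner = self._by_address.get(address)
        if owner is not None and owner != enterprise_id:
            raise ValueError(f"{address} is already bound to {owner}")
        previous = self._by_enterprise_id.get(enterprise_id)
        if previous is not None:
            del self._by_address[previous]  # drop the stale reverse entry
        self._by_enterprise_id[enterprise_id] = address
        self._by_address[address] = enterprise_id

    def to_device(self, enterprise_id: int) -> DeviceAddress:
        """Route an outgoing command to the proper actuator."""
        return self._by_enterprise_id[enterprise_id]

    def to_enterprise(self, address: DeviceAddress) -> int:
        """Route an incoming sensor reading to the proper enterprise record."""
        return self._by_address[address]


identity_map = IdentityMap()
identity_map.bind(66, DeviceAddress("Store_67890", "Controller_12346", "Freezer_1"))
```

If the controller is later replaced, calling `bind(66, ...)` again with the new controller identifier keeps enterprise identifier 66 valid for all enterprise-side code while readings from the old address stop resolving.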
These issues have ramifications for topic space design in pub/sub system interfaces, as well as for URI and resource design in RESTful system interfaces.
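To make the ramification concrete, the sketch below builds a pub/sub topic and a REST resource URI from either the device-side hierarchy or the enterprise identifier. The topic layout, path segments, and base URL are invented for illustration and do not come from any particular broker or API.

```python
from urllib.parse import quote


def device_topic(store_id: str, controller_id: str, freezer_id: str, reading: str) -> str:
    """Topic keyed by physical hierarchy; changes when a controller is replaced."""
    return "/".join([
        "stores", store_id,
        "controllers", controller_id,
        "freezers", freezer_id,
        reading,
    ])


def enterprise_topic(enterprise_id: int, reading: str) -> str:
    """Topic keyed by enterprise identifier; stable, but requires a mapping step."""
    return f"sensors/{enterprise_id}/{reading}"


def device_uri(base: str, store_id: str, controller_id: str, freezer_id: str) -> str:
    """REST resource URI that mirrors the same physical hierarchy."""
    segments = [
        "stores", store_id,
        "controllers", controller_id,
        "freezers", freezer_id,
    ]
    # Percent-encode each segment so identifiers cannot inject extra path levels.
    return base.rstrip("/") + "/" + "/".join(quote(s, safe="") for s in segments)
```

The first form lets embedded publishers work without knowing enterprise identifiers, at the cost of subscribers tracking hardware changes; the second form pushes that cost into whichever side performs the identity mapping.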
Fielded solution patterns
- Manual - Most existing systems rely on static configurations. Often this configuration information is gathered via manual data entry for initial configuration and equipment replacement scenarios.
Referenced by problems
M2M-Problem-0003 - “One to One” data model for M2M Systems
Scenario 003 summarizes a common characteristic of solutions that integrate physical world systems with IT, Web, and other connected applications. One large class of such M2M solutions is SCADA (Supervisory Control and Data Acquisition) and Telemetry systems. They probably represent one of the single largest markets of existing sensors, actuators and smart devices that the M2M and IoT markets are starting to target. In this context, one of the greatest problems to overcome is the “SCADA Data Cage” scenario, where “Device” applications are coded to talk directly to corresponding “Host” applications. The result is that what could be a wealth of device information shared across an enterprise becomes captured in the host application, where it is difficult if not impossible to extract, share, and derive business value and analytics from. In large part this is due to legacy protocols that have been developed over the years as needed to integrate a vast array of types of remote devices installed in the field. In fact, protocols in this context are the single biggest issue in any SCADA/Telemetry/M2M infrastructure. From a legacy point of view, device protocols typically encapsulated both the “transport” of the data and the “data representation”. With this being the case, you cannot separate the transport of the data from the data itself. The maturity and high business value of these SCADA systems make them representative of the types of "coupling" problems that need to be solved as the use of connected devices increases.
M2M-Problem-0001 touched on this issue with a few examples:
- Modbus
- SAE J1939
- Continua Medical Device Profiles
But in fact there are literally hundreds of these types of protocols in use today, and they will remain in use for many years to come. The Problem and the related Scenario are intended to be representative of a larger class of other M2M and IoT solutions where device data and protocols can become too tightly coupled to the applications.
Take for example the Modbus protocol. Modbus is probably one of the most popular and widely used device protocols in the industrial market today. It is a Poll/Response protocol that strictly defines unit addressing, function code definitions, and data representation within each of the Poll or Response data packets. Initially designed primarily for RS-232/485 Poll/Response networks, it has served its purpose well. But consider that we would now like to take the data from a Modbus response and share it with multiple data consumers, quickly and easily. By its very nature, Modbus requires that a Host application formulate and send a proper poll message and then receive, validate, and parse the resulting data. This very process relegates the data to the host application that received it. Sharing it with any additional data consumers that want access to the parameters requires additional coding. This problem holds true for the majority of other poll/response protocols and infrastructures in place today. (Note that M2M-Problem-0001 has already addressed the issue of information discovery for these types of protocols and alluded to a producer/consumer model, which can be extended to many other legacy protocols as well.)
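The "additional coding" mentioned above can be sketched as a parse step followed by an explicit fan-out. The example assumes a Read Holding Registers (function code 0x03) response laid out as unit address, function code, byte count, and big-endian 16-bit register values, with the CRC already validated and stripped by the host. The `Fanout` class is a hypothetical helper, not part of Modbus.

```python
import struct
from typing import Callable, List

READ_HOLDING_REGISTERS = 0x03


def parse_read_response(frame: bytes) -> List[int]:
    """Parse [unit, function, byte_count, data...] into register values.

    Assumes CRC validation has already been done and the CRC bytes removed.
    """
    if len(frame) < 3:
        raise ValueError("frame too short")
    function, byte_count = frame[1], frame[2]
    if function != READ_HOLDING_REGISTERS:
        raise ValueError(f"unexpected function code 0x{function:02X}")
    data = frame[3:3 + byte_count]
    if len(data) != byte_count or byte_count % 2 != 0:
        raise ValueError("byte count does not match payload")
    # Modbus register values are 16-bit and transmitted high byte first.
    return list(struct.unpack(f">{byte_count // 2}H", data))


class Fanout:
    """Hypothetical helper that hands each parsed response to every consumer."""

    def __init__(self) -> None:
        self._consumers: List[Callable[[List[int]], None]] = []

    def subscribe(self, consumer: Callable[[List[int]], None]) -> None:
        self._consumers.append(consumer)

    def publish(self, registers: List[int]) -> None:
        for consumer in self._consumers:
            consumer(list(registers))  # copy so consumers cannot affect each other
```

Only the host that issued the poll ever sees the frame, so everything after `parse_read_response` is code the host application must contain for each new consumer, or must delegate to something like `Fanout`.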
Fielded solution patterns
This problem has been around for decades and has many fielded solutions. Most solutions have proven to work well over the years, but they tend to be point-in-time solutions that become brittle and problematic over time. All of these solutions require additional compute resources, coding, configuration, and management overhead.
- Front End Processors (FEPs) - FEPs were, and still are, very prevalent in these types of network infrastructures. FEPs are basically standalone computer applications that are responsible for polling devices, validating responses, and then applying the transformations required to send the resulting data stream to the next processing element in the chain. Typically the next element would be the SCADA host system. Any change in downstream data consumers would typically require reconfiguration of the FEP, if not new application development on it.
- Relational Database Query - In many systems, the host SCADA/Telemetry application will place the resulting process data into a relational database for historical and archive purposes. This database can then be used as a data source for other consumers of the data. Again, this becomes problematic if additional data needs to be added to the schema that was not originally specified. In many cases, if the operations side of the business does not need the data, there may not be any interest in acquiring it in the first place.
- Field Protocol Converters/Gateways - In many fielded solutions, existing device equipment is front-ended locally by a protocol converter or gateway. These could be considered some of the earliest “M2M Device Platforms” and have become a very viable solution to the problem. These devices can typically handle the local physical interface requirements (RS-232/485/422/Ethernet/CANbus/HART) as well as converting device data into a more common transport protocol. But many still deliver the resulting data stream in a proprietary format and still expect a host application to consume the results.
Referenced by problems