A key to IT-OT convergence is device data management solutions for effectively monitoring, processing and managing large amounts of data from IoT devices. Traditional embedded database solutions fall short in understanding and fulfilling the sophisticated data processing and management requirements of IoT devices.
Smart sensors and devices are becoming an important part of the Internet of Things (IoT) and are continuously changing the way we automate tasks. We employ intelligent systems to improve production in factories, manage smart home energy to monitor and reduce energy costs, build industrial automation systems to take over tasks previously assigned to people, and develop autonomous transportation to improve driver safety.
Inside these embedded systems are sensors that rapidly transmit data, which must be immediately captured, processed, and acted on. But how can we capture, process, and manage the massive amounts of data continuously flowing into the system, and empower devices to make decisions or take actions?
Traditional embedded database solutions fall short in understanding and fulfilling the sophisticated data processing and management requirements of IoT devices. IoT edge database solutions designed to understand the continuous stream of data produced by sensors enable devices to make important decisions in microseconds. What are the available device data management solutions for IoT devices to monitor, process, and manage data?
In this article, we review some of the embedded data management options available for IoT edge devices and highlight the distinct design primitives each one uses to address device data processing.
IoT edge device data challenges
First, what is an embedded system? An embedded system is a computing system built into a larger device to perform dedicated tasks automatically, often with some degree of self-management, and it frequently connects to other systems.
These systems are starting to use the IoT to improve lives. But as a significant volume of data accumulates on each connected device, a comprehensive approach to data management is required.
Across these embedded systems, the primary challenge is to monitor, capture, and process data intelligently to ensure safe behavior and fault‐free operation of the devices. More than simply streaming data and receiving commands, these devices run complex, high-level software programs that operate with or without a network and cloud connection.
Meanwhile, devices embedded in these systems must handle large transactions for various tasks and need to be able to connect to each other on multiple networks. Therefore, IoT data management needs to be divided into real-time interaction with objects, or things, as well as offline mass storage and long-term trend analysis.
In the real world, device manufacturers seek a scalable edge data management solution to deploy hundreds of IoT devices, so each can collect, analyze, and manage the flood of data that IoT sensors produce, without losing performance. These devices do not need to permanently store all real time data but must capture and store critical information. Simultaneously, each IoT node must make independent decisions that trigger appropriate reactions.
Database queries enable device applications to gain the intelligence to make informed decisions in real time: efficiently and without delay. Succeeding on the IoT requires both the right data management software and the capability to quickly collect and connect device data at the right throughput rate to achieve low latency.
Traditional vs. IoT data management
Microchips and devices are now, more than ever, used to build autonomous systems that collect data and gain intelligence. When it comes to data management on these systems, a device application may connect directly to the cloud, buffer data on the device until connectivity is available, or manage data directly on the device. Possible candidates for local device data management include storing arrays and data streams in flat files, creating tables with data management software, or developing a custom data storage solution.
Traditional storage-centric embedded databases generally manage storing, retrieving, and updating IoT data with records, tables, and files. In the context of IoT, data management software must start by monitoring data in real time, while also providing storage and logging options for future analysis. Data-centric solutions expand the role of data management from simple data storage and analysis to include online data monitoring, continuous query processing, intelligent filtering, and automatic data distribution, in real time.
For example, flat file formats such as CSV, JSON and XML are notoriously difficult to update and search, especially for the large volume of data typical of autonomous systems. But how can a device avoid concurrency race conditions and data corruption? Developing a custom solution to index information is not a trivial task and would essentially drive you to become your own database provider! Data management system software updates and new database features each introduce new challenges for application developers.
IoT data processing and management
Common data management solutions currently available for devices do not fully address the complexity of architecting software for IoT data processing on the edge. Sensors are the primary source of data, but their limited resources prevent them from performing sophisticated analysis themselves.
The focus of data analysis and management on the Internet of Things is to harvest real time information and make sense of data in a very limited time, even without permanent storage available, keeping what is important and rejecting what isn’t. A good solution seamlessly extends technologies familiar to many developers, such as SQL, to the new problem of analyzing IoT sensor data directly on edge devices.
Animal healthcare is an interesting sector that can employ and benefit from IoT systems. Devices now monitor feeding activities for cattle, horses, and other livestock to determine the most efficient milk yield, weight control, and best health care practices. In addition, a prediction algorithm can continuously collect sensor readings to monitor animals’ temperatures 24/7. These goals require processing time series data in real time.
Animal healthcare systems, embedded with IoT devices, are expected to search for feeding instructions, capture temperature for different animals, and track feeding history. Data may be communicated to a central cloud data management location, but it also must be indexed for each animal and system individually. Readings from the RFID sensors attached to each animal’s body (for example, the ears) trigger immediate alerts and are communicated to other parts of the system.
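As a minimal sketch of applying familiar SQL to sensor data on a device, the example below stores a few readings and computes a recent average for one sensor. It uses Python's built-in sqlite3 module purely as a stand-in for an embedded database; the table name `reading`, its columns, and the helper `recent_average` are assumptions made for illustration and do not represent any particular product's API.

```python
import sqlite3


def recent_average(conn, sensor_id, since_ts):
    """Return the average value of one sensor's readings at or after since_ts.

    Returns None when no readings match.
    """
    row = conn.execute(
        "SELECT AVG(value) FROM reading WHERE sensor_id = ? AND ts >= ?",
        (sensor_id, since_ts),
    ).fetchone()
    return row[0]


# Hypothetical schema: one row per sensor reading, indexed by sensor and time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor_id TEXT, ts INTEGER, value REAL)")
conn.execute("CREATE INDEX reading_by_sensor ON reading (sensor_id, ts)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [("t1", 1, 20.0), ("t1", 2, 22.0), ("t1", 3, 24.0), ("t2", 3, 5.0)],
)

# Average of the two most recent readings from sensor t1 (22.0 and 24.0).
print(recent_average(conn, "t1", 2))
```

The index on `(sensor_id, ts)` is the kind of structure that a flat file would require custom code to maintain.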
Once an RFID reader obtains the data, the database must be robust enough to continuously monitor, catalog, and retrieve it, and to prepare feeding, health, and temperature information locally, even when a failure such as power loss occurs. The main goal is to monitor the animals’ actual food consumption and detect abnormal temperature behavior in order to safeguard their health. High performance concurrent read and write is a common characteristic of such an IoT data management system.
For this scenario, animals need to be monitored in real time, communications must be handled by a monitoring module, and the information obtained must lead to immediate action or be stored locally on the device. In addition, as explained earlier, each animal is connected via RFID to a sensor, and the sensors are connected to an edge device so that data can be analyzed on premises or remotely.
In these embedded systems, the software must run with low power consumption on a variety of microcontrollers or application processors. The system must be programmed to acquire data at the desired integration period and send it wirelessly to the central device in each facility. Therefore, a database that distributes embedded data across heterogeneous devices is required.
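To make the idea of detecting abnormal temperature behavior concrete, here is a minimal sketch that compares each new reading against a rolling mean of the same animal's recent readings. The rolling-baseline rule, the class name `TemperatureMonitor`, and the `window` and `max_delta` parameters are assumptions chosen for illustration; a real deployment would pick a detection rule suited to the species and sensor.

```python
from collections import defaultdict, deque


class TemperatureMonitor:
    """Flag readings that deviate from an animal's recent average (hypothetical rule)."""

    def __init__(self, window=5, max_delta=1.0):
        self.window = window        # number of past readings in the baseline
        self.max_delta = max_delta  # allowed deviation in degrees Celsius
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, animal_id, temp_c):
        """Record a reading and return True if it looks abnormal."""
        past = self.history[animal_id]
        abnormal = False
        if len(past) == self.window:
            baseline = sum(past) / len(past)
            abnormal = abs(temp_c - baseline) > self.max_delta
        if not abnormal:
            # Keep outliers out of the baseline so one spike does not shift it.
            past.append(temp_c)
        return abnormal


monitor = TemperatureMonitor(window=3, max_delta=1.0)
for temp in (38.5, 38.6, 38.4, 40.2, 38.5):
    if monitor.observe("cow-1", temp):
        print("alert: abnormal temperature", temp)
```

No alert is raised until a full window of readings exists for the animal, which trades a short warm-up period for fewer false alarms.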
As a company often has different global sites, the number of edge devices that are connected to each network is practically unlimited, and each new site can add new device modules as required. Additionally, each device can be programmed to manage the data per each location’s particular needs.
IoT data management framework
Process time series data in real time: Through years of engagement with customers, we recognized special needs for IoT edge data management. Therefore, we decided to implement a new generation of embedded database software with a special focus on IoT data streaming, processing, and management. We recognized that sensors need to collect and aggregate a massive amount of data, which must be queried in real time, and for which results must be shared with other embedded systems. We decided to handle this common scenario by implementing two engines: one for data management and the other for IoT stream processing. In our framework, we paid close attention to the benefits and tradeoffs of the cost of data processing and storage on the edge in comparison to relying entirely on a cloud data processing and management approach.
High performance concurrent read/write on flash media: IoT systems interact with real time data, which is generally fresh information that requires deterministic online analysis as well as collection for future analysis, so having both a processing tier and a management tier made sense. Therefore, we designed our database to be an optimization engine that aggregates the large volume of data collected from the sensors, extracts its value, and saves important information while the remaining data is automatically discarded. This aggregation is aimed at minimizing data storage and maintenance cost for embedded systems. We also added a data distribution layer that enables other parts of the system to receive information from both the data processing and data management engines.
We also developed a union layer, which enables streams from different sensors and sources to be joined and directed onward for further processing.
Distribute embedded data across heterogeneous devices: We designed our solution to be highly available, so that a single crash, whatever its cause, cannot stop the system and real time data management can continue. Although archiving and backing up data is an option, that approach alone is not sufficient for mission critical systems. High availability makes data records recoverable and provides peace of mind against general failures.
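The aggregation idea can be sketched as follows: a window of raw readings is reduced to a small summary, out-of-range values are kept as important, and everything else can be dropped. This is an illustrative sketch only, not the product's implementation; the names `WindowSummary` and `summarize` and the `low`/`high` thresholds are assumptions.

```python
from dataclasses import dataclass


@dataclass
class WindowSummary:
    count: int
    minimum: float
    maximum: float
    mean: float


def summarize(readings, low, high):
    """Reduce one window of raw readings to a summary plus out-of-range values.

    Only the returned summary and the out-of-range readings need to be
    stored; the rest of the raw window can be discarded.
    """
    if not readings:
        raise ValueError("window is empty")
    summary = WindowSummary(
        count=len(readings),
        minimum=min(readings),
        maximum=max(readings),
        mean=sum(readings) / len(readings),
    )
    important = [r for r in readings if r < low or r > high]
    return summary, important


summary, important = summarize([1.0, 2.0, 3.0, 10.0], low=0.0, high=5.0)
print(summary, important)
```

Storing one summary row per window instead of every raw sample is what keeps storage and maintenance cost low on flash media.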
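One simple way to picture a union of sensor streams is a merge of several time-ordered sequences into a single ordered sequence. The sketch below uses Python's standard `heapq.merge` for this. The event layout `(timestamp, source, value)` and the function name `union_streams` are assumptions for illustration, and each input stream is assumed to already be sorted by timestamp.

```python
import heapq


def union_streams(*streams):
    """Merge time-ordered (timestamp, source, value) streams into one ordered stream.

    Each input must already be sorted by timestamp. The result is a lazy
    iterator, so events are pulled from the inputs only as they are consumed.
    """
    return heapq.merge(*streams, key=lambda event: event[0])


temperature = [(1, "temp", 38.5), (4, "temp", 38.7)]
feeding = [(2, "feed", 1.2), (3, "feed", 0.8)]

for event in union_streams(temperature, feeding):
    print(event)
```

Because the merge is lazy, it also works with generators that yield readings as they arrive, provided each source delivers them in time order.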
Security is another important factor that we had to pay close attention to. Device application security concerns are expected to remain an important and frequent cause of confirmed breaches. With devices increasingly collecting, storing, and connecting critical data, the risk of a breach grows continuously.
Meanwhile, manufacturers building embedded systems for various markets—including industrial automation, medical devices, power grid, or transportation—all largely face the same device security challenges, as they need to harness the incredible power of data management and connected computing.
SQL injection is one of the most common database attacks. It involves injecting malicious input into a device application so that the database executes queries the developer never intended. Another vulnerability that facilitates injection attacks is a database that is not adequately isolated from the running code. Though isolation may reduce some of these attacks, it is better to look for alternatives.
While the ITTIA DB SQL cockpit tool, known as Web Console, allows developers and end-users to monitor database activities, ITTIA DB Security Agent, DB SEAL, monitors those database activities automatically, isolates databases stored on devices, decides between alternatives to mitigate an attack, and keeps the database contents always available. This proactive monitoring of the data and database by our agent will allow the device to issue an alert, block access, or shut down when data management metrics fall out of the expected range. This is a virtual agent that monitors database activities and metrics in real time, and responds when there is an outage or other security concern.
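To illustrate how an injection attack works and one widely used defense, the sketch below contrasts a query built by string concatenation with a parameterized query. Python's built-in sqlite3 module is used as a stand-in database; the `animal` table and both helper functions are hypothetical and are not part of any product API.

```python
import sqlite3


def find_animal_unsafe(conn, tag):
    # VULNERABLE: the input is pasted into the SQL text, so quote characters
    # in `tag` can change the meaning of the query.
    query = "SELECT name FROM animal WHERE tag = '" + tag + "'"
    return conn.execute(query).fetchall()


def find_animal(conn, tag):
    # Safer: the input is bound as a parameter and is only ever treated as data.
    return conn.execute("SELECT name FROM animal WHERE tag = ?", (tag,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal (tag TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO animal VALUES (?, ?)", [("A1", "Daisy"), ("B2", "Clover")]
)

malicious = "x' OR '1'='1"
print(len(find_animal_unsafe(conn, malicious)))  # every row leaks
print(find_animal(conn, malicious))              # no row has that literal tag
```

Parameter binding addresses this particular class of attack at the application level; it complements, rather than replaces, isolation and monitoring of the database itself.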
At every stage in building a device application, developers grapple with decisions around the best data management software for a successful development and launch of their edge centric IoT system. Such decisions as selection of database software consume significant development and validation time and cost.
Intelligent systems in the IoT age collect and analyze mass quantities of data throughout the system. These valuable findings should become accessible to other edge devices and embedded systems. As the most common IoT data management use case, devices produce a large quantity of continuous raw data which must be collected, analyzed, and distributed. Devices require a data management framework that empowers local applications to process data, capture events and share important information to many other devices deployed on the IoT edge network.
ITTIA DB SQL© is an ideal data management product for manufacturers developing Internet of Things solutions that must collect data from sensor nodes for real time and historical data management.
Sasan Montaseri, Founder, ITTIA