IoT Edge Driver Implemented with Modbus TCP

1. Background of the project and reasons for choosing Modbus TCP

The first challenge I faced in this IoT project was reliably collecting data from the varied equipment on site.

The site had heterogeneous equipment with different manufacturers and communication methods, such as PLCs, sensors, and controllers, and it was necessary to collect the status data of each device in real-time and transmit it to the upper platform.

Industrial sites use a variety of protocols, such as OPC-UA and LS XGT. In this post, I will share my experience implementing an IoT edge communication driver on one of the most widely used of them, Modbus TCP.

Modbus is a representative industrial protocol with a simple structure and a long history in the field. Because devices support it regardless of manufacturer, it can be applied almost anywhere on site.

Whereas Modbus RTU runs over serial lines, Modbus TCP runs over TCP/IP on Ethernet, which gives it advantages in speed and scalability.

In the project, it was crucial to collect data from dozens of facilities at a millisecond level rather than a second level, making network efficiency and processing performance very important.

In particular, because an edge environment can lose network connectivity at any time, data collected while the network is offline must not be lost. Computing resources at the edge were also limited, so CPU and memory usage had to be taken into account. We therefore designed the communication driver around the following objectives.

- High-speed data collection in milliseconds

- Prevention of data loss in case of network disconnection

- Real-time data delivery structure implementation

- Efficient utilization of limited edge resources

- Reliable Write support for facility control

- Providing a flexible integration structure with the upper platform

This project was not just about parsing Modbus packets, but about implementing a data collection architecture that can operate reliably in real industrial environments.

2. Understanding the Modbus TCP protocol

Before starting the implementation, I needed to understand the structure of the Modbus TCP protocol.

Modbus fundamentally follows a Client-Server architecture. Typically, the Master (Client) sends a request and the Slave (Server) responds.

For example, PLCs or sensor devices act as Slaves, while IoT Gateways or data collection applications act as Masters. Modbus operates by reading or writing values stored at specific addresses.

We mainly used the following function codes.

- Holding Register Read (0x03)

- Input Register Read (0x04)

- Coil Read (0x01)

- Single Register Write (0x06)

- Multiple Register Write (0x10)

In the project, we queried data from Holding Registers and also used Write functions to control some equipment. Modbus TCP carries the same Modbus PDU (function code and data) as Modbus RTU, but it prepends an MBAP (Modbus Application Protocol) Header and omits the RTU CRC, since TCP already provides error checking.

The MBAP Header contains the following information.

- Transaction Identifier

- Protocol Identifier

- Length

- Unit Identifier

The Transaction Identifier lets the client match each response to its request, so multiple outstanding requests can be told apart reliably on a TCP connection. Because it runs on TCP, it was also faster than conventional serial communication and gave us more flexibility in network configuration. In the project, an asynchronous TCP processing structure was crucial because we needed to handle requests to multiple facilities simultaneously.
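To make the MBAP framing described above concrete, here is a minimal sketch that builds a Holding Register Read request and parses the response using only Python's standard `struct` module. The function names are my own for illustration and are not taken from the project's driver.

```python
import struct

# MBAP header: transaction id (2 bytes), protocol id (2), length (2), unit id (1),
# all big-endian. Protocol id is 0 for Modbus.
MBAP_FORMAT = ">HHHB"
MBAP_SIZE = struct.calcsize(MBAP_FORMAT)  # 7 bytes
READ_HOLDING_REGISTERS = 0x03


def build_read_holding_request(transaction_id, unit_id, start_address, quantity):
    """Return the bytes of a Read Holding Registers request frame."""
    # PDU: function code + starting address + number of registers
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # The Length field counts the unit identifier byte plus the PDU
    header = struct.pack(MBAP_FORMAT, transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


def parse_read_holding_response(frame):
    """Return (transaction_id, unit_id, registers) from a response frame."""
    transaction_id, _protocol_id, _length, unit_id = struct.unpack(
        MBAP_FORMAT, frame[:MBAP_SIZE]
    )
    function_code = frame[MBAP_SIZE]
    if function_code & 0x80:
        # Exception responses set the high bit and carry an exception code
        raise ValueError(f"Modbus exception code {frame[MBAP_SIZE + 1]}")
    byte_count = frame[MBAP_SIZE + 1]
    data = frame[MBAP_SIZE + 2 : MBAP_SIZE + 2 + byte_count]
    registers = list(struct.unpack(f">{byte_count // 2}H", data))
    return transaction_id, unit_id, registers
```

The transaction id returned by the parser is what allows a response to be paired with the request that produced it when several requests are in flight.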

Additionally, some equipment experienced response delays or interruptions at certain times, so we also considered timeout processing and retry structures.
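As a rough illustration of the timeout and retry idea, the sketch below wraps any awaitable request in a per-attempt timeout and retries a limited number of times. It uses `asyncio` from the standard library; the function name, the default timeout, and the retry count are assumptions for the example, not the values used in the project.

```python
import asyncio


async def request_with_retry(send_request, timeout_s=0.2, retries=2):
    """Await send_request() with a per-attempt timeout, retrying on failure.

    send_request is a zero-argument coroutine function (hypothetical here)
    that performs one Modbus request and returns its result.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(send_request(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            # Short, growing pause so a struggling device is not hammered
            await asyncio.sleep(0.01 * (attempt + 1))
    # All attempts failed: surface the last error to the caller
    raise last_error
```

Bounding both the wait and the number of attempts keeps one slow device from stalling the polling cycle for the others.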

As a result, the key challenge went beyond implementing the protocol itself: we had to design a communication structure that stays reliable when the network fails.

3. Asynchronous Data Collection Architecture

One of the most important requirements of the project was extremely fast data collection at the millisecond level.

If a synchronous structure is used where the upper application sends read requests to the equipment every time it needs data, it will be difficult to achieve the desired performance due to network I/O wait times.

In particular, the field equipment had inconsistent response times, and instantaneous network delays occurred frequently.

To solve this problem, we designed the data collection structure around asynchronous polling.

The structure consists of the following stages.

1. Asynchronous polling

2. Event generation

3. Hand-off to the MQTT Publisher

4. Publish to the upper platform

First, the internal Polling Thread continuously performed asynchronous Read requests to the Modbus equipment at set intervals. The collected data was not simply sent immediately but was converted into Event objects and then passed to an internal Queue.

After that, an independently operating MQTT Publisher was set up to consume the Queue and publish data to the MQTT Broker. The biggest advantage of this structure was that it completely separated the Read and Publish processes.

Even if the MQTT publish speed slows down, the polling thread itself is not affected, allowing for stable maintenance of the data collection cycle. Additionally, we were able to reduce coupling between threads and enable independent scaling of each processing stage.

For example, we designed it so that the polling structure can remain intact even if publishing later moves to Kafka.

In the actual operating environment, we had to poll dozens of devices simultaneously, so we applied a thread pool-based structure and optimized it while continuously monitoring CPU and memory usage.
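The separation between polling and publishing can be sketched with a thread pool and a queue from the Python standard library. This is a simplified illustration under my own assumptions: `read_registers` and `publish` are hypothetical callables standing in for the Modbus read and the MQTT publish, and the fixed `cycles` count exists only so the example terminates.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class RegisterEvent:
    device_id: str
    registers: list
    timestamp: float


STOP = object()  # sentinel that tells the publisher thread to exit


def poll_device(device_id, read_registers, events, interval_s, cycles):
    """Read one device repeatedly and push each result onto the queue."""
    for _ in range(cycles):
        registers = read_registers(device_id)
        events.put(RegisterEvent(device_id, registers, time.time()))
        time.sleep(interval_s)


def publish_events(events, publish):
    """Consume the queue independently of the polling threads."""
    while True:
        event = events.get()
        if event is STOP:
            break
        publish(event)


def run_pipeline(device_ids, read_registers, publish, interval_s=0.01, cycles=3):
    events = queue.Queue()
    publisher = threading.Thread(target=publish_events, args=(events, publish))
    publisher.start()
    # One worker per device; leaving the with-block waits for all pollers
    with ThreadPoolExecutor(max_workers=max(1, len(device_ids))) as pool:
        for device_id in device_ids:
            pool.submit(poll_device, device_id, read_registers, events, interval_s, cycles)
    events.put(STOP)
    publisher.join()
```

Because the pollers only ever call `events.put`, a slow `publish` delays delivery but does not change the polling interval, and swapping the publisher for another transport only touches `publish_events`.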

4. Data Loss Prevention and MQTT Fault Tolerance

In IoT edge environments, network failures occur very frequently. Especially in factory environments, network quality is inconsistent, and due to external interference, the MQTT Broker connections are often interrupted momentarily.

In the initial implementation structure, there was a possibility that all data inside the Queue could be lost if the MQTT connection was interrupted. To address this issue, an additional fault-tolerant structure was designed.

First, when an MQTT Publish failed, the data was not discarded but stored in a separate InfluxDB instance.

In other words, data that could not be published in real time was written to a time-series database that served as temporary storage.

Then, when the MQTT connection is restored, the stored data is read again and retransmitted. This structure minimized data loss even in the event of network failures.

To keep past data from being mixed with real-time data after recovery, a separate MQTT Topic was used.

For example, real-time data is published to the .../REALTIME/... Topic, while failure recovery data is published to the .../OFFLINE/... Topic.

This let the upper platform clearly distinguish the nature of the data, so the real-time analysis logic was not affected. As a result, the project established a structure that preserves data reliably not only during normal real-time processing but also when failures occur.
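The store-and-forward behavior described in this section can be sketched as follows. To keep the example self-contained, an in-memory `deque` stands in for the InfluxDB buffer, `publish` is a hypothetical callable that raises `ConnectionError` when the broker is unreachable, and the `edge-demo` topic prefix is a placeholder, not the project's actual topic layout.

```python
from collections import deque


class StoreAndForwardPublisher:
    """Publish in real time; buffer on failure and replay on a separate topic."""

    def __init__(self, publish, topic_prefix="edge-demo"):
        self._publish = publish      # callable(topic, payload), hypothetical
        self._prefix = topic_prefix  # placeholder prefix for the example
        self._backlog = deque()      # stand-in for the time-series store

    def send(self, device_id, payload):
        """Try a real-time publish; keep the data if the broker is down."""
        try:
            self._publish(f"{self._prefix}/REALTIME/{device_id}", payload)
            return True
        except ConnectionError:
            self._backlog.append((device_id, payload))
            return False

    def replay(self):
        """Resend buffered data in order on the OFFLINE topic; return the count."""
        sent = 0
        while self._backlog:
            device_id, payload = self._backlog[0]
            try:
                self._publish(f"{self._prefix}/OFFLINE/{device_id}", payload)
            except ConnectionError:
                break  # still offline: keep the rest for the next attempt
            self._backlog.popleft()
            sent += 1
        return sent
```

An entry is removed from the backlog only after its publish succeeds, so an interruption in the middle of a replay leaves the remaining data in place.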

5. Event-driven Write processing and ensuring stability

Unlike data reading, the Write function for equipment control requires much higher reliability, because Write requests change equipment status and directly affect how the machines actually operate.

For example, commands like starting and stopping equipment, controlling valves, and motor control can lead to production equipment failures or safety accidents if handled incorrectly. For this reason, the Write function was implemented in a synchronous rather than an asynchronous manner.

When a Write request occurs, the Modbus Write request is executed immediately, and the request Thread is configured to wait for a normal response (ACK) from the equipment.

There were two main reasons for choosing this structure.

The first was a clear guarantee of causality.

The equipment control logic typically operates in a sequential control manner. That is, the next stage of control could only be performed once the previous Write operation was successfully completed.

If processed asynchronously, there was a risk that the next control logic would be executed first even though the actual equipment status had not yet changed.

The second was immediate error handling.

If a Write fails, the upper system must recognize the failure immediately. In Modbus, the normal response to a Single Register Write echoes the request (address and value), while the normal response to a Multiple Register Write echoes only the starting address and the number of registers. In either case, a normal response shows that the device accepted the request, so we still needed to verify that the value was actually reflected.

To do this, after receiving the Write response, we read the register again to confirm that the requested value had been applied. With this structure, equipment control was more stable in the actual operating environment.
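A minimal sketch of this write-then-verify flow is shown below. `RegisterClient` is a hypothetical blocking interface I define only for the example; it is not the API of any particular Modbus library, and the project's real driver may look different.

```python
from typing import Protocol


class RegisterClient(Protocol):
    """Hypothetical blocking client interface, not a real library API."""

    def write_single_register(self, address: int, value: int) -> tuple[int, int]: ...

    def read_holding_register(self, address: int) -> int: ...


class WriteVerificationError(Exception):
    """Raised when the device does not end up holding the requested value."""


def write_and_verify(client: RegisterClient, address: int, value: int) -> int:
    # Blocks until the device answers; a 0x06 response echoes address and value
    echoed = client.write_single_register(address, value)
    if echoed != (address, value):
        raise WriteVerificationError(f"unexpected echo {echoed!r}")
    # The echo alone does not prove the register changed, so read it back
    actual = client.read_holding_register(address)
    if actual != value:
        raise WriteVerificationError(
            f"register {address} holds {actual}, expected {value}"
        )
    return actual
```

Because the function returns only after the read-back matches, the caller can safely issue the next step of a sequential control routine once it returns.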

6. Conclusion

This experience of implementing an IoT communication driver based on Modbus TCP carried more significance than just simple protocol development.

Initially, I thought it was simply about sending and receiving Modbus packets, but as I worked on the actual project, I realized that the most important aspect was how reliably data could be collected and transmitted within a limited edge environment.

These elements were particularly important.

- Asynchronous high-speed data collection

- Network failure response structure

- Data loss prevention

- Stable equipment control

- Optimization of limited resources

- Balancing real-time processing and stability

In actual industrial sites, network quality is not always consistent, and the response speed of equipment is also unpredictable. How reliably the system operates when a failure occurs was therefore much more important than simply implementing functions.

Through this project, I gained experience with several aspects of IoT edge system design: polling structure design, asynchronous event handling, MQTT-based data delivery, and fault recovery architecture.

In the future, I expect to extend this architecture beyond Modbus to other industrial protocol environments such as OPC-UA, MQTT Sparkplug, and BACnet.

This project reaffirmed the importance of architectural design that considers the field environment and operational stability rather than just a simple technology stack, especially in IoT environments.

chnsik
