Meshtastic MQTT Series - Data Processing

March 7, 2025

In the previous article in this series, we discussed data enrichment and how adding information, whether by merging with other data sources or by calculating new details, can support data processing. For the Meshtastic data in particular, adding geospatial context such as country and city makes it much easier to categorize and cluster the data.

The prior post in the series is available at:

https://www.lemuridaelabs.com/post/meshtastic-mqtt-series---data-enrichment

Within the Mesh Scope processing pipeline, the final step is the processing and storage of mesh data, which powers capabilities such as analytics and map-based displays. This step applies any final adjustments as data is stored and archived in the systems best suited to each data type.

The goal of this article is to provide insight into:

  • The types of data being stored and how each is stored
  • The importance of choosing the right storage mechanism for each type of data
  • How storage choices ultimately impact analytics and reporting

This final step is critical to ensuring that the work performed thus far is usable and available in the form, and with the performance, required. It also enables new events to be created from the processed data, and those events can in turn drive entirely new workflows.

Building on top of this data processing is the Mesh Scope API and data stream layer. This layer provides direct access to stored information, either as summarized results or as the processed records themselves. In addition, the data stream portion of the API provides a real-time view of updates being received and processed in the system.

Data Processing Overview

After the earlier steps are complete, the data processing step focuses on data storage, evaluation, and, where appropriate, responding to events of interest. This article reviews each of these in sequence, considering the different types of information received and the actions to be taken.


Mesh IoT Data Stream Processing

A note on the data processing: the goal of Mesh Scope is not to act as the world archive of Meshtastic data, but rather to provide some interesting capabilities and to show design considerations for systems processing unbounded IoT or other data streams. Arguably, the end-to-end process could be simplified; however, the Meshtastic data provides a great foundation for thinking about and solving challenges related to data stream processing, geospatial analytics, and data management.

Core Data Storage

Mesh information, ranging from node details to chat messages and position reports, is captured in the primary relational database. Some information, such as position reports and chat messages, is retained for a relatively short time, while node information is maintained for a longer period.

The primary data stored in the relational database, PostgreSQL, is node information and last known positions, along with aggregate movement tracks. The update process is simple: either a new record is created when a Meshtastic update is received, or fields on an existing record are updated.

PostgreSQL is a powerful database and offers extensions such as PostGIS, which gives the database the ability to query geospatial data. This allows for searches like “all nodes within 25km of this point” or complex aggregations such as “all HELTEC_V3 Meshtastic nodes within a larger bounding box updated in the last week”. The database is updated as messages arrive from the Meshtastic MQTT broker, and a local cache is used to reduce some of the overhead of checking for data updates.
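
For illustration, the upsert-and-query pattern might look roughly like the sketch below, using Python and psycopg2. The table and column names are hypothetical, not the actual Mesh Scope schema.

```python
# Sketch of the upsert-and-spatial-query pattern described above, using psycopg2.
# Table and column names (mesh_nodes, node_id, last_position, ...) are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=meshscope user=meshscope")

def upsert_node(node_id, long_name, hw_model, lat, lon):
    """Create the node record on first sight, otherwise update its fields."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO mesh_nodes (node_id, long_name, hw_model, last_position, updated_at)
            VALUES (%s, %s, %s, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, now())
            ON CONFLICT (node_id) DO UPDATE
               SET long_name = EXCLUDED.long_name,
                   hw_model = EXCLUDED.hw_model,
                   last_position = EXCLUDED.last_position,
                   updated_at = now()
            """,
            (node_id, long_name, hw_model, lon, lat),
        )

def nodes_within(lat, lon, radius_m=25_000):
    """All nodes within radius_m of a point, e.g. the 25 km search mentioned above."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT node_id, long_name
              FROM mesh_nodes
             WHERE ST_DWithin(last_position,
                              ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography,
                              %s)
            """,
            (lon, lat, radius_m),
        )
        return cur.fetchall()
```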

Storage of this data can grow at a rapid and unbounded rate, and although modern databases are capable of holding a large amount of data, there is a limit on how much data we want to keep “hot” and online for Mesh Scope. To manage this, a data archiving process is also in place, which will be detailed in a future article.

Timeseries Data Storage

Time series data differs from data in a traditional database, where records or documents are stored in a structure and subsequently retrieved or queried. With time series data, events are captured at a specific point in time with various fields or attributes. Generally, time series data is retrieved as an aggregate, using calculations such as minimum, maximum, average, or count over time periods. For example, an engine temperature sensor in a vehicle could report coolant temperature every 30 seconds, and a time series database would capture these reports. An application could then look at the current state, recent trends, and historical readings to better understand the health of the engine and the risk of overheating.

The Mesh Scope application receives and generates two main types of time series data. First, the application captures event data, such as message data flow, to look at the overall volume of messages being transmitted. These events are generated automatically as messages are received and are used by functions such as flood detection. The second type is sensor data received from the Meshtastic devices themselves, which includes values such as temperature or battery level.
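
A minimal sketch of how a per-message event point could be written to InfluxDB with the official Python client is shown below. The bucket, measurement, and tag names are illustrative assumptions rather than the actual Mesh Scope configuration.

```python
# Sketch of recording a per-message event point in InfluxDB.
# Bucket, measurement, and tag names are assumptions for illustration.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="lemuridae")
write_api = client.write_api(write_options=SYNCHRONOUS)

def record_message_event(sender, topic, portnum):
    """Emit one event point per processed Meshtastic message."""
    point = (
        Point("mesh_message")
        .tag("sender", sender)    # sending node id
        .tag("topic", topic)      # MQTT topic the message arrived on
        .tag("portnum", portnum)  # Meshtastic protocol / port number
        .field("count", 1)
    )
    write_api.write(bucket="meshscope", record=point)
```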

An example of a chart showing protocol-specific minute-level aggregate counts is shown below. In it we can see the number of messages processed each minute for the various message protocols received.

Timeseries Database Message Type Counts

Mesh Scope uses the InfluxDB time series database to store the various points, and can then query based on the attributes assigned to each record. For example, the sending node and topic are attributes on a message event, so a time series query can provide a view of the most active senders or topics based on message flows over time. Time series databases are very good at aggregations, so it is very fast to roll readings up into time-based buckets. This makes it much easier to calculate message data flows in 5-minute intervals over a 6-hour period, or other types of summary.
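
As a rough example of such an aggregation, the sketch below queries message counts in 5-minute windows over the last 6 hours, grouped by sender, using a Flux query from the Python client. The bucket, measurement, and tag names carry over the same illustrative assumptions as above.

```python
# Sketch of a time-bucketed aggregation: message counts in 5-minute windows
# over the last 6 hours, grouped by sending node.
from influxdb_client import InfluxDBClient

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="lemuridae")

flux = '''
from(bucket: "meshscope")
  |> range(start: -6h)
  |> filter(fn: (r) => r._measurement == "mesh_message" and r._field == "count")
  |> group(columns: ["sender"])
  |> aggregateWindow(every: 5m, fn: sum, createEmpty: false)
'''

for table in client.query_api().query(flux):
    for row in table.records:
        print(row.values["sender"], row.get_time(), row.get_value())
```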

Storing individual events does add up, so time series databases provide the ability for temporal compaction, which is to say, records can be aggregated further back in time. To illustrate this, if we are storing messages from nodes and receive 5 messages a minute, this level of detail may be of interest in the near term. With a policy in place, after two weeks those 5 individual records could be aggregated into a 1-minute rollup, reducing them to a single record. A second policy might further aggregate the data into 5-minute intervals after a month, reducing 25 individual records to 1.
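
A hedged sketch of this kind of downsampling is shown below: it rolls per-message points older than two weeks into 1-minute sums and writes them to a long-term bucket. In practice this would typically run as a scheduled InfluxDB task; the bucket names are placeholders.

```python
# Sketch of a "temporal compaction" job: aggregate older points into 1-minute
# rollups and write them to a separate long-term bucket. Bucket names are illustrative.
from influxdb_client import InfluxDBClient

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="lemuridae")

downsample = '''
from(bucket: "meshscope")
  |> range(start: -30d, stop: -14d)
  |> filter(fn: (r) => r._measurement == "mesh_message" and r._field == "count")
  |> aggregateWindow(every: 1m, fn: sum, createEmpty: false)
  |> to(bucket: "meshscope_rollup_1m")
'''

client.query_api().query(downsample)
```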

Working with unbounded streams of data means carefully planning the data management lifecycle, and time series data management is no exception. Using compaction techniques and functions allows a system to manage long-term data storage while maintaining an analytic data set.

Data Processing & Evaluation

When data is stored, it provides an opportunity to calculate changes over time, and Mesh Scope uses this to calculate a movement track log. A movement track is simply the position changes for a node over time, but this can be a bit tricky when working with GPS-based devices. Combining the variability of GPS reports with the reduced accuracy of Meshtastic public network reporting, there is a reasonable amount of jitter built into the system. This is good for preserving privacy, but it is a consideration when calculating general movement tracks.

For Mesh Scope, the previous position is compared to the new report, and a distance is calculated between the two. To be technical, the distance is calculated across a sphere, since the earth is not flat, but the general result is the distance the device has moved since the prior report.
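
A minimal sketch of that spherical-distance calculation, using the haversine formula, might look like the following.

```python
# Great-circle ("across a sphere") distance between the previous and newly
# reported positions, using the haversine formula.
import math

EARTH_RADIUS_M = 6_371_000  # mean earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Approximate distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```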

Mesh Node Filtered Movement Track

With these distance measures, Mesh Scope captures movements in a track archive, along with the calculated distance, to provide a movement history filtered to meaningful movements. A device sitting still will appear to exhibit some movement given the jitter and variation noted earlier, so movement tracks are filtered based on distance to reduce the “spiderweb” effect of devices bouncing around an area.

Within Mesh Scope, the system filters out nominal movements to avoid clutter, and then provides an additional filter to display only movements above a certain threshold.
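
Putting those two ideas together, a simplified version of the jitter filter might look like the sketch below. It reuses the haversine_m helper from the previous sketch, and the 250 m threshold is an illustrative value rather than the one Mesh Scope uses.

```python
# Sketch of the track filtering described above: only keep points whose distance
# from the previous kept point exceeds a threshold, suppressing GPS jitter.
# Assumes the haversine_m helper defined in the earlier sketch.
MIN_MOVE_M = 250  # illustrative threshold, not the Mesh Scope value

def filter_track(points):
    """points: list of (lat, lon) tuples in time order; returns the filtered track."""
    if not points:
        return []
    kept = [points[0]]
    for lat, lon in points[1:]:
        prev_lat, prev_lon = kept[-1]
        if haversine_m(prev_lat, prev_lon, lat, lon) >= MIN_MOVE_M:
            kept.append((lat, lon))
    return kept
```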

APIs and Streams

The processed and stored data creates the foundation for the APIs and data streams to function, and to power the web interface available. The web interface uses API endpoints in Mesh Scope to retrieve all current nodes, to get additional details of the nodes, and to retrieve recent movement track information.

In addition, the Mesh Scope interface provides a real-time view of data being received and processed, using a Server-Sent Events (SSE) stream to show just how dynamic and active the worldwide Meshtastic network is.
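
A minimal sketch of consuming such an SSE stream from Python is shown below; the endpoint URL is a placeholder, not the actual Mesh Scope stream address.

```python
# Sketch of reading an SSE stream with plain requests: each "data:" line carries
# one JSON-encoded mesh event. The URL is a placeholder for illustration.
import json
import requests

with requests.get("https://example.com/meshscope/stream", stream=True) as resp:
    for raw in resp.iter_lines(decode_unicode=True):
        if raw and raw.startswith("data:"):
            event = json.loads(raw[len("data:"):].strip())
            print(event)
```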

An example of watching the live stream is shown below. As an added bonus, you can listen to the page interpret the passing latitude/longitude coordinates as music, although it may not end up among your top songs.

Real-Time Mesh Position Data Stream

Although the Mesh Scope project is somewhere between an experiment and a demonstration, if others wish to consume the streams, they can certainly be made available. Just let us know!

Event Synthesis

The last topic is the ability of this processing pipeline to identify actions or events of interest, and to create new related actions, alerts, or workflows. In many systems, geospatial processing might be used to flag material that is out of area or sensor values outside an acceptable range. In this case we aren’t evaluating the properties of the various Meshtastic nodes, but we are watching for trigger events.
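
As a hedged illustration of what a trigger-event check could look like, the sketch below watches for a monitored node reporting a position inside a radius around a point of interest. The node id, coordinates, and threshold are made up, and it reuses the haversine_m helper sketched earlier.

```python
# Sketch of a geospatial trigger-event check, not the Mesh Scope implementation.
# Assumes the haversine_m helper defined in the earlier sketch.
WATCHES = {
    # node_id -> (lat, lon, radius_m) of the area being monitored (illustrative values)
    "!a1b2c3d4": (47.6062, -122.3321, 5_000),
}

def check_trigger(node_id, lat, lon, notify):
    """Call notify() when a watched node reports a position inside its watch area."""
    watch = WATCHES.get(node_id)
    if watch is None:
        return
    w_lat, w_lon, radius_m = watch
    if haversine_m(w_lat, w_lon, lat, lon) <= radius_m:
        notify(f"Node {node_id} reported a position inside the watched area")
```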

Lemuridae Labs has a general agent framework integrated into the message processing services, allowing a mesh user to send commands to the agent and to receive out-of-band alerts or information. This will be detailed more fully in a later post, but essentially a user may direct the agent, Qubit, to monitor their location and to send a notification when an event of interest occurs. This notification does not occur via the Meshtastic network, as we do not want to cause problems by injecting data into the mesh, but it is an interesting case study in unifying IoT data streams with agentic systems.
