Technical · Architecture · Tutorial

Understanding Cross-Source Queries in Industrial Data

A deep dive into how cross-source query execution works and why it's the key to unlocking insights from distributed OT systems.

Grey Dziuba · January 15, 2026 · 4 min read

One of the core innovations in Conduit is our cross-source query engine. In this post, we'll explore what cross-source queries are, how they work, and why they're essential for modern industrial data architectures.

The Traditional Approach: Centralization

Historically, when organizations wanted to analyze data from multiple systems, they followed a predictable pattern:

  1. Extract data from source systems
  2. Transform it into a common format
  3. Load it into a central data warehouse or lake

This ETL (Extract, Transform, Load) approach has been the standard for decades. But in industrial environments, it creates significant challenges:

  • Latency: Batch ETL means your "current" data is always hours or days old
  • Cost: Moving and storing petabytes of time-series data is expensive
  • Governance: Duplicated data creates compliance and security concerns
  • Maintenance: ETL pipelines are fragile and require constant attention

Enter Cross-Source Queries

Cross-source query execution flips this model on its head. Instead of moving data to where the query runs, we move the query to where the data lives.

How It Works

When you submit a query to Conduit, here's what happens (see the sketch after this list):

1. Parse the query and identify required data sources
2. Generate optimized sub-queries for each source system
3. Execute sub-queries in parallel against source systems
4. Stream results back to the merge layer
5. Merge, correlate, and return unified results
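
To make this concrete, here is a minimal Python sketch of the flow under some loose assumptions: the plan structure, the per-source connector callables, and the merge function are hypothetical stand-ins, not Conduit's internals or API.

from concurrent.futures import ThreadPoolExecutor

# Hypothetical plan produced by step 1: one sub-query per source, plus join keys.
plan = {
    "subqueries": {
        "splunk": "SELECT timestamp, temperature, asset_id FROM temperatures WHERE ...",
        "mqtt": "SELECT alarm_type, severity, asset_id, start_time, end_time FROM alarms WHERE ...",
    },
    "join_keys": ["asset_id"],
}

def run_cross_source_query(plan, connectors, merge):
    """Steps 2-5: run the per-source sub-queries in parallel, collect the
    partial results as they come back, then merge them into one result set."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(connectors[name], sql)  # step 3: parallel execution
            for name, sql in plan["subqueries"].items()
        }
        partials = {name: f.result() for name, f in futures.items()}  # step 4: gather results
    return merge(partials, plan["join_keys"])  # step 5: merge and correlate

Here connectors maps each source name to a callable that runs a sub-query against that system, and merge is whatever correlation logic the query calls for.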

Let's walk through a concrete example.

Example: Cross-System Correlation

Suppose you want to correlate temperature readings from your Splunk historian with alarm events from MQTT:

SELECT
  p.timestamp,
  p.temperature,
  a.alarm_type,
  a.severity
FROM splunk.temperatures p
JOIN mqtt.alarms a
  ON p.asset_id = a.asset_id
  AND p.timestamp BETWEEN a.start_time AND a.end_time
WHERE p.timestamp > NOW() - INTERVAL '24 hours'

Conduit breaks this into two parallel operations:

Sub-query 1 (Splunk):

SELECT timestamp, temperature, asset_id
FROM temperatures
WHERE timestamp > NOW() - INTERVAL '24 hours'

Sub-query 2 (MQTT):

SELECT alarm_type, severity, asset_id, start_time, end_time
FROM alarms
WHERE start_time > NOW() - INTERVAL '24 hours'

These execute simultaneously. Results stream back to Conduit, where the join operation correlates records by asset and time window.
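
The merge step is essentially an interval join. A simplified sketch in Python, with nested loops standing in for what a real engine would do with streaming and indexing:

def correlate(temps, alarms):
    """Join temperature rows to alarm rows on asset_id and the alarm's time window."""
    results = []
    for t in temps:       # rows from the Splunk sub-query
        for a in alarms:  # rows from the MQTT sub-query
            if (t["asset_id"] == a["asset_id"]
                    and a["start_time"] <= t["timestamp"] <= a["end_time"]):
                results.append({
                    "timestamp": t["timestamp"],
                    "temperature": t["temperature"],
                    "alarm_type": a["alarm_type"],
                    "severity": a["severity"],
                })
    return results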

Query Optimization

Naive cross-source execution would be slow. The key to performance is intelligent query planning:

Predicate Pushdown

Filter conditions are pushed to source systems, reducing data transfer:

Original: SELECT * FROM splunk.temps WHERE value > 100
Pushed:   Splunk executes "value > 100" filter locally
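
One way to picture the planner's decision: split each filter into those the source can evaluate natively (pushed down) and those that must run in the merge layer after rows return. The capability set below is a hypothetical illustration, not Conduit's actual planner.

def split_filters(filters, source_capabilities):
    """Separate filters the source can evaluate from those applied after fetch."""
    pushed, local = [], []
    for f in filters:
        (pushed if f["op"] in source_capabilities else local).append(f)
    return pushed, local

pushed, local = split_filters(
    [{"column": "value", "op": ">", "operand": 100}],
    source_capabilities={">", "<", "=", "BETWEEN"},
)
# pushed -> rendered as "WHERE value > 100" in the Splunk sub-query
# local  -> evaluated by the merge layer on the returned rows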

Projection Pruning

Only requested columns are retrieved:

Original: SELECT temperature FROM splunk.readings
Pruned:   Splunk returns only temperature column, not all 50 columns

Join Reordering

Joins are executed in the optimal order to minimize intermediate result sizes.

Parallel Execution

Independent sub-queries execute in parallel across source systems.

Handling Heterogeneous Data

Industrial systems store data differently:

  • Historians use time-series models (timestamp, tag, value)
  • SCADA uses event-driven models (state changes)
  • MES uses relational models (orders, batches, products)

Conduit's semantic layer maps these different models to a unified schema (see the sketch after this list). When you query "temperature for asset X", Conduit knows:

  • In Splunk, this is index ot_data with field T-101.PV
  • In MQTT, this is topic building1/reactor1/temperature
  • In the OPC-UA server, this is node ns=2;s=Building1.Reactor1.Temperature
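
One way to picture this is a per-signal registry that the query planner consults. The structure below is a hypothetical illustration built from the identifiers above (the asset key "reactor_1" is made up), not Conduit's actual configuration format.

# Hypothetical semantic-layer registry entry for a temperature signal
semantic_map = {
    ("reactor_1", "temperature"): {
        "splunk": {"index": "ot_data", "field": "T-101.PV"},
        "mqtt": {"topic": "building1/reactor1/temperature"},
        "opcua": {"node_id": "ns=2;s=Building1.Reactor1.Temperature"},
    },
}

def resolve(asset, signal, source):
    """Translate a logical (asset, signal) pair into a source-specific address."""
    return semantic_map[(asset, signal)][source]

print(resolve("reactor_1", "temperature", "mqtt"))
# -> {'topic': 'building1/reactor1/temperature'}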

Performance Characteristics

Cross-source queries have different performance characteristics than centralized queries:

| Aspect         | Centralized        | Cross-Source          |
| -------------- | ------------------ | --------------------- |
| Query latency  | Lower (local data) | Higher (network hops) |
| Data freshness | Batch delayed      | Real-time             |
| Storage cost   | High (copies)      | Low (no copies)       |
| Governance     | Complex            | Simple                |

For most operational queries, the slight latency increase is worth the benefits of real-time data and simplified architecture.

When to Use Cross-Source Queries

Cross-source queries excel for:

  • Operational dashboards requiring real-time data
  • Ad-hoc analysis across multiple systems
  • Compliance queries where data residency matters
  • Integration without ETL pipelines

They're less suitable for:

  • Heavy analytics requiring repeated scans of historical data
  • Machine learning training on large datasets

For these use cases, consider using Conduit to populate a purpose-built analytics store.
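
For example, a scheduled job might run a cross-source query through Conduit and land the unified result in Parquet or a warehouse table for heavy analytics. The snippet below is purely illustrative: the endpoint, payload, and response shape are assumptions, not a documented Conduit API.

import pandas as pd
import requests

CONDUIT_URL = "https://conduit.example.com/api/query"  # hypothetical endpoint

def materialize(sql, output_path):
    """Run a cross-source query and write the unified result set to Parquet."""
    resp = requests.post(CONDUIT_URL, json={"sql": sql}, timeout=300)
    resp.raise_for_status()
    rows = resp.json()["rows"]  # assumed response shape
    pd.DataFrame(rows).to_parquet(output_path, index=False)

materialize(
    "SELECT * FROM splunk.temperatures WHERE timestamp > NOW() - INTERVAL '30 days'",
    "temperatures_last_30_days.parquet",
)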

Conclusion

Cross-source query execution is a paradigm shift in industrial data architecture. By moving queries to data instead of data to queries, organizations can get real-time insights without the cost and complexity of centralized data lakes.

Want to see cross-source queries in action? Request a demo and we'll show you cross-system correlation on your own data.
