Natural Query Engine (NQE)
The Natural Query Engine (NQE) is Conduit's interface for querying industrial data using natural language. NQE combines Golden Templates (learned patterns) with multi-LLM AI to deliver reliable, auditable query compilation.
Design Philosophy
AI-Powered with Human Control
NQE uses a hybrid approach that combines:
- Golden Templates: High-confidence learned patterns from user corrections
- Multi-LLM AI: Pluggable AI providers (Claude, OpenAI, Azure, Ollama) for novel queries
- Organizational Context: Department and role-based query filtering
- Confidence Scoring: Every compilation includes a confidence score
This means you get the convenience of natural language with full control over what executes: you can always review and adjust a compiled query before it runs.
Golden Template System
Golden Templates are Conduit's learning mechanism:
- When users correct or improve a compiled query, the correction is captured
- Similar future queries match against the template with a confidence score
- High-confidence matches skip the LLM entirely for faster, more reliable results
- Templates improve over time as your team uses them; a simplified matching sketch follows this list
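The matching step can be pictured as a similarity lookup against stored templates. The Python sketch below is a rough illustration under assumed details: a precomputed query embedding, an in-memory template store, and the 0.85 threshold mentioned later on this page. None of these names are Conduit's actual API.

```python
from __future__ import annotations
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.85  # below this, compilation falls back to the LLM

@dataclass
class GoldenTemplate:
    pattern: str            # canonical phrasing captured from a user correction
    compiled_spl: str       # the corrected SPL to reuse on a high-confidence match
    embedding: list[float]  # vector used for similarity comparison

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_template(query_embedding: list[float],
                   templates: list[GoldenTemplate]) -> tuple[GoldenTemplate | None, float]:
    """Return the best-matching Golden Template and its confidence, or (None, score)."""
    best, best_score = None, 0.0
    for template in templates:
        score = cosine_similarity(query_embedding, template.embedding)
        if score > best_score:
            best, best_score = template, score
    if best is not None and best_score >= CONFIDENCE_THRESHOLD:
        return best, best_score   # high confidence: reuse stored SPL, skip the LLM
    return None, best_score       # low confidence: caller compiles with the LLM
```

A high-confidence hit reuses the stored SPL directly; anything below the threshold is handed to the configured LLM provider for fresh compilation.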
Query Structure
NQE queries follow a consistent pattern:
[Action] [Measure] for [Subject] [Time Range] [Filters]
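For example, "Show average temperature for Tank 1 over the last hour" fills the slots as sketched below. The field names are purely illustrative, not Conduit's internal schema.

```python
# Hypothetical decomposition of one query into the NQE slots above.
parsed = {
    "action": "Show",
    "measure": "average temperature",
    "subject": "Tank 1",
    "time_range": "last hour",
    "filters": [],  # no filters in this query
}
```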
Actions
| Action | Description | Example |
| --------- | ------------------------------------ | ---------------------------- |
| Show | Display current or historical values | "Show temperature..." |
| What is | Get current value | "What is the pressure..." |
| Compare | Side-by-side comparison | "Compare temperatures..." |
| Trend | Time-series visualization | "Trend flow rate..." |
| Alert | When values crossed thresholds | "Alert when temperature..." |
| Correlate | Cross-source relationships | "Correlate pressure with..." |
Measures
Measures are the values you want to retrieve. NQE supports:
- Tag names: Direct references like "Tank1_Temperature"
- Semantic terms: "temperature," "pressure," "flow rate"
- Calculated values: "average," "maximum," "rate of change"
Subjects
Subjects identify what equipment or area you're asking about:
- Equipment: "Tank 1," "Pump 23," "Line 4"
- Areas: "Building A," "Zone 3," "Production floor"
- Systems: "HVAC," "Cooling system," "Main process"
Time Ranges
| Format | Example |
| -------- | ----------------------------------------- |
| Relative | "last hour," "past 24 hours," "yesterday" |
| Absolute | "from 8am to 5pm," "January 15th" |
| Range | "between 6am and noon" |
| Current | "now," "current" |
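These conversational phrases ultimately become SPL time modifiers, as the compiled queries below show. The mapping sketched here is a hypothetical illustration that reuses only phrases and modifiers appearing on this page:

```python
# Hypothetical lookup from conversational time ranges to SPL time modifiers.
# Real resolution also handles absolute dates, explicit ranges, and timezone context.
RELATIVE_RANGES = {
    "now": {"earliest": "-5m"},                         # current-value snapshot
    "last hour": {"earliest": "-1h"},
    "past 24 hours": {"earliest": "-24h"},
    "yesterday": {"earliest": "-1d@d", "latest": "@d"},
    "this month": {"earliest": "-30d"},
}

def to_spl_time(phrase: str) -> dict:
    # Fall back to a 24-hour window when the phrase is unrecognized.
    return RELATIVE_RANGES.get(phrase.lower(), {"earliest": "-24h"})
```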
Query Examples
Basic Queries
What is the current temperature of Tank 1?
NQE compiles to SPL:
index=ot_data tag="Tank1_Temperature" earliest=-5m | stats latest(value) as current_value
Confidence: 0.95 (Golden Template match)
Show pressure readings for Pump 23 over the last hour
NQE compiles to SPL:
index=ot_data tag="Pump23_Pressure" earliest=-1h | timechart span=1m avg(value)
Aggregation Queries
What was the average flow rate for Line 4 yesterday?
NQE compiles to SPL:
index=ot_data tag="Line4_FlowRate" earliest=-1d@d latest=@d | stats avg(value) as avg_flow
Cross-Source Queries
Correlate cooling water temperature with compressor vibration over the last 24 hours
This query spans multiple data sources. Conduit automatically:
- Identifies relevant sources (Splunk for historical data, MQTT for live telemetry)
- Queries each source in parallel
- Uses DuckDB to time-align and correlate the results (see the sketch after this list)
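The correlation step runs in DuckDB. Below is a minimal sketch of what time-aligned correlation could look like, assuming the per-source results have already landed in pandas DataFrames with a timestamp column `ts` and a numeric column `value`; the table and column names are illustrative, not Conduit's internals.

```python
import duckdb
import pandas as pd

def correlate(cooling_temp: pd.DataFrame, vibration: pd.DataFrame) -> float:
    """Bucket both series to 1-minute resolution, join on the bucket, and correlate."""
    con = duckdb.connect()
    con.register("cooling_temp", cooling_temp)  # columns: ts (timestamp), value (float)
    con.register("vibration", vibration)        # columns: ts (timestamp), value (float)
    row = con.execute("""
        WITH t AS (
            SELECT time_bucket(INTERVAL '1 minute', ts) AS bucket, avg(value) AS temp
            FROM cooling_temp GROUP BY bucket
        ),
        v AS (
            SELECT time_bucket(INTERVAL '1 minute', ts) AS bucket, avg(value) AS vib
            FROM vibration GROUP BY bucket
        )
        SELECT corr(t.temp, v.vib) FROM t JOIN v USING (bucket)
    """).fetchone()
    return row[0]
```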
Alert Queries
When did temperature exceed 180°F for Tank 1 this month?
NQE compiles to SPL:
index=ot_data tag="Tank1_Temperature" value>180 earliest=-30d | table _time value
Semantic Resolution
When you use semantic terms like "temperature" instead of exact tag names, NQE performs semantic resolution using pgvector:
Resolution Process
- Hybrid Search: BM25 keyword + pgvector semantic similarity
- Context Matching: Filter by subject (equipment/area) and organizational context
- Ranked Results: Return top matches with confidence scores
- Disambiguation: If ambiguous, prompt for clarification
Example
Query: "Show temperature for Tank 1"
NQE finds these candidate tags:
| Tag | Score | Source |
| ------------------ | ----- | ----------- |
| Tank1_Temperature | 0.95 | Splunk |
| Tank1_TempProbe_1 | 0.87 | MQTT |
| Tank1_Temp_Ambient | 0.72 | MCP Gateway |
If the top match is above the confidence threshold (default: 0.85), it's selected automatically. Otherwise, NQE presents options for the user to choose.
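The hybrid search itself can be expressed as a single Postgres query using pgvector. The sketch below assumes a hypothetical tag_catalog table with a tsvector column tsv and a vector column embedding; the schema, the 0.4/0.6 weighting, and the use of ts_rank_cd as a stand-in for BM25 are all illustrative rather than Conduit's actual implementation.

```python
import psycopg2  # assumes the pgvector extension is installed in Postgres

HYBRID_SEARCH_SQL = """
SELECT tag_name,
       source,
       -- Blend keyword rank with cosine similarity from pgvector.
       0.4 * ts_rank_cd(tsv, plainto_tsquery(%(query)s))
     + 0.6 * (1 - (embedding <=> %(query_embedding)s::vector)) AS score
FROM tag_catalog
WHERE subject = %(subject)s          -- e.g. 'Tank 1'
ORDER BY score DESC
LIMIT 5;
"""

def resolve_tags(conn, query: str, query_embedding: list, subject: str):
    """Return ranked (tag_name, source, score) candidates for a semantic term."""
    with conn.cursor() as cur:
        cur.execute(HYBRID_SEARCH_SQL, {
            "query": query,
            "query_embedding": "[" + ",".join(f"{x:g}" for x in query_embedding) + "]",
            "subject": subject,
        })
        return cur.fetchall()

# Usage (assumed connection string; embedding_for() is a hypothetical helper):
# conn = psycopg2.connect("dbname=conduit")
# resolve_tags(conn, "temperature", embedding_for("temperature"), "Tank 1")
```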
Multi-LLM Support
NQE supports multiple LLM providers through a pluggable architecture:
| Provider | Configuration | Best For |
| ------------------ | --------------------- | ------------------------------------- |
| Claude (Anthropic) | LLM_PROVIDER=claude | Primary recommendation, best accuracy |
| OpenAI | LLM_PROVIDER=openai | Alternative cloud provider |
| Azure OpenAI | LLM_PROVIDER=azure | Enterprise Azure environments |
| Ollama | LLM_PROVIDER=ollama | Self-hosted, air-gapped deployments |
| Mock | LLM_PROVIDER=mock | Testing and development |
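Provider selection is driven by the LLM_PROVIDER setting. Below is a minimal sketch of how an application could read it, with hypothetical provider classes standing in for the real integrations:

```python
import os

class ClaudeProvider: ...        # illustrative placeholders, not Conduit's classes
class OpenAIProvider: ...
class AzureOpenAIProvider: ...
class OllamaProvider: ...
class MockProvider: ...          # deterministic responses for tests

PROVIDERS = {
    "claude": ClaudeProvider,
    "openai": OpenAIProvider,
    "azure": AzureOpenAIProvider,
    "ollama": OllamaProvider,
    "mock": MockProvider,
}

def load_llm_provider():
    """Instantiate the provider named by LLM_PROVIDER (defaulting to mock)."""
    name = os.getenv("LLM_PROVIDER", "mock").lower()
    if name not in PROVIDERS:
        raise ValueError(f"Unknown LLM_PROVIDER: {name!r}")
    return PROVIDERS[name]()
```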
Organizational Context
NQE filters queries based on organizational context:
- Department scoping: Engineers see tags relevant to their department
- Role-based access: Operators vs. managers see different data granularity
- Site filtering: Multi-site deployments filter by the user's assigned sites (see the sketch after this list)
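One way to picture this is as a filter applied to candidate tags before ranking. The structure below is a hypothetical illustration; Conduit's actual context model is richer than two fields.

```python
from dataclasses import dataclass, field

@dataclass
class UserContext:
    department: str
    sites: set[str] = field(default_factory=set)

def apply_org_context(candidates: list[dict], ctx: UserContext) -> list[dict]:
    """Keep only candidate tags visible to the user's department and sites."""
    return [
        tag for tag in candidates
        if tag["department"] == ctx.department and tag["site"] in ctx.sites
    ]
```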
Voice Interface
NQE is designed for voice-first interaction. Operators can ask questions hands-free while working on the plant floor.
Voice Query Tips
- Speak naturally - NQE handles filler words and hesitations
- Use equipment names as they appear on your HMI
- Time ranges can be conversational: "this morning," "since lunch"
Validation & Error Handling
Validation Steps
- Template Check: Does the query match a Golden Template?
- Catalog Check: Do the referenced tags and equipment exist in the catalog?
- Permission Check: Does the user have access to the requested data?
- Feasibility Check: Can the query be executed (e.g., is the time range valid)? See the sketch after this list.
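A minimal version of the catalog, permission, and feasibility checks, with assumed data structures (the Golden Template check happens during compilation and is omitted here):

```python
from __future__ import annotations
from dataclasses import dataclass
from datetime import datetime

@dataclass
class QueryRequest:
    tags: list[str]
    start: datetime
    end: datetime

def validate(request: QueryRequest,
             catalog: set[str],
             allowed_tags: set[str]) -> str | None:
    """Return an error message (matching Common Errors below) or None if the query can run."""
    if any(tag not in catalog for tag in request.tags):
        return "Unknown equipment"
    if not set(request.tags) <= allowed_tags:
        return "Permission denied"
    if request.start >= request.end:        # feasibility: time range must be valid
        return "Invalid time range"
    return None
```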
Common Errors
| Error | Cause | Solution |
| ------------------- | -------------------------------------- | --------------------------------------- |
| "Unknown equipment" | Subject not in catalog | Check equipment name spelling |
| "Ambiguous tag" | Multiple matches, none high-confidence | Be more specific or select from options |
| "No data available" | Tag exists but no data in range | Expand time range or check source |
| "Permission denied" | User lacks access | Contact administrator |
Best Practices
Be Specific
- Instead of: "Show me the data"
- Use: "Show temperature for Tank 1 over the last hour"
Use Equipment Names
- Instead of: "What's the temp over there?"
- Use: "What is the temperature of Reactor 3?"
Specify Time Ranges
- Instead of: "Show pressure history"
- Use: "Show pressure for Pump 12 over the last 24 hours"
Use Aggregations for Large Ranges
- Instead of: "Show all temperatures for the last month"
- Use: "Show daily average temperatures for the last month"
Multi-Source Execution
NQE queries resolve deterministically against the Unified Namespace (UNS): every field in your query maps 1:1 to a UNS path, which in turn maps to a physical data source. The query planner (a simplified sketch follows this list):
- Resolves each NQE field to a UNS node with source bindings
- Groups fields by their bound data source (Splunk, MQTT, OPC-UA)
- Builds a DAG with parallel execution nodes per source
- Executes all source queries simultaneously
- Correlates results through DuckDB compute nodes
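The fan-out/fan-in shape of that plan is sketched below with asyncio and placeholder per-source fetchers; the function names and result shape are assumptions, and the timeout path corresponds to the partial-results behavior described next.

```python
import asyncio

async def fetch_splunk(fields):   # placeholder: would run the compiled SPL
    return {"source": "splunk", "fields": fields, "rows": []}

async def fetch_mqtt(fields):     # placeholder: would read live telemetry topics
    return {"source": "mqtt", "fields": fields, "rows": []}

FETCHERS = {"splunk": fetch_splunk, "mqtt": fetch_mqtt}

async def execute_plan(field_bindings: dict, timeout: float = 5.0) -> dict:
    """Group fields by bound source, query sources in parallel, tolerate stragglers."""
    groups: dict = {}
    for field, source in field_bindings.items():
        groups.setdefault(source, []).append(field)

    tasks = [asyncio.create_task(FETCHERS[source](fields))
             for source, fields in groups.items()]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()                        # slow or unavailable source

    results = [task.result() for task in done]
    status = "complete" if not pending else "partial"
    return {"status": status, "results": results}

# Usage: asyncio.run(execute_plan(
#     {"Tank1_Temperature": "splunk", "Compressor_Vibration": "mqtt"}))
```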
Partial Results
If a source is slow or unavailable, the planner returns partial results from available sources with status: partial. Results update as remaining sources respond. This "progress over perfection" approach means you always get the fastest possible answer.
Query Plan Caching
Identical query topologies are cached in Redis using SHA-256 hashing. Repeat queries with the same field structure resolve instantly without rebuilding the DAG.
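That caching pattern might look like the sketch below, using redis-py and a hypothetical canonicalization of the query's field-to-source bindings; the key prefix and TTL are illustrative.

```python
import hashlib
import json
import redis

r = redis.Redis()  # assumes a reachable Redis instance

def plan_cache_key(field_bindings: dict) -> str:
    """Hash the query topology (fields and their bound sources), not the raw query text."""
    canonical = json.dumps(sorted(field_bindings.items()))
    return "nqe:plan:" + hashlib.sha256(canonical.encode()).hexdigest()

def get_or_build_plan(field_bindings: dict, build_plan) -> dict:
    key = plan_cache_key(field_bindings)
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)               # identical topology: reuse the cached DAG
    plan = build_plan(field_bindings)
    r.set(key, json.dumps(plan), ex=3600)       # illustrative one-hour TTL
    return plan
```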
Query Corrections
When you correct a query result, the correction feeds into the Golden Template system:
- Corrections are stored with pattern group hashes for clustering
- When multiple corrections from different users match the same pattern, the system proposes auto-promotion
- Admin approval converts correction patterns into Golden Template candidates
- Confidence scoring considers: frequency (25%), user diversity (30%), consistency (25%), recency (10%), acceptance rate (10%)
- Future similar queries benefit from the learned pattern automatically; a scoring sketch follows this list
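Using the weights listed above, the promotion confidence reads as a weighted sum. The sketch below assumes each component has already been normalized to the 0 to 1 range.

```python
# Weights from the list above; every component value is assumed to be in 0..1.
WEIGHTS = {
    "frequency": 0.25,
    "user_diversity": 0.30,
    "consistency": 0.25,
    "recency": 0.10,
    "acceptance_rate": 0.10,
}

def promotion_confidence(components: dict) -> float:
    """Weighted score for promoting a correction pattern to a Golden Template candidate."""
    return sum(weight * components.get(name, 0.0) for name, weight in WEIGHTS.items())

# Example: frequent, consistent corrections from several users score highly.
score = promotion_confidence({
    "frequency": 0.8, "user_diversity": 0.9, "consistency": 1.0,
    "recency": 0.6, "acceptance_rate": 0.7,
})
# 0.25*0.8 + 0.30*0.9 + 0.25*1.0 + 0.10*0.6 + 0.10*0.7 = 0.85
```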
NQE in the Conduit Platform
NQE is one of three core capabilities in the Conduit platform:
- NQE (this page): Natural language queries across all data sources
- Context Engine: Builds organizational knowledge from query patterns
- AI Collaboration: Routes questions to domain experts
When used together, the Context Engine learns from NQE usage patterns to suggest better queries and route questions to the people most qualified to answer them.
Next Steps
- Context Engine - How NQE feeds organizational knowledge
- AI Collaboration - Route questions to domain experts
- API Reference - Use NQE programmatically
- Architecture - How NQE fits into Conduit
- Adapters - Configure data source connections