· analytics  · 8 min read

Unlocking Real-Time Analytics with Joule + DuckDB

The Joule + DuckDB integration unlocks advanced streaming analytics by embedding a high-performance, in-memory database directly into the Joule runtime.

Introduction

In this post, we explore how Joule integrates DuckDB’s in-memory engine to deliver real-time streaming analytics within a single, unified runtime, eliminating pipeline complexity and accelerating insights.

Embedding the database directly in the Joule runtime enables a powerful set of features, all within the streaming flow:

  • Contextual analytics
  • Continuous metric computation
  • Dynamic event filtering
  • Real-time enrichment
  • Intelligent alerting

Example: In industrial IoT, Joule can monitor thousands of sensor streams, use DuckDB to continuously compute rolling averages and anomaly scores, and trigger alerts the moment thresholds are breached, all within a single pipeline and without data duplication or delay.
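To make the rolling-average alerting idea concrete, here is a minimal Python sketch. It is not Joule's implementation: the function name, the window size, the threshold and the sensor names are all hypothetical, chosen only to illustrate computing a per-sensor rolling average and raising an alert when it crosses a limit.

```python
from collections import defaultdict, deque

WINDOW = 3          # hypothetical: number of recent readings kept per sensor
THRESHOLD = 80.0    # hypothetical alert threshold on the rolling average


def rolling_alerts(readings, window=WINDOW, threshold=THRESHOLD):
    """Yield (sensor_id, rolling_avg) whenever the rolling average exceeds the threshold."""
    recent = defaultdict(lambda: deque(maxlen=window))
    for sensor_id, value in readings:
        buf = recent[sensor_id]
        buf.append(value)                 # the oldest reading drops out automatically
        avg = sum(buf) / len(buf)
        if avg > threshold:
            yield sensor_id, avg


# Illustrative readings for two made-up sensors.
events = [
    ("pump-1", 70.0),
    ("pump-2", 60.0),
    ("pump-1", 85.0),
    ("pump-1", 95.0),
    ("pump-2", 65.0),
]
alerts = list(rolling_alerts(events))
```

In this sketch only the third reading of pump-1 pushes its rolling average over the limit, so a single alert is produced. In Joule the equivalent logic would be declared in a use case definition rather than hand-coded.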

By combining streaming and analytical workloads in one process, Joule delivers low-latency insights and maximizes operational efficiency.

Ambition of the integration

Modern systems, ranging from applications and mobile technologies to IoT platforms and financial services, continuously generate high-volume, high-velocity data streams. These streams must be ingested, processed, analysed and surfaced in real time to support critical business functions and decision making.

Traditional databases fall short in delivering real-time insights. Their store-then-query model creates delays that can cost businesses agility, revenue, and competitive advantage. Today’s data demands a new model, one that combines streaming and analytics in a single, continuous flow to power faster decisions and smarter operations.

The Joule + DuckDB integration bridges this gap by enabling real-time, contextual insights as data arrives, eliminating the need for external infrastructure or intermediate storage. The result is a lightweight, high-performance solution suitable for edge deployments, low-latency pipelines and operational analytics at scale.

Application enablement

Applying contextual analytics with pre-computed metrics and reference data within a single use case dramatically improves the accuracy, relevance, and timeliness of insights. This unified approach unlocks several key advantages:

  • Timely and context-aware analytics, ensuring that decisions reflect the current state of the system or environment.
  • Dynamic alerting, where thresholds and triggers adapt based on real-time context rather than static rules.
  • Deeper understanding of data dynamics, capturing trends, anomalies and correlations that emerge only when multiple data modalities are analysed together.

This fusion of capabilities allows systems to respond more intelligently to change, making analytics not just faster but more contextually meaningful.

Introducing DuckDB: The In-memory analytics database

DuckDB is an open-source, in-process analytical database optimised for speed and efficiency. Its in-memory, columnar execution engine is purpose-built for OLAP workloads such as aggregations over large tables.

Lightweight and embeddable, DuckDB integrates seamlessly into applications without the overhead of external database infrastructure. It also excels at processing large datasets efficiently by directly querying files stored on local disks or remote file systems such as Amazon S3 without requiring a dedicated database server.

Key DuckDB features Joule leverages:

  • Standard SQL dialect: Provides a familiar SQL syntax for defining metrics, advanced analytics and data access
  • Embeddable: Lightweight and highly configurable, with in-memory columnar storage and efficient analytical processing
  • Durable: Ability to provide initial state and reference data on process startup

Joule: The unified analytics engine

Joule is a low-code streaming analytics engine designed to generate real-time, context-aware alerts, inferences, and insights by processing high-velocity event data. At its core, Joule uses a powerful event analytics architecture that supports complex temporal logic, dynamic event filtering and continuous metric computation.

With Joule, users can define rich analytical behavior through a flexible, low-code use case definition language. This abstraction makes it easy to express window-based computations, conditional logic, stateful event correlation and threshold-based triggers without writing extensive boilerplate code.

Key capabilities include:

  • Real-time contextual alerting: Alerts adapt to live conditions using historical and current data context.
  • Event enrichment and transformation: Raw data streams can be enhanced with reference data or derived attributes.
  • Continuous computation of metrics: Metrics such as rolling averages, counts, or percentiles are maintained over time windows.
  • Dynamic, declarative use case definitions: Complex logic is captured in a human-readable, maintainable format.

By combining these capabilities, Joule streamlines the development and deployment of streaming analytics across use cases like predictive maintenance, anomaly detection, fraud prevention, and real-time personalisation, all without the complexity of traditional stream processing frameworks.

How They Work Together

DuckDB is embedded directly within the Joule runtime, enabling in-process, in-memory analytical computation. Database tables can be initialized through startup data imports or incrementally populated by live event streams, allowing seamless integration between real-time data processing and analytical querying.

This tight integration combines DuckDB’s analytical performance with Joule’s real-time processing engine, allowing users to benefit from both speed and simplicity while building powerful streaming analytics applications.

Key capabilities unlocked by this integration include:

  • Continuous metrics: Users can define real-time, continuously updated metrics (“ticking metrics”) that feed directly into live analytic pipelines or decision logic.
  • Contextual data access: Static or slowly changing reference data can be joined with live streams for enrichment, filtering, transformation and rule-based alerting, enabling richer, more accurate analytics.
  • Telemetry and auditing: Event-level telemetry, system metrics and model evaluation outputs can be captured and stored for retrospective auditing and monitoring.
  • Flexible data access: Processed data can be accessed via a dynamic REST API or exported in industry-standard formats such as Parquet or CSV for external analysis or archival.

Continuous metrics

Staying informed with near real-time business metrics allows organisations to take swift, decisive action. Joule’s SQL-compliant metrics engine supports this by computing and storing metrics through SQL expressions, which can then drive KPIs, alerting and predictive insights.

metrics engine:
  runtime policy:
    frequency: 1
    startup delay: 2
    time unit: MINUTES

  foreach metric compute:
    metrics:
      - name: BidMovingAverage
        metric key: symbol
        table definition: standardQuoteAnalyticsStream.BidMovingAverage
                          (symbol VARCHAR, avg_bid_min FLOAT,
                           avg_bid_avg FLOAT, avg_bid_max FLOAT)
        query:
          SELECT symbol,
          MIN(bid) AS 'avg_bid_min',
          AVG(bid) AS 'avg_bid_avg',
          MAX(bid) AS 'avg_bid_max'
          FROM standardQuoteAnalyticsStream.quote
          WHERE
          ingestTime >= date_trunc('minutes',now() - INTERVAL 2 MINUTES)
          AND ingestTime <= date_trunc('minutes',now())
          GROUP BY symbol
          ORDER BY 1;
        truncate on start: true
        compaction policy:
          frequency: 8
          time unit: HOURS

For more information on continuous metrics go to the Joule documentation.
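As a rough illustration of what the BidMovingAverage query computes, the following Python sketch reproduces the same aggregation outside of Joule and DuckDB. The function name, the tuple layout of the quotes and the sample timestamps are assumptions made for this example; only the windowing logic (a minute-aligned two-minute lookback, grouped by symbol) mirrors the SQL above.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bid_window_stats(quotes, now, lookback_minutes=2):
    """Return {symbol: (min, avg, max)} of bids inside the minute-aligned window."""
    # Mirrors date_trunc('minutes', now()) and date_trunc('minutes', now() - INTERVAL 2 MINUTES).
    end = now.replace(second=0, microsecond=0)
    start = (now - timedelta(minutes=lookback_minutes)).replace(second=0, microsecond=0)
    bids = defaultdict(list)
    for symbol, bid, ingest_time in quotes:
        if start <= ingest_time <= end:
            bids[symbol].append(bid)
    # sorted() plays the role of ORDER BY 1 (the symbol column).
    return {s: (min(v), sum(v) / len(v), max(v)) for s, v in sorted(bids.items())}


# Hypothetical quotes as (symbol, bid, ingestTime) tuples.
now = datetime(2024, 1, 1, 12, 5, 30)
quotes = [
    ("IBM", 100.0, datetime(2024, 1, 1, 12, 3, 10)),
    ("IBM", 102.0, datetime(2024, 1, 1, 12, 4, 50)),
    ("IBM", 999.0, datetime(2024, 1, 1, 12, 5, 20)),   # after the truncated end: excluded
    ("MSFT", 50.0, datetime(2024, 1, 1, 12, 2, 59)),   # before the window start: excluded
    ("MSFT", 52.0, datetime(2024, 1, 1, 12, 3, 0)),
]
stats = bid_window_stats(quotes, now)
```

Because both window bounds are truncated to the minute, a quote that arrives in the current, still incomplete minute is left out until the next run. With the runtime policy shown above, the engine would recompute these values once per minute.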

Contextual Data Access

For Joule applications, contextual (or reference) data is crucial for enabling advanced and insightful stream processing. By seamlessly integrating contextual data with real-time events, the system delivers enriched processing outcomes and better-informed insights.

Within Joule, contextual data comprises both static and dynamic data:

  • Static data: Reference information such as customer contracts, product SKUs, start-of-day FX rates and machine learning models.

  • Dynamic data: Real-time metrics that drive KPI computations, machine learning (ML) engineered features, analytics and other performance-based measures.

enricher:
  fields:
    quote_metrics:
      by metric family: BidMovingAverage
      by key: symbol
      with values: [avg_bid_min, avg_bid_avg, avg_bid_max]
      using: MetricsDB

This example applies the latest metrics for a given symbol to an event. For more information go to the Joule Documentation.
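Conceptually, this kind of enrichment is a keyed lookup against the most recent metric row. The Python sketch below illustrates the idea under stated assumptions: the `enrich` function, the dictionary-based metric store and the pass-through behaviour for unknown symbols are hypothetical and are not taken from Joule's implementation.

```python
def enrich(event, metrics_by_key, key="symbol",
           values=("avg_bid_min", "avg_bid_avg", "avg_bid_max"),
           target="quote_metrics"):
    """Return a copy of the event with the latest metrics for its key attached."""
    enriched = dict(event)                      # leave the original event untouched
    latest = metrics_by_key.get(event.get(key))
    if latest is not None:
        enriched[target] = {name: latest[name] for name in values}
    return enriched                             # assumption: unknown keys pass through as-is


# Hypothetical snapshot of the latest BidMovingAverage row per symbol.
latest_metrics = {
    "IBM": {"avg_bid_min": 100.0, "avg_bid_avg": 101.0, "avg_bid_max": 102.0},
}
ibm = enrich({"symbol": "IBM", "bid": 101.5}, latest_metrics)
unknown = enrich({"symbol": "XYZ", "bid": 1.0}, latest_metrics)
```

Here the IBM event gains a `quote_metrics` field holding the three requested values, while the event for the unknown symbol is returned without it.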

Telemetry and Auditing

Joule provides a built-in telemetry auditing feature, which records inbound and outbound events to support testing, model validation, retraining and other analyses.

The auditing process logs events into an in-memory SQL database at set intervals, allowing users to capture both raw incoming events and processed outgoing events.

stream:
  ...
  telemetry auditing:
    raw:
      clone events: true
      frequency: 10
    processed:
      clone events: false
      frequency: 10
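The sketch below illustrates, in plain Python, one way interval-based auditing with optional event cloning could behave. It is a guess at the general pattern rather than a description of Joule's internals: the `EventAuditor` class is hypothetical, a list stands in for the in-memory SQL table, and treating `frequency` as seconds is an assumption, since the configuration above does not state a unit.

```python
import copy


class EventAuditor:
    """Buffers events and flushes them to a store at a fixed interval."""

    def __init__(self, frequency_seconds, clone_events, store):
        self.frequency = frequency_seconds   # assumption: the interval is in seconds
        self.clone_events = clone_events
        self.store = store                   # stand-in for the in-memory SQL table
        self.buffer = []
        self.last_flush = 0.0

    def record(self, event, now):
        # Cloning protects the audit record from later in-place changes to the event.
        self.buffer.append(copy.deepcopy(event) if self.clone_events else event)
        if now - self.last_flush >= self.frequency:
            self.store.extend(self.buffer)
            self.buffer.clear()
            self.last_flush = now


raw_store = []
auditor = EventAuditor(frequency_seconds=10, clone_events=True, store=raw_store)
event = {"symbol": "IBM", "bid": 101.5}
auditor.record(event, now=3.0)     # buffered: the interval has not elapsed yet
event["bid"] = 0.0                 # a later processing step mutates the event in place
auditor.record(event, now=12.0)    # interval elapsed: both records are flushed
```

With cloning enabled, the first audit record keeps the bid as it was when the event arrived, even though the same object was modified afterwards. Without cloning, both records would point at the same mutated event, which is cheaper but less faithful.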

Flexible Data Access

Data can be stored within the Joule process and can be exported as Parquet files for further analytics use cases.

Joule provides a dynamic OpenAPI to query and export data stored within the in-memory DuckDB database.

Also, Parquet files can be imported into the Joule process to drive user-defined functionality. This feature is enabled by the DuckDB import and export file functionality.

stream:
  ...
  data import:
    parquet:
      - schema: exchange_rates
        table: fxrates
        asView: false
        files: [ 'fxrates.parquet' ]
        drop table: true
        index:
          fields: [ 'ccy' ]
          unique: false
          
      - schema: reference_data
        table: us_holidays    
        asView: true
        files: [ 'holidays.parquet' ]
        drop table: true
        
      - schema: metrics
        table: bid_moving_averages
        files: ['data/parquet/mvavgs-prime.parquet']
        drop table: true
        index:
          fields: [ 'symbol' ]
          unique: false
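To show how an import entry like these could map onto DuckDB, the following Python sketch turns one entry into SQL text. The mapping is an assumption made for illustration and may differ from what Joule actually executes; the `import_statements` function and the `idx_<table>` index naming are hypothetical. The SQL constructs it emits (`CREATE SCHEMA IF NOT EXISTS`, `CREATE TABLE ... AS SELECT`, `read_parquet`, `CREATE INDEX`) are standard DuckDB statements.

```python
def import_statements(entry):
    """Translate one parquet import entry into DuckDB SQL statements.

    Values are interpolated without escaping, so this assumes a trusted config.
    """
    target = f"{entry['schema']}.{entry['table']}"
    kind = "VIEW" if entry.get("asView", False) else "TABLE"
    files = ", ".join(f"'{name}'" for name in entry["files"])
    statements = [f"CREATE SCHEMA IF NOT EXISTS {entry['schema']};"]
    if entry.get("drop table", False):
        statements.append(f"DROP {kind} IF EXISTS {target};")
    statements.append(
        f"CREATE {kind} {target} AS SELECT * FROM read_parquet([{files}]);"
    )
    index = entry.get("index")
    if index and kind == "TABLE":            # indexes only apply to materialised tables
        unique = "UNIQUE " if index.get("unique", False) else ""
        columns = ", ".join(index["fields"])
        statements.append(
            f"CREATE {unique}INDEX idx_{entry['table']} ON {target} ({columns});"
        )
    return statements


# The first import entry from the configuration above, written as a dict.
fxrates = {
    "schema": "exchange_rates",
    "table": "fxrates",
    "asView": False,
    "files": ["fxrates.parquet"],
    "drop table": True,
    "index": {"fields": ["ccy"], "unique": False},
}
sql = import_statements(fxrates)
```

For the `fxrates` entry this yields four statements: create the schema, drop any existing table, load the Parquet file into a table, and index the `ccy` column. An entry with `asView: true` would produce a view over the file instead of a materialised copy.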

Summary

DuckDB and Joule come together to deliver high-performance analytics and real-time streaming in one lightweight, unified platform. Engineered for speed and simplicity, this solution enables smarter insights precisely when you need them.

This tightly integrated design unlocks a powerful set of capabilities:

  • Advanced analytics enablement
  • Continuous metric computation
  • Dynamic enrichment and filtering using contextual data
  • Efficient, flexible data storage and retrieval

The result is a powerful, scalable solution for organisations that need to act on live data, respond rapidly to change and extract maximum value from every event in motion. With its tightly integrated architecture, Joule performs sophisticated, context-aware analytics on high-velocity data, delivering low-latency insights without the overhead of separate components or complex orchestration.

Whether you’re building a real-time fraud detection system, a personalized recommendation engine, or an operational control platform, the Joule + DuckDB integration offers a robust, adaptable foundation for modern, context-driven streaming analytics.

Getting started

We have tailored a tutorial for anyone who wants to learn how to use Joule. The tutorial covers how to set up your environment, build a use case and deploy it.

The tutorial can be found in the Joule documentation.

We’re here to help

Feedback is most welcome, including thoughts on how to improve and extend Joule and ideas for exciting use cases.

Join the FractalWorks Community Forum, where members openly share ideas and best practices and support each other. If you have any further questions on how to become a partner or customer of FractalWorks, do not hesitate to get in touch; we will be happy to talk about your needs.
