Federated in a single SELECT
Join an Iceberg fact table, an OLTP dimension, and an S3 event stream without ETL. One query, one result set.
Query Engine
Run ANSI SQL across Iceberg, Postgres, Snowflake, S3, and 100+ sources without moving a byte. Sub-second cache hits. Full time-travel.
Adaptive result, partition, and query-plan caches turn repeat dashboards into 23-millisecond reads.
Query Iceberg snapshots and Nessie branches by timestamp or commit hash — auditable, reproducible, reversible.
One query, three engines, one result set.
// federation
The query planner inspects every catalog in the FROM clause, pushes predicates and projections into each source adapter, and lets native engines handle the work they are best at — Postgres indexes, Iceberg partition pruning, Snowflake columnar scans.
Results stream back into a single coordinator that stitches joins, applies group-by, and emits one result set. No copy step, no staging bucket, no lag window between the systems you already run and the answers your team needs.
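As a rough illustration of the flow above, the sketch below joins three sources in one statement. The catalog, schema, table, and column names (`iceberg.sales.orders`, `postgres.crm.customers`, `s3.events.clicks`) are hypothetical, and the three-part `catalog.schema.table` naming is an assumption based on the Trino-style conventions mentioned elsewhere on this page. Each source is filtered and aggregated on its own first, which gives a planner simple predicates it could push down, and avoids double counting when the results are joined.

```sql
-- Hypothetical catalogs and tables; substitute your own sources.
WITH order_totals AS (
    SELECT customer_id, SUM(revenue) AS revenue
    FROM iceberg.sales.orders                  -- Iceberg fact table
    WHERE order_date >= DATE '2024-01-01'      -- candidate for partition pruning
    GROUP BY customer_id
),
click_totals AS (
    SELECT customer_id, COUNT(*) AS clicks
    FROM s3.events.clicks                      -- S3 event stream
    WHERE event_date >= DATE '2024-01-01'
    GROUP BY customer_id
)
SELECT
    c.segment,
    SUM(o.revenue)             AS revenue,
    SUM(COALESCE(k.clicks, 0)) AS clicks
FROM postgres.crm.customers AS c               -- OLTP dimension
JOIN order_totals AS o
  ON o.customer_id = c.customer_id
LEFT JOIN click_totals AS k
  ON k.customer_id = c.customer_id
WHERE c.is_active = TRUE                       -- candidate for pushdown to Postgres
GROUP BY c.segment;
```

Because `customer_id` is assumed to be unique in the dimension table, each customer contributes exactly one row to the final aggregation.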
// caching
DXData memoizes work at every tier of the planner. The result cache replays identical queries in a few milliseconds, the partition cache short-circuits scans on stable segments, and the plan cache skips parsing and optimization on hot templates.
Every layer is keyed by dataset, invalidated on writes, and tunable per workspace — so overnight refreshes and live dashboards can share the same engine without fighting each other.
// time-travel
Iceberg tracks an immutable history of every commit to every table, and Nessie layers git-style branches on top of that history. The query engine speaks both: point at an Iceberg snapshot ID, a Nessie commit hash, a wall-clock timestamp, or a named branch, and the planner will resolve the exact file set.
Reproduce a bug from last Tuesday, preview a schema change on a branch, or audit which exact rows powered a regulatory report — all through SQL, no infrastructure gymnastics.
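A minimal sketch of what such queries could look like, assuming DXData accepts the same `FOR TIMESTAMP AS OF` and `FOR VERSION AS OF` clauses that Trino's Iceberg connector uses. That is an assumption, not something this page confirms. The table name and snapshot ID are placeholders, and branch addressing is left out because the exact syntax is not specified here.

```sql
-- Read the table as it existed at a wall-clock timestamp.
SELECT COUNT(*)
FROM iceberg.sales.orders
  FOR TIMESTAMP AS OF TIMESTAMP '2024-05-14 09:00:00 UTC';

-- Read one specific snapshot by ID (placeholder value).
SELECT COUNT(*)
FROM iceberg.sales.orders
  FOR VERSION AS OF 8954597067493422955;
```

Running the same aggregate against two points in time is a simple way to check what changed between them.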
// sql.compat
Existing ANSI SQL queries, including window functions, CTEs, recursive queries, and grouping sets, run unmodified. The engine reports the same standard error codes and accepts the same planner hints as Trino.
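For instance, statements like the following use only constructs named above. This is a sketch: `mart.orders` and its `region` and `revenue` columns come from the use-case snippet on this page, while `product` and `order_date` are assumed columns, and `DATE_TRUNC` is a widely supported function rather than strict ANSI.

```sql
-- CTE plus a window function: running revenue per region by month.
WITH monthly AS (
    SELECT
        region,
        DATE_TRUNC('month', order_date) AS order_month,
        SUM(revenue)                    AS revenue
    FROM mart.orders
    GROUP BY region, DATE_TRUNC('month', order_date)
)
SELECT
    region,
    order_month,
    revenue,
    SUM(revenue) OVER (
        PARTITION BY region
        ORDER BY order_month
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_revenue
FROM monthly
ORDER BY region, order_month;

-- Grouping sets: per region and product, per region, and a grand total.
SELECT region, product, SUM(revenue) AS revenue
FROM mart.orders
GROUP BY GROUPING SETS ((region, product), (region), ());
```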
On top of ANSI, DXData ships opinionated extensions for the work analysts actually do: MATCH_RECOGNIZE for sessionization, a geospatial toolbox, window-frame exclusions, and array and map UDFs that compile to efficient vectorized operators.
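Sessionization with MATCH_RECOGNIZE might look like the sketch below. It follows the row pattern syntax as Trino implements it and assumes DXData accepts the same form. The `events` table with `user_id` and `ts` columns is taken from the use-case snippet on this page. The 30-minute inactivity gap and the pattern variable names are illustrative choices.

```sql
-- One output row per session: a session is a run of events for a user
-- in which each event follows the previous one within 30 minutes.
SELECT user_id, session_start, session_end, events_in_session
FROM events
MATCH_RECOGNIZE (
    PARTITION BY user_id
    ORDER BY ts
    MEASURES
        FIRST(ts) AS session_start,
        LAST(ts)  AS session_end,
        COUNT(*)  AS events_in_session
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    -- start_event has no condition, so it matches any row.
    PATTERN (start_event next_event*)
    DEFINE
        next_event AS ts - PREV(ts) <= INTERVAL '30' MINUTE
);
```

An event that arrives more than 30 minutes after the previous one fails the `next_event` condition, ends the current match, and becomes the `start_event` of the next session.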
// benchmarks
// how it works
Every query traverses the same deterministic pipeline — the difference between 23 milliseconds and 2 seconds is which stages can short-circuit on cached work.
ANSI SQL compiled into a validated, typed logical plan with source-aware identifiers.
Cost-based optimizer reorders joins, pushes predicates down, and prunes partitions.
Physical plan splits work into stages across federated source adapters and workers.
Stages stream rows in parallel with adaptive parallelism and workload isolation.
Results, partitions, and plans are memoized per-dataset with automatic invalidation.
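One way to see these stages for a given query, assuming DXData exposes a Trino-style `EXPLAIN` statement (an assumption, since this page does not list the supported statements), is to ask for the plan instead of the result. The query is reused from the use cases on this page.

```sql
-- Show the plan without running the query.
EXPLAIN
SELECT region, SUM(revenue)
FROM mart.orders
GROUP BY 1;

-- Run the query and report per-stage statistics alongside the plan.
EXPLAIN ANALYZE
SELECT region, SUM(revenue)
FROM mart.orders
GROUP BY 1;
```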
// connectors
Twelve of the most common sources below — the full catalog spans 100+ native connectors across databases, warehouses, SaaS tools, object stores, and streams.
// use cases
Back Looker, Tableau, and Superset with cached results that stay fresh through automatic invalidation.
SELECT region, SUM(revenue) FROM mart.orders GROUP BY 1;
Explore raw events and production tables side-by-side without waiting on ingestion or modeling cycles.
SELECT * FROM postgres.app.users
WHERE email LIKE '%@acme.com' LIMIT 100;
Embed low-latency SQL directly inside product features: search, personalization, billing rollups.
SELECT COUNT(*) FROM events
WHERE user_id = :id AND ts > NOW() - INTERVAL '1' DAY;
// faq
// related capabilities
Ready when you are
Point the query engine at your warehouse and lake — no migration, no ingestion, no scheduled maintenance window required.