From zero to a federated query
You'll install the CLI, authenticate, register a data source, run a query that joins your warehouse with an external Postgres, and create a branch — all in under ten minutes.
Install the CLI
The DXData CLI is a single static binary with no runtime dependencies. The installer detects your platform and drops the binary into /usr/local/bin (or the equivalent on Windows).
Prefer a package manager? We publish formulas for Homebrew, apt, and winget. See the CLI reference for alternate install paths.
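If you want to confirm your environment before and after installing, a small preflight check can help. The sketch below only tests whether commands resolve on your PATH; the helper names have_cmd and check_tools are hypothetical and not part of the DXData CLI.

```shell
# Return success if the named command resolves on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool: "ok: <tool>" or "missing: <tool>".
check_tools() {
  for tool in "$@"; do
    if have_cmd "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

# curl is needed to fetch the installer; dxdata should resolve once it has run.
check_tools curl dxdata
```

Seeing "missing: dxdata" before the install is expected. If it still appears afterwards, the install directory is probably not on your PATH.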
# macOS / Linux
curl -fsSL https://get.dxdata.io | sh

# Then verify
dxdata --version
Authenticate
dxdata login kicks off an OIDC device-code flow. You'll see a short code in your terminal and a browser window opens to your workspace's login page. Paste the code, approve the session, and you're done.
Tokens are stored in your OS keychain — never on disk as plaintext — and refreshed automatically. CI/CD environments should use workspace API keys instead; see the API authentication guide.
dxdata login
# Opens a browser window.
# Paste the short code shown in your terminal.
Connect your first data source
DXData federates queries across external systems. Describe a source once, grant the necessary read credentials, and the engine's optimizer handles predicate pushdown and parallel scans for you.
Save the YAML below and apply it with dxdata apply ~/.dxdata/sources/analytics-pg.yaml. Sources live in your workspace config, so colleagues with access will see them too.
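If you script this step, it may be worth guarding against a mistyped path before calling the CLI. The following is a minimal sketch: dxdata apply is the command from this guide, while the apply_source wrapper is a hypothetical helper.

```shell
# Apply a source definition only if the file exists and is readable.
# apply_source is a hypothetical wrapper; `dxdata apply <file>` comes from this guide.
apply_source() {
  file="$1"
  if [ ! -r "$file" ]; then
    echo "source file not found: $file" >&2
    return 1
  fi
  dxdata apply "$file"
}
```

You would call it as apply_source ~/.dxdata/sources/analytics-pg.yaml. It returns a non-zero status without invoking the CLI when the file is missing.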
# ~/.dxdata/sources/analytics-pg.yaml
kind: source
name: analytics-pg
type: postgresql
host: analytics.internal
port: 5432
database: warehouse
auth:
  secret: pg-analytics-ro
Run your first query
Every identifier in DXData starts with a catalog. lake is your native Iceberg catalog; the source you just registered is queryable as analytics-pg. Joins across them Just Work — no extracts, no replication.
Run it with dxdata query --file query.sql or paste it into the Worksheets UI. The planner will show you which sub-query runs where.
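To make the catalog-first naming rule concrete, here is a naive shell sketch that pulls the catalog out of a fully qualified identifier. The catalog_of helper is hypothetical and simply takes everything before the first dot, so it assumes catalog names contain no dots.

```shell
# Print the catalog: the part of an identifier before its first dot.
# catalog_of is a hypothetical helper, not a DXData command.
catalog_of() {
  printf '%s\n' "${1%%.*}"
}

catalog_of "lake.events"                    # prints: lake
catalog_of "analytics-pg.public.customers"  # prints: analytics-pg
```

The first call resolves to the native Iceberg catalog and the second to the Postgres source registered earlier, which is how a single query can reference both.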
-- Your first cross-source query
SELECT c.region, COUNT(*) AS events_today
FROM lake.events e
JOIN analytics-pg.public.customers c ON c.id = e.customer_id
WHERE e.ts >= CURRENT_DATE
GROUP BY c.region
ORDER BY events_today DESC;
Create a data branch
Branches in DXData are powered by Nessie and work like Git branches for your catalog. Every commit, table create, and schema change is a reversible operation scoped to a named branch.
This is the foundation of every safe migration, dbt-style dev workflow, and incident recovery flow you'll build on top of DXData. Read more in Core concepts.
# Create a branch off main
dxdata branch create exp/region-cohort --from main

# Run a destructive-looking migration safely
dxdata query --branch exp/region-cohort \
  --sql "CREATE TABLE lake.region_cohort AS ..."

# Merge when you are happy
dxdata branch merge exp/region-cohort --into main
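The three commands above can be chained so that the merge only happens when the query succeeds. The sketch below uses only the subcommands and flags shown in this guide; the migrate_on_branch function is a hypothetical wrapper, and it assumes the CLI exits with a non-zero status when a query fails.

```shell
# Run one SQL statement on a fresh branch and merge it only if the query succeeds.
# migrate_on_branch is a hypothetical wrapper around commands shown in this guide.
# Assumption: dxdata exits non-zero when a command fails.
migrate_on_branch() {
  branch="$1"
  sql="$2"

  # Stop early if the branch cannot be created.
  dxdata branch create "$branch" --from main || return 1

  if dxdata query --branch "$branch" --sql "$sql"; then
    dxdata branch merge "$branch" --into main
  else
    echo "query failed; $branch left unmerged" >&2
    return 1
  fi
}
```

Call it with a branch name and a SQL statement, for example migrate_on_branch exp/region-cohort followed by your quoted statement. On failure the branch is left in place so you can inspect it, and nothing is merged into main.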