Federated query is one of those features that demos beautifully and then disappoints in production. We have been through two major planner rewrites to fix that. Here is what stuck.
Push down aggressively, then verify
Our old planner tried to be clever about which predicates were safe to push down into a remote source. Our new planner pushes everything and then verifies the result against a locally evaluated sample. If the two disagree, we fall back to the unpushed plan. The verification step is cheap and catches the dialect edge cases that used to be unfixable in the planner.
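To make the verify step concrete, here is a minimal sketch in Python. It is illustrative only: the names `pushdown_agrees` and `choose_plan` are hypothetical, and the remote dialect is modeled as a plain Python callable rather than a real source. The example assumes a case-insensitive collation on the remote side as the dialect edge case.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


def pushdown_agrees(sample: List[Row], local: Predicate, remote: Predicate) -> bool:
    """True if the remote's semantics match local evaluation on every sampled row."""
    return all(local(row) == remote(row) for row in sample)


def choose_plan(sample: List[Row], local: Predicate, remote: Predicate) -> str:
    """Keep the pushed plan only when the sample shows no disagreement."""
    return "pushed" if pushdown_agrees(sample, local, remote) else "unpushed"


# Hypothetical sample of unfiltered rows pulled from the source.
sample = [{"name": "abc"}, {"name": "ABC"}, {"name": "xyz"}]


def local_eq(row: Row) -> bool:
    # Engine semantics (assumed): case-sensitive equality.
    return row["name"] == "abc"


def remote_case_insensitive(row: Row) -> bool:
    # Assumed dialect quirk: the remote compares strings case-insensitively.
    return row["name"].lower() == "abc"


print(choose_plan(sample, local_eq, remote_case_insensitive))  # unpushed
print(choose_plan(sample, local_eq, local_eq))                 # pushed
```

The check is only as good as the sample: a sample without an "ABC" row would not catch this particular disagreement, so how the sample is drawn matters as much as the comparison.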
One coordinator per source
Running a single coordinator across all sources made for nice topology diagrams and terrible tail latencies. We now spin up a per-source coordinator with its own connection pool and its own adaptive timeout.
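The shape of a per-source coordinator can be sketched as follows. This is a simplified illustration under stated assumptions: `SourceCoordinator` and `coordinator_for` are hypothetical names, a bounded semaphore stands in for the connection pool, and the adaptive timeout is assumed to be a multiple of the recent p95 latency clamped to a fixed range. The post does not specify the actual timeout policy.

```python
import threading
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict


@dataclass
class SourceCoordinator:
    source: str
    pool_size: int = 8           # assumed per-source connection limit
    min_timeout_s: float = 0.5   # assumed lower clamp
    max_timeout_s: float = 30.0  # assumed upper clamp, also the cold-start value
    multiplier: float = 3.0      # assumed headroom over the observed p95
    latencies: Deque[float] = field(default_factory=lambda: deque(maxlen=100))

    def __post_init__(self) -> None:
        # Stand-in for a real connection pool: caps concurrent remote calls.
        self.slots = threading.BoundedSemaphore(self.pool_size)

    def record(self, latency_s: float) -> None:
        """Remember a completed call's latency (only the most recent 100 are kept)."""
        self.latencies.append(latency_s)

    def timeout(self) -> float:
        """Timeout derived from recent latencies for this source only."""
        if not self.latencies:
            return self.max_timeout_s
        ordered = sorted(self.latencies)
        p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
        return max(self.min_timeout_s, min(self.max_timeout_s, self.multiplier * p95))


_coordinators: Dict[str, SourceCoordinator] = {}


def coordinator_for(source: str) -> SourceCoordinator:
    """Return the coordinator for a source, creating it on first use."""
    if source not in _coordinators:
        _coordinators[source] = SourceCoordinator(source)
    return _coordinators[source]
```

Because each coordinator keeps its own latency history, a slow source ends up with a long timeout without loosening the timeout for a fast one, which is the property a single shared coordinator cannot provide.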
Caching is not optional
We cache the result of every remote scan for up to 60 seconds, keyed on (source, filter, columns). In steady-state BI workloads, that simple rule pushes the cache hit rate above 80% and takes most of the wall-clock cost out of the federated path.
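The caching rule is small enough to sketch in full. The following is a minimal in-memory version, assuming the filter is available as an already-normalized string and that entries simply expire 60 seconds after they are stored; `ScanCache` and `get_or_scan` are hypothetical names, and eviction of stale entries is left out for brevity.

```python
import time
from typing import Any, Callable, Dict, Sequence, Tuple

CacheKey = Tuple[str, str, Tuple[str, ...]]


class ScanCache:
    def __init__(self, ttl_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_s = ttl_s
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[CacheKey, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_scan(self, source: str, filter_text: str,
                    columns: Sequence[str], scan: Callable[[], Any]) -> Any:
        """Return a cached scan result if it is fresh, else run the scan and store it."""
        # Column order is kept in the key because it affects the result's shape.
        key: CacheKey = (source, filter_text, tuple(columns))
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_s:
            self.hits += 1
            return entry[1]
        self.misses += 1
        result = scan()
        self._entries[key] = (now, result)
        return result
```

A call looks like `cache.get_or_scan("orders_db", "region = 'EU'", ["id", "total"], run_remote_scan)`, where `run_remote_scan` is whatever zero-argument callable performs the remote scan. The hit rate then depends heavily on how well filters are normalized before they reach the key: two textually different but equivalent filters are separate entries in this sketch.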
Written by
Kai Lindstrom
Engineering, Pipelines at DXData.