
Kimball vs Inmon vs Data Vault 2.0: Data Warehouse Architecture Guide

Every data team eventually walks into the same room.

Someone is convinced Kimball is dead and cloud storage fixes everything. Someone else is just as convinced that Inmon is the only serious enterprise architecture. A third voice slides in with "Data Vault 2.0 — hubs, links, satellites, audit-ready, future-proof." Forty minutes later nothing has been decided, six weeks later nothing has been built, and the dashboards the business asked for in Q1 are now a Q3 problem.

Here's my promise: this post will help you pick an approach the way an architect picks a structure, not the way a sports fan picks a jersey.

The reason this argument runs for six weeks

The design phase has been broken for two decades. Joe Reis's State of Data Engineering survey of 1,101 practitioners earlier this year puts 89% in active pain with their modeling approach — and the reasons are always the same three: no time, no clear ownership, tools that punish anyone who tries.

So nobody prototypes the three options against a real source estate inside a sprint. The choice goes to whoever is loudest at the whiteboard, and that's how warehouses end up over-engineered for governance nobody needed, or under-engineered for the audit trail everyone will need next year.

The solution isn't more conviction. It's the ability to stand up all three architectures against the same source landscape, look at the seams, and pick the one that survives the constraints you actually live with. That's the work TalkingSchema was built for — not pretty diagrams, but compressing the design phase from a multi-month exercise into one collaborative conversation with canvases you can interrogate before a single dbt model gets written.

The company we're going to walk through

Imagine a mid-market commerce business with 30 operational tables across eight subsystems:

  • CRM + catalog own identity, B2B accounts, customer addresses, support cases, products, SKUs, categories, and suppliers.
  • Sales + billing + finance own carts, orders, payments, invoices, refunds, chargebacks, and adjustments.
  • Fulfillment + inventory + marketing own warehouses, carriers, shipments, stock movement, reservations, campaigns, promotions, and attribution.

That is the real problem: every team owns a slice of the truth, and no two slices define "customer" or "order" the same way.
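To make the definitional drift concrete, here is a minimal Python sketch. Every row, field name, and rule in it is hypothetical and invented for illustration; none of it comes from the source schema. It assumes CRM counts identity records, sales counts anyone who placed an order (guests included), and marketing counts attributed email addresses.

```python
# Hypothetical sample rows; these values are NOT taken from the article's schema.
crm_identities = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "bo@example.com"},
    {"customer_id": 3, "email": "cy@example.com"},
]
sales_orders = [
    {"order_id": 10, "customer_id": 1, "guest_email": None},
    # Guest checkout: sales knows this buyer, CRM does not.
    {"order_id": 11, "customer_id": None, "guest_email": "dee@example.com"},
]
marketing_attribution = [
    {"email": "ANA@example.com", "campaign": "spring"},
    {"email": "eli@example.com", "campaign": "spring"},
]


def crm_customers(rows):
    """Assumed CRM rule: a customer is any identity record."""
    return {r["customer_id"] for r in rows}


def sales_customers(rows):
    """Assumed sales rule: a customer is anyone who ordered, guests included."""
    return {
        r["customer_id"] if r["customer_id"] is not None else r["guest_email"]
        for r in rows
    }


def marketing_customers(rows):
    """Assumed marketing rule: a customer is any attributed email, lowercased."""
    return {r["email"].lower() for r in rows}


counts = {
    "crm": len(crm_customers(crm_identities)),
    "sales": len(sales_customers(sales_orders)),
    "marketing": len(marketing_customers(marketing_attribution)),
}
print(counts)  # {'crm': 3, 'sales': 2, 'marketing': 2}
```

The three functions return different counts and, more importantly, different key types (integer IDs, a mix of IDs and emails, lowercased emails). Reconciling those keys is the integration work any of the three architectures has to do somewhere.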

So the architecture conversation has to start here — with the operational reality the warehouse must absorb, not a whiteboard sketch or an end-state mart.

Source schema

Figure: Source Schema, E-commerce OLTP (ERD with DBML notes)

Spend a moment with the ERD. Read the DBML notes. Notice which subsystem owns money, which owns identity, and which only knows half of any given customer's story. Every methodology decision downstream is a different answer to: how do we glue these eight subsystems into one coherent business?

What we're going to do next

Every analytical platform has to do the same three jobs: capture what happened, integrate it into shared meaning, and serve it to humans and machines. The "methods" differ mostly in which job they optimize first. We're going to walk this exact source estate through three architectures and then converge on the mart layer they all eventually feed.
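As a rough illustration, the three jobs can be sketched as three small Python functions. This is a toy under stated assumptions, not any methodology's prescribed design: the function names, the row shapes, and the choice of a lowercased email as the shared customer key are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def capture(source, rows, loaded_at):
    """Job 1: record what happened, unchanged, plus where and when it was loaded."""
    return [
        {"source": source, "loaded_at": loaded_at, "payload": dict(r)}
        for r in rows
    ]


def integrate(raw_rows):
    """Job 2: map source-specific fields onto one shared business key."""
    integrated = []
    for r in raw_rows:
        payload = r["payload"]
        integrated.append(
            {
                # Assumption: a trimmed, lowercased email is the shared key.
                "customer_key": payload["email"].strip().lower(),
                "amount": payload.get("amount", 0.0),
                "source": r["source"],
            }
        )
    return integrated


def serve(integrated_rows):
    """Job 3: shape integrated rows into something a dashboard can read."""
    totals = defaultdict(float)
    for r in integrated_rows:
        totals[r["customer_key"]] += r["amount"]
    return dict(totals)


# Hypothetical sample data from two hypothetical sources.
loaded_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
raw = capture("sales", [{"email": "Ana@example.com", "amount": 40.0}], loaded_at)
raw += capture("billing", [{"email": "ana@example.com ", "amount": 2.5}], loaded_at)

revenue_by_customer = serve(integrate(raw))
print(revenue_by_customer)  # {'ana@example.com': 42.5}
```

Keeping `capture` free of business rules means the raw rows can be re-integrated later if the shared key changes. How much structure to put into each of the three steps, and in what order, is the part the architectures disagree on.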

Architecture deep dives

The deep dives below are meant to be read as one sequence. Each page takes the same source estate through a different architectural instinct, uses TalkingSchema to make the model visible, and asks the practical questions you would want answered before committing to the design.
