OLTP vs OLAP: Schema Design, CDC, DuckDB, and the Decision Framework
It was a Tuesday afternoon when the CTO sent the Slack message that every data engineer dreads:
"The claims dashboard has been loading for 47 seconds. Our customers are complaining. What happened?"
What happened was straightforward. A data analyst ran a reasonable aggregation — GROUP BY claim_type, EXTRACT(year FROM incident_date) — against production PostgreSQL: the same cluster that was already handling ~3,200 concurrent insurance claims from adjusters across eight states.
The query locked an index, response times spiked, and on-call was paged at 2 AM.
What failed wasn't the schema or the SQL. It was the assumption: that a system tuned for thousands of short, correct transactions per minute could also scan years of history for analytics without hurting live operations.
That is the OLTP vs OLAP problem in one outage. It is not abstract — it costs engineering hours, customer trust, and real infrastructure budget — and it usually begins with a schema decision from before analytics was on the roadmap.
If your analytics team is running queries on your production OLTP database, you have a problem — and a deadline. The question isn't whether to separate them. It's how soon.
OLTP and OLAP are different jobs, not just different speeds
The terms come from a 1993 paper by E.F. Codd (who also gave us the relational model) and were formalized into practical architecture by Jim Gray and Michael Stonebraker over the following decade. Stonebraker's core argument — articulated repeatedly from his 1990 Postgres papers through his 2007 VLDB keynote — was that the database industry's obsession with "one size fits all" systems was architecturally incoherent. You cannot optimize the same storage engine for point lookups and full table scans. The access patterns are fundamentally opposed.
OLTP (Online Transaction Processing) systems exist to record and mutate the current state of a business. An insurance system creates a claim, updates its status from reported to investigating, logs a fraud flag, processes a payment. Every operation touches one row or a small set of related rows. The system needs to commit thousands of these transactions per minute with ACID guarantees. Latency is measured in milliseconds. Correctness under concurrency is non-negotiable.
OLAP (Online Analytical Processing) systems exist to answer questions about the history of that state. How many claims were denied in Q3 across high-risk policyholders in Florida? What's the loss ratio for commercial property policies underwritten in the past two years? What's the average days-to-settle for claims flagged for fraud vs. clean claims?
How many claims were denied in Q3 for high-risk policyholders in Florida?
What's the loss ratio on commercial property written in the last two years?
For fraud-tagged claims vs. clean claims, what's the average days-to-settle?
Those questions imply millions of rows, joins across multiple tables, and aggregations that may run for seconds. That can be acceptable, because nobody is asking a CFO dashboard to repaint in sub-100ms.
The fundamental problem: the access patterns that make OLTP fast (B-tree indexes on primary keys, row-oriented storage, write-optimized pages) are the exact same patterns that make OLAP slow. And vice versa.
Why the storage model matters more than you think
▭ Row-oriented: one seek, the whole row
PostgreSQL, MySQL, and SQL Server store rows contiguously on disk. When an adjuster opens a claim, one read can pull the entire record — all eight fields — in a single I/O:
id, policy_id, claimant_id, incident_date, claim_type, status, estimated_amount_cents, adjuster_id
That is the OLTP happy path: most workflows want the full row.
▸ The same query pays for columns you never selected
The analyst's aggregation looks innocent enough:
SELECT
claim_type,
EXTRACT(YEAR FROM incident_date),
COUNT(*),
SUM(estimated_amount_cents)
FROM claims
GROUP BY 1, 2;
…but the storage layout does not know that. To compute the result, PostgreSQL still walks every row of claims. Because data is row-major, the engine effectively materializes all eight columns even though the expression only needed three. At fifty million rows and ~200 bytes per row, that is on the order of 10 GB of I/O to answer a question whose useful payload is closer to ~1.2 GB.
▐ Columnar: read what you group and sum
Snowflake, BigQuery, ClickHouse, and DuckDB store columns as separate runs. The same query touches three columns instead of eight — often ~50× less I/O before compression even enters the conversation. Add run-length encoding (for example, status = 'approved' repeated four million times collapsing to almost nothing) and vectorized execution (SIMD batches), and a workload that freezes an OLTP cluster for 47 seconds can finish in seconds in an engine built for scans.
This is not a query tuning problem. You cannot index your way out of it. It's a storage architecture problem, and it starts with schema design.
How schema shape changes between OLTP and OLAP
The schema design principles for OLTP and OLAP systems aren't just different — they're in direct tension. What makes an OLTP schema excellent (normalization, referential integrity, write-optimized joins) actively harms OLAP performance. What makes an OLAP schema excellent (denormalization, pre-joined wide tables, surrogate key indirection) would be an antipattern in OLTP.
OLTP: normalize to protect the data
▸ Why Third Normal Form still matters
OLTP schemas lean on normalization — usually Third Normal Form — for a blunt reason: any fact should live in exactly one row.
Denormalize agent_name onto policies, and renaming an agent fans out into millions of row updates, row locks, and the risk of a partial write that leaves two policies disagreeing about who sold them. Keep an agent_id foreign key instead, and the same business event is one update in agents.
◇ TalkingSchema — transactional OLTP model
The insurance OLTP model below is 10 tables. Foreign keys make relationships explicit — you cannot attach a claim to a policy that does not exist. The claims row stays narrow: it stores policy_id, not the policyholder’s age or region. Those attributes stay in their own tables and arrive through joins when you ask for them.
▸ When the workload is exactly what you modeled for
That shape is what transactional traffic needs. An adjuster nudging a claim from reported to investigating costs one row update in claims and one insert in claim_events. A new claim row fails fast if the policies row is missing. Decrementing adjusters.workload_count can remain atomic. Every path stays surgical.
▸ When the question is not what you modeled for
What was the average days-to-settle for commercial property claims in the Northeast, broken down by agent tier, for the past three years?
Answering that from the OLTP layout means reconstructing context across the graph:
claims → policies → policyholders (region — e.g. Northeast)
policies → agents (agent tier; agency relationship matters here)
claims → claim_payments (when the claim actually settled)
claims → fraud_flags (whether fraud work changed the timeline)
That is eight tables, three years of history in the filter, and aggregation on top. On PostgreSQL at ~50 million claims, plan for roughly two to five minutes of wall time — and a run that pulls the same I/O bandwidth your adjusters need for live work.
OLAP: denormalize for the query
▸ Kimball's split: facts vs dimensions
The warehouse version of the same business looks radically different on purpose. The Kimball lineage — The Data Warehouse Toolkit (1996), then two more editions of hard-won field practice — boils down to one layout rule: put measurements in fact tables and put descriptive context in dimension tables. Facts are what you sum and count; dimensions are how you slice the sums.
The win is operational: an analytical question stops being “re-walk the OLTP foreign-key graph” and becomes “scan one fact grain, join to a small set of dimensions that are already aligned to reporting language.”
◇ TalkingSchema — dimensional warehouse model
▸ Bake expensive truth into the fact
days_to_settle and had_fraud_flag can land on fact_claims during ETL, while the OLTP system still stores settlement and fraud as separate tables that only meet in a join. At query time, the dashboard reads columns, not a detective story through claim_payments and fraud_flags.
▸ Track history on purpose (SCD Type 2)
dim_policyholder.risk_segment can be versioned — you see the segment as of the claim date, not whatever happened to be true after last night's batch job rewrote a row. The OLTP system usually exposes current state only; pricing and reserving need the history of state.
▸ Surrogate keys for a messy world
dim_agent.agent_key is an ID the warehouse assigns so joins stay reliable. The agent ID from your OLTP app (agents.id) is not safe to depend on: a system migration can renumber it, and after an acquisition you can end up with two different agents that both used to be id = 7 in their own database. A surrogate key sidesteps all of that — it is just a stable warehouse-side handle that facts and dimensions can keep pointing at.
▸ Skip the lookup table when it does not earn its keep
Not every category needs its own dimension table. claim_type only has a handful of values (auto, home, commercial…) and nothing extra to describe — no hierarchy, no labels, no metadata. So we keep it on fact_claims as a plain string instead of building a dim_claim_type table you would have to join every time. Rule of thumb: few values + nothing to describe = leave it on the fact table.
▸ The same executive question, a different query plan
The “loss ratio by region and agent tier” story from earlier collapses into a single fact table plus dimensions that were loaded to answer exactly those labels:
SELECT
dph.region,
da.tier,
dp.product_type,
COUNT(*) AS claim_count,
SUM(fc.paid_amount_cents) AS total_paid,
SUM(fp.premium_amount_cents) AS total_earned_premium,
ROUND(SUM(fc.paid_amount_cents)::numeric / NULLIF(SUM(fp.premium_amount_cents), 0), 4) AS loss_ratio
FROM insurance_dw.fact_claims fc
JOIN insurance_dw.dim_policyholder dph ON fc.policyholder_key = dph.policyholder_key AND dph.is_current
JOIN insurance_dw.dim_agent da ON fc.agent_key = da.agent_key AND da.is_current
JOIN insurance_dw.dim_product dp ON fc.product_key = dp.product_key AND dp.is_current
JOIN insurance_dw.fact_premiums fp ON fc.policy_key = fp.policy_key
JOIN insurance_dw.dim_date dd ON fc.date_reported_key = dd.date_key
WHERE dd.year BETWEEN 2023 AND 2025
GROUP BY dph.region, da.tier, dp.product_type
ORDER BY loss_ratio DESC;
On Snowflake or BigQuery — columnar storage, predicate pushdown, and no OLTP concurrency fighting for the same row buffers — this shape is built to finish in seconds (often under 10 at the ~50M-claim scale). The normalized OLTP plan for the same business question still means minutes and shared infrastructure with adjusters trying to work.
The hidden cost of running analytics on your operational database
The 47-second query was not a one-time event. It's the predictable result of a pattern that most engineering teams repeat before they understand why the separation exists.
Here's what actually happens when analytics workloads hit OLTP databases:
Write amplification from indexes. OLTP databases need indexes on many columns for fast point lookups. Every index is a write target: an insert into claims must also update the B-tree index on policy_id, the index on adjuster_id, the index on status, and the index on incident_date. At 10,000 inserts per minute, each claiming 4–6 index writes, that's 50,000–60,000 index updates per minute just from inserts. Running a full-table analytical scan while this is happening doesn't merely "slow things down" — it causes lock contention, buffer pool eviction (your working set gets replaced with analytical scan pages), and WAL pressure that can delay replication.
The read replica trap. Many teams respond to this by routing analytics queries to a PostgreSQL read replica. This works until it doesn't: the replica runs the same row-oriented storage as the primary; the same 47-second query runs in 47 seconds. Worse, a long-running analytical query on a replica can cause replication lag — the replica falls behind the primary because it's too busy scanning to apply WAL events. Now your analytics are reading stale data and your primary is under replication pressure.
Cost explosion at scale. On cloud OLAP platforms, running an unoptimized analytical query against an improperly structured schema is a direct billing event. BigQuery charges $6.25 per TB scanned on-demand. A query that scans an entire unnormalized claims table (500 columns, 50M rows) instead of a star schema fact_claims (30 columns, pre-filtered) could scan 10–50x more data per run. If your analysts run 200 queries per day, the difference between a well-designed warehouse schema and a "just dump everything in BigQuery" approach can be $5,000–$20,000 per month in query costs alone.
The real cost is opportunity cost. Every analytics query that hurts production is a query your data team couldn't run, a report that went unbuilt, a risk signal that wasn't surfaced. Insurance companies that can't run actuarial models quickly enough to price policies correctly don't just lose money on individual claims — they misprice entire risk pools and lose tens of millions in adverse selection. Regulators in most US jurisdictions require actuarial models to be auditable. If your analytics infrastructure can't reliably reproduce point-in-time historical snapshots, that's not a technical debt problem — it's a compliance problem.
Change data capture: the pragmatic bridge between OLTP and OLAP
Accepting that OLTP and OLAP need separate systems immediately raises an engineering question: how does data get from one to the other?
The naive answer is batch ETL: a cron job that runs nightly, selects all claims modified since the last run, and loads them into the warehouse. This works at small scale and was the industry standard through the 2000s. It has one fatal flaw: your warehouse data is always at least 8–24 hours stale. For insurance claims processing, where fraud detection depends on real-time pattern matching and regulatory reporting has strict timing requirements, this is often unacceptable.
Change data capture (CDC) via Debezium is the most production-proven bridge between OLTP and OLAP. It's not glamorous, but it works at any scale.
Change Data Capture (CDC) reads the database transaction log — not the tables themselves — to stream every insert, update, and delete as it happens. Debezium is the dominant open-source implementation, running as a Kafka Connect source connector that reads PostgreSQL's Write-Ahead Log (WAL). Each transaction appears as a JSON event on a Kafka topic within milliseconds of commit.
The bridge schema captures this:
The Debezium 2.x stream for an insurance claim status change looks like this:
{
"op": "u",
"ts_ms": 1747140825000,
"source": { "table": "claims", "lsn": "0/1A2B3C4D" },
"before": { "id": 182741, "status": "investigating", "adjuster_id": 44 },
"after": {
"id": 182741,
"status": "approved",
"adjuster_id": 44,
"updated_at": "2026-05-13T09:27:05Z"
}
}
⟿ What happens after the event lands
A stream consumer — a Flink job, a dbt streaming model, or a Kafka Connect sink — picks up the event and does three small jobs:
- Looks up the warehouse dimension keys (
policyholder_key,agent_key,product_key) for the IDs in the event. - Builds the fact row (
fact_claims) in the warehouse-friendly shape, with surrogate keys already resolved. - Upserts that row, so the next dashboard refresh sees the new status.
Net effect: the warehouse is seconds behind production instead of 8–24 hours.
⇄ Why this is the real win — the two schemas stop fighting
CDC lets the OLTP schema stay normalized (good for transactions) and the warehouse schema stay denormalized (good for analytics). The reshape happens in the stream processor in the middle, not in either database.
That means the two sides can move at their own pace. Adding a column to claims in OLTP does not force an immediate redesign of fact_claims — you extend the Debezium event, decide when the warehouse cares, and roll it through the transformation when ready.
▸ If you do not want to operate Kafka
You do not have to. Managed CDC services — Airbyte, Fivetran, Stitch — give you the same OLTP-to-warehouse pipeline without running Kafka Connect yourself. The tradeoff is higher latency (minutes, not seconds) in exchange for far less operational surface area.
HTAP: the promise and the reality in 2026
Every few years, a vendor announces that the OLTP/OLAP separation is a false dichotomy — that a single system can handle both workloads without compromise. This is the HTAP (Hybrid Transactional/Analytical Processing) promise, and in 2026 it's partially true.
▸ TiDB — two engines under one query layer
TiDB pairs a row-oriented store (TiKV) for OLTP with a columnar replica (TiFlash) for analytics. Writes go to TiKV; TiFlash catches up within a few hundred milliseconds; the query planner picks the right engine per query.
- What it does well: real OLTP throughput (~52K TPS in published benchmarks) and analytical scans that would melt PostgreSQL.
- The catch: you are operating two storage engines plus the routing layer that ties them together. The architecture is elegant — the ops surface is not.
▸ SingleStore — one engine, two representations
SingleStore keeps the same data in both rowstore and columnstore form inside a single engine, and lets the planner choose per query.
- What it does well: genuinely fast on mixed workloads — think live dashboards that need fresh transactional data and historical aggregates in the same view.
- The catch: proprietary and expensive. For batch actuarial work that runs overnight on years of history, you are paying for a real-time capability you do not use.
▸ CockroachDB — not actually HTAP
CockroachDB shows up in HTAP conversations, but it is fundamentally a distributed OLTP database — strong consistency (SERIALIZABLE by default, which most HTAP systems quietly give up), no columnar engine. Treat its analytics as PostgreSQL-class, not Snowflake-class, and it stops being confusing.
HTAP makes sense for one specific workload profile: operational analytics where freshness matters more than scale. Real-time fraud detection scoring is an HTAP use case. Quarterly actuarial modeling of 10 years of claim history is not. Most insurance companies need both — which means HTAP serves one of them and you still need a warehouse for the other.
Don't choose HTAP because it sounds like you only need one system. Choose it because your specific workload genuinely requires sub-second freshness on analytical queries, and you're willing to pay the operational premium.
DuckDB and the rise of embedded analytics
There is a third option the OLTP/OLAP binary misses — and it only became production-viable in 2024–2025.
⊕ What DuckDB is
DuckDB is an in-process analytical database: embed it in your app (or run the CLI), point it at data, run SQL. Columnar storage, vectorized execution, SIMD. In October 2025 its in-memory variant ranked #1 on ClickBench, beating ClickHouse on several query classes.
↗ Why it matters for OLTP/OLAP
DuckDB's postgres scanner can read your PostgreSQL tables directly. Run it as a separate process, point it at a read replica, and analytical queries execute in DuckDB's columnar engine — not in PostgreSQL — so they do not fight OLTP transactions for buffer pool or WAL.
≡ The pattern that works under ~100 GB
For most Series A–B startups, this combination replaces a warehouse, a CDC pipeline, a Kafka cluster, and a Snowflake bill:
- PostgreSQL holds operational data (OLTP, as it should).
- DuckDB runs analytical queries against a Postgres read replica via the
postgresscanner. - For older history, DuckDB reads Parquet files in S3 alongside the live Postgres data.
MotherDuck sells the managed version of this for teams that do not want to self-host.
↑ When to graduate
DuckDB is single-node; full-table scans get slow somewhere around 500 GB–1 TB of transactional data. That is the cue to build the proper Debezium + Kafka Connect pipeline into a columnar warehouse.
Designing the OLTP-to-OLAP pipeline with TalkingSchema
The insurance schema above — both the operational OLTP tables and the dimensional warehouse — was designed in a single TalkingSchema workspace.
The design process mirrors how a data architect would approach it in practice:
Step 1: Model or Import the OLTP schema first. The operational domain comes first because it defines what the business actually does. In TalkingSchema:
"Design the OLTP claims processing schema for an insurance company. Include policyholders, policies, products, agents, underwriters, claims with a state machine event log, claim payments, adjusters, and fraud flagging. Use 3NF normalization throughout. All monetary amounts should be stored in cents as bigint."
The ERD shows tables with proper foreign key relationships, enum annotations, and indexed columns.
Step 2: Design the dimensional counterpart. Once the OLTP schema is clear, the warehouse schema can be derived:
"Now design the dimensional warehouse counterpart for claims analytics. Create fact_claims and fact_premiums as the central fact tables. Dimensions should include policyholder (SCD Type 2, with age_band and risk_segment derived attributes), policy, product, agent (with tier attribute derived from commission_rate bands), date, and claim_status. Pre-compute days_to_settle and had_fraud_flag in the fact table to eliminate runtime joins."
TalkingSchema generates the warehouse schema as a second schema group (insurance_dw.*) on the same canvas. The two schema groups coexist — OLTP on the left, warehouse on the right, CDC bridge tables between them.
Step 3: Export both schemas for deployment. The OLTP schema exports as PostgreSQL DDL for the operational database. The warehouse schema exports as Snowflake or Redshift DDL (the warehouse syntax differs: BIGINT IDENTITY instead of BIGSERIAL, distribution keys on policy_key, sort keys on date_reported_key).
The decision framework: when one database is fine, and when it isn't
Do you actually need a separate analytics system? The answer depends on four variables: data volume, query complexity, freshness requirements, and organizational maturity.
| Dimension | Stay on OLTP | DuckDB Middle Path | Separate OLAP |
|---|---|---|---|
| Data volume | < 50GB active data | 50GB–500GB | > 500GB or multi-year history |
| Query complexity | Simple aggregations, < 5 tables | Moderate complexity | Multi-table star joins, window functions, percentiles |
| Freshness requirement | Real-time (< 1s) is required | Near-real-time (< 60s) via CDC | Batch OK (minutes to hours) |
| Team maturity | Small team, no data engineer | One data-savvy engineer | Dedicated data engineering team |
| Query frequency | < 50 analytical queries/day | 50–200 queries/day | > 200 queries/day from BI tools |
| Regulatory reporting | None or simple | Simple compliance | Complex (actuarial, compliance) |
✓ Stay on OLTP (with read replicas)
You are a ~10-person team without a dedicated data engineer. “Analytics” means a Metabase board on PostgreSQL: on the order of five queries per day against ~2 GB of active data, mostly simple GROUP BY work that finishes in ~200 ms.
Building out CDC, Kafka, and Snowflake at that scale burns budget and roadmap out of proportion to the upside. Use pg_stat_statements to see what is actually slow, point dashboards at read replicas, and revisit once you have ~12 months more growth — or sooner if load spikes.
→ Split OLTP / OLAP
You are closer to ~40 engineers. Leadership wants daily views of loss ratio by segment, claim funnel conversion, and agent performance. Operations is clearing roughly 50,000 claims per month, and BI’s ad hoc SQL has already paged on-call.
That profile calls for ELT into a columnar warehouse, not sharper tuning on Postgres alone. Compared to another quarter of replica lag and cleanup on the OLTP cluster, a Snowflake MEDIUM warehouse (~$8/hr, usually paused between analyst sessions) often ends up the cheaper line item.
◆ The inflection point
If a heavy analytics query has ever caused a production page — slow transactions, replication falling behind on WAL, or OOM on the OLTP tier — you have proof that workload does not belong on the transactional database. The next failure mode is usually uglier: ship the pipeline before round two.
‡ Regulated insurance (extra bar)
In many jurisdictions, actuarial models must be auditable and reproducible: analysts need history as-of a declared date, not only what the OLTP row shows today. Making every dimension versioned inside the transactional database (full SCD Type 2 wherever you slice) will choke the workloads that pay the bills. For regulated carriers, a disciplined warehouse schema is compliance infrastructure, not a side project layered on dashboards.
The moment an analytics query causes a production page, you've crossed the threshold. Build the pipeline before the next incident — not after it.
The schemas are different because the jobs are different
Stonebraker's argument from 2007 still holds — there is no single database architecture that keeps both blistering transactional lookups and heavy analytical scans genuinely happy. Layout, indexing, and schema semantics pull in incompatible directions because the workloads ask for opposite things.
The bridge between OLTP and OLAP is cheaper and steadier today than ever. Debezium-style CDC fits commodity Kubernetes. DuckDB and MotherDuck let small teams postpone a standalone warehouse bill. Medallion layers (Bronze → Silver → Gold) give the transformation spine a naming scheme everyone can reuse. HTAP engines like TiDB cover the narrow slices where mixing engines is the actual requirement.
Design the OLTP schema for the transactions. Design the warehouse schema for the questions. Build the pipeline between them. Don't wait for another 47-second prod query to argue the case for you.
Want to design your own OLTP + OLAP dual schema? → Try TalkingSchema — describe your domain, and get a full ERD with both schemas modeled and exportable to SQL in minutes.