ETL is the bullied kid of the data stack. Dashboard wrong? Blame the ETL engineer. Patient count drops 8% on a Monday? ETL. Cost-sharing numbers shift overnight and the CFO wants answers by 9 AM? ETL again. The Slack thread always lands on the same person — who almost certainly isn't the one who introduced the lie.
Let's look at one of those lies. A patient was reclassified from the Bronze insurance tier to Gold in March, but the claims report has been grouping her earlier Bronze claims under the Gold bucket ever since. The ETL job that loaded the new tier overwrote the old value in place. Every historical encounter that happened under Bronze coverage now looks like Gold.
This is what a missing slowly changing dimension (SCD) Type 2 decision looks like in production. It's almost never caught by data quality tests, because every row is individually valid. The lie is in the history, not the data. And it didn't originate in the ETL code where it eventually surfaces. It was introduced six months earlier, at the schema-design layer, when someone made an SCD typing decision (or silently failed to) and never wrote it down anywhere outside of a MERGE statement.
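To make the failure concrete, here is a minimal Python sketch of the same report built two ways: against a dimension that was overwritten in place (Type 1 behavior) and against one that keeps a dated row per tier (Type 2 behavior). Everything in it is an illustrative assumption rather than a real schema: the patient ID, the claim dates and amounts, the year, the `valid_from`/`valid_to` column names, and the `tier_as_of` helper are all made up.

```python
from datetime import date

# Hypothetical claims for one patient; IDs, dates, and amounts are made up.
claims = [
    {"claim_id": 1, "patient_id": "P1", "service_date": date(2024, 1, 15), "amount": 200},
    {"claim_id": 2, "patient_id": "P1", "service_date": date(2024, 2, 10), "amount": 150},
    {"claim_id": 3, "patient_id": "P1", "service_date": date(2024, 4, 5), "amount": 300},
]

# Type 1 behavior: one row per patient, and the March load overwrites it.
dim_type1 = {"P1": "Bronze"}
dim_type1["P1"] = "Gold"  # the old tier is gone after this assignment

# Type 2 behavior: one row per version, with a half-open validity window
# [valid_from, valid_to). Column names are assumptions for this sketch.
OPEN_END = date(9999, 12, 31)
dim_type2 = [
    {"patient_id": "P1", "tier": "Bronze",
     "valid_from": date(2023, 1, 1), "valid_to": date(2024, 3, 1)},
    {"patient_id": "P1", "tier": "Gold",
     "valid_from": date(2024, 3, 1), "valid_to": OPEN_END},
]


def tier_as_of(patient_id, on_date):
    """Return the tier that was in effect for the patient on a given date."""
    for row in dim_type2:
        if (row["patient_id"] == patient_id
                and row["valid_from"] <= on_date < row["valid_to"]):
            return row["tier"]
    return None


def totals_by_tier(tier_for_claim):
    """Sum claim amounts per tier, using the supplied tier lookup."""
    totals = {}
    for claim in claims:
        tier = tier_for_claim(claim)
        totals[tier] = totals.get(tier, 0) + claim["amount"]
    return totals


# The Type 1 lookup ignores the service date; the Type 2 lookup uses it.
type1_report = totals_by_tier(lambda c: dim_type1[c["patient_id"]])
type2_report = totals_by_tier(
    lambda c: tier_as_of(c["patient_id"], c["service_date"])
)

print(type1_report)  # {'Gold': 650}
print(type2_report)  # {'Bronze': 350, 'Gold': 300}
```

Note that every row in both versions would pass a not-null or accepted-values check. The only difference is whether the join can ask "which tier was in effect on the service date?", and that is a question the overwritten dimension can no longer answer.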
I've been designing data warehouse dimension layers long enough to have strong opinions here. The SCD taxonomy is one of the most decision-dense areas of dimensional modeling — eight types with overlapping tradeoffs, no universal agreement on when to use what, and performance consequences that bite teams six to twelve months after they think they've done it right. This post is my attempt to cut through all of it.