Skip to main content

TalkingSchema + dbt Workflow

TalkingSchema and dbt serve different but complementary roles in the analytics engineering pipeline. Understanding exactly where each tool fits — and what each tool does not do — is the starting point for using them together effectively.

TalkingSchema: Designs the data warehouse schema. Answers what tables should exist, how they should be structured, and what the relationships between them are. Generates ERDs, dimensional models, SQL DDL, and ORM exports from plain-language requirements.

dbt: Transforms data inside the warehouse. Answers how raw source data becomes analytics-ready mart data. Manages SQL SELECT model dependencies, runs data quality tests, and generates documentation.

Neither tool replaces the other. A data warehouse project needs both: a designed structure (TalkingSchema) and a transformation layer to populate it (dbt).

note

TalkingSchema does not run dbt, connect to your dbt project, or manage dbt configurations. The workflow described here is code-based: TalkingSchema generates SQL code blocks and schema definitions that you copy into your dbt project. A deeper integration (e.g., pushing models directly to a repository) is on the roadmap.


How the Tools Fit Together

Business Requirements

[TalkingSchema — Design]
• Star schema ERD
• Fact and dimension table DDL
• dbt model SQL (via AI chat)

[dbt — Transform]
• staging/ models
• intermediate/ models
• marts/ models

Analytics-Ready Data Warehouse

The design work (TalkingSchema) must precede the transformation work (dbt) — you cannot write transformation models for tables that have not been designed. In practice, these two activities often happen iteratively: you design an initial mart layer structure in TalkingSchema, write initial dbt models, then return to TalkingSchema as requirements evolve.


Worked Example: E-Commerce Orders

This example walks through the full workflow for an e-commerce company's order analytics use case.

Step 1 — Design the data warehouse schema in TalkingSchema

Start by describing the business domain and desired analytical capability in TalkingSchema's chat interface:

Prompt:

"I'm building a data warehouse for an e-commerce company. The primary business process is order line item revenue analysis. Design a Kimball-style star schema with:

  • A fact table at the order line item grain
  • dim_date (conformed, with fiscal year support)
  • dim_customers (SCD Type 2, track changes to city and customer_segment)
  • dim_products (include product_category and brand hierarchy)
  • dim_promotions (nullable — not every order has a promotion)

Use surrogate keys (SERIAL) on all dimension tables. Use BIGSERIAL on the fact table."

TalkingSchema generates the full star schema:

fct_order_line_items

CREATE TABLE fct_order_line_items (
order_line_sk BIGSERIAL PRIMARY KEY,
order_date_sk INTEGER NOT NULL REFERENCES dim_date(date_sk),
customer_sk INTEGER NOT NULL REFERENCES dim_customers(customer_sk),
product_sk INTEGER NOT NULL REFERENCES dim_products(product_sk),
promotion_sk INTEGER REFERENCES dim_promotions(promotion_sk),
-- Source keys (degenerate dimensions)
order_id VARCHAR(50) NOT NULL,
line_item_id VARCHAR(50) NOT NULL,
-- Measures
quantity INTEGER NOT NULL,
unit_price NUMERIC(10, 2) NOT NULL,
discount_amount NUMERIC(10, 2) NOT NULL DEFAULT 0,
extended_price NUMERIC(10, 2) NOT NULL,
-- Metadata
_loaded_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

dim_customers (SCD Type 2)

CREATE TABLE dim_customers (
customer_sk SERIAL PRIMARY KEY,
customer_bk VARCHAR(50) NOT NULL,
full_name VARCHAR(255),
email VARCHAR(255),
city VARCHAR(100),
country VARCHAR(100),
customer_segment VARCHAR(50),
effective_date DATE NOT NULL,
expiry_date DATE,
is_current BOOLEAN NOT NULL DEFAULT TRUE,
row_version INTEGER NOT NULL DEFAULT 1
);

Apply the DDL to your warehouse (Snowflake, BigQuery, Redshift, PostgreSQL) to create the target tables.

Step 2 — Design the source staging layer

Before mart models can run, dbt needs staging models that clean and standardize raw source data. Use TalkingSchema's AI to understand your source schema:

Prompt:

"I have a raw orders table from my Shopify export. It has these columns: id, created_at (string in ISO 8601), customer_id, customer_email, customer_city, product_id, product_title, sku, quantity (string), price (string), discount (string or null), promotion_code (nullable string). Design a clean staging layer schema for this data."

TalkingSchema generates a stg_orders staging table definition with correct data types and a clean column naming convention — before you write a single line of dbt SQL.

Step 3 — Generate dbt staging model SQL

With the source and target schemas understood, generate the dbt model SQL in TalkingSchema's chat:

Prompt:

"Generate a dbt staging model for the raw Shopify orders table described above. Requirements:

  • Use snake_case column names
  • Cast all columns explicitly to correct types (dates to DATE, prices to NUMERIC(10,2), quantity to INTEGER)
  • Add a surrogate key using dbt_utils.generate_surrogate_key(['id', 'line_item_id'])
  • Add a _loaded_at column with current_timestamp
  • Handle the NULL discount column with COALESCE to 0
  • Source name: shopify, identifier: raw_orders"

TalkingSchema returns a complete SQL model in a code block:

-- models/staging/stg_orders.sql
with source as (
select * from {{ source('shopify', 'raw_orders') }}
),
renamed as (
select
id::varchar as order_id,
created_at::timestamp as created_at,
created_at::date as order_date,
customer_id::varchar as customer_bk,
customer_email::varchar as customer_email,
lower(trim(customer_city))::varchar as customer_city,
product_id::varchar as product_bk,
product_title::varchar as product_name,
sku::varchar as product_sku,
quantity::integer as quantity,
price::numeric(10, 2) as unit_price,
coalesce(nullif(discount, '')::numeric(10, 2), 0.00) as discount_amount,
(price::numeric(10, 2) * quantity::integer)
- coalesce(nullif(discount, '')::numeric(10, 2), 0.00) as extended_price,
promotion_code::varchar as promotion_code,
{{ dbt_utils.generate_surrogate_key(['id']) }} as order_sk,
current_timestamp as _loaded_at
from source
)
select * from renamed

Copy this into models/staging/stg_orders.sql in your dbt project.

Step 4 — Generate the mart layer dbt models

With staging models in place, generate the fact table population model:

Prompt:

"Generate a dbt mart model for fct_order_line_items that joins stg_orders to the dimension tables using surrogate key lookups. The fact table grain is one row per order. Include left joins to dim_promotions (nullable). Use ref() for all upstream model references and source() for raw tables."

-- models/marts/fct_order_line_items.sql
with orders as (
select * from {{ ref('stg_orders') }}
),
dim_date as (
select * from {{ ref('dim_date') }}
),
dim_customers as (
select * from {{ ref('dim_customers') }}
where is_current = true
),
dim_products as (
select * from {{ ref('dim_products') }}
),
dim_promotions as (
select * from {{ ref('dim_promotions') }}
)
select
{{ dbt_utils.generate_surrogate_key(['o.order_id']) }} as order_line_sk,
d.date_sk as order_date_sk,
c.customer_sk,
p.product_sk,
promo.promotion_sk,
o.order_id,
o.quantity,
o.unit_price,
o.discount_amount,
o.extended_price,
current_timestamp as _loaded_at
from orders o
left join dim_date d on o.order_date = d.calendar_date
left join dim_customers c on o.customer_bk = c.customer_bk
left join dim_products p on o.product_bk = p.product_bk
left join dim_promotions promo on o.promotion_code = promo.promotion_code

Step 5 — Generate dbt schema.yml with tests

Prompt:

"Generate a dbt schema.yml for fct_order_line_items with column descriptions and data quality tests. Add not_null on all surrogate key columns, unique on order_line_sk, and accepted_values tests on any status columns."

TalkingSchema returns a YAML file ready for your dbt project with tests configured.


How TalkingSchema Output Maps to dbt Project Structure

dbt project/
├── models/
│ ├── staging/ ← TalkingSchema: stg_ model SQL from AI chat
│ │ ├── stg_orders.sql
│ │ ├── stg_customers.sql
│ │ └── _sources.yml
│ ├── intermediate/ ← TalkingSchema: int_ model SQL (complex joins)
│ │ └── int_orders_with_promotions.sql
│ └── marts/ ← TalkingSchema: fact and dim DDL + mart model SQL
│ ├── fct_order_line_items.sql
│ ├── dim_customers.sql
│ └── schema.yml
├── snapshots/ ← TalkingSchema: SCD Type 2 snapshot config
│ └── snap_customers.sql
└── dbt_project.yml

TalkingSchema contributes:

  • The DDL for mart tables — the CREATE TABLE statements you apply to your warehouse
  • The staging model SQL — generated in the chat interface, copied to models/staging/
  • The mart model SQL — generated in the chat interface, copied to models/marts/
  • The schema YAML — generated in the chat interface for documentation and testing
  • The ERD — visual reference of the full star schema throughout development

dbt owns:

  • The execution environment (dbt run, dbt test, dbt build)
  • The dependency graph (lineage DAG)
  • The connection profiles to the warehouse
  • The snapshot logic for SCD Type 2 population

Common AI Prompts for dbt Workflows in TalkingSchema

GoalExample prompt
Design mart layer schema"Design a star schema for [business process]. Grain: [one row per X]. Dimensions: [list]."
Generate staging model"Generate a dbt staging model for [table]. Cast all columns explicitly. Add surrogate key with dbt_utils."
Generate SCD Type 2 snapshot"Generate a dbt snapshot configuration for dim_customers tracking changes to city and customer_segment."
Generate schema.yml with tests"Generate a dbt schema.yml for [model] with not_null, unique, and relationship tests."
Generate intermediate model"Generate a dbt intermediate model that joins stg_orders to stg_promotions on promotion_code."
Generate mart model with all joins"Generate fct_order_line_items as a dbt mart model joining to all dimension tables via surrogate key lookups."
Refactor existing schema"Review the current fct_orders schema. Is the grain consistent? Are there missing dimension joins?"

Important Distinctions

TalkingSchema generates SQL, not a dbt project. The SQL that TalkingSchema generates in the chat interface is a code block. You copy it into your dbt project. TalkingSchema does not write files to disk, push to Git, or run dbt commands.

dbt's AI Copilot is different from TalkingSchema. dbt Cloud includes an AI assistant for writing transformation SQL within the dbt Cloud IDE. This is a different tool for a different purpose: dbt's AI helps you write dbt models faster; TalkingSchema's AI designs the schema structure, dimensional model, and DDL.

TalkingSchema is not a dbt orchestrator. It does not schedule dbt runs, manage environments, or handle CI/CD for your dbt project. For orchestration, use dbt Cloud, Airflow, Prefect, or another orchestration tool.


Frequently Asked Questions

Can TalkingSchema generate dbt models?

Yes. TalkingSchema's AI copilot can generate dbt model SQL — staging models, intermediate models, mart models, snapshot configurations, and schema YAML files — through the chat interface. The generated SQL follows dbt conventions: ref(), source(), dbt_utils macros, snake_case naming, and explicit type casts. The output is a code block you copy into your dbt project.

Does TalkingSchema integrate directly with dbt Cloud?

Not yet. There is no direct API integration with dbt Cloud or dbt Core in the current product. The workflow is copy-paste: generate SQL in TalkingSchema's chat, copy it to your dbt project. A deeper integration is on the roadmap.

What is the best workflow for using TalkingSchema with dbt on a new project?

  1. Start in TalkingSchema: describe the business domain, design the mart layer dimensional model, export the DDL
  2. Create the target tables in your warehouse using the DDL
  3. In TalkingSchema's chat, generate the dbt staging model SQL for each source table
  4. Copy the staging models to your dbt project and run dbt run --select staging
  5. In TalkingSchema's chat, generate the mart model SQL for each fact and dimension table
  6. Copy the mart models, run dbt build, and verify with dbt test
  7. Return to TalkingSchema when requirements change — iterate on the schema and regenerate affected models

Can TalkingSchema help me design a dbt project structure?

Yes. Ask TalkingSchema's AI copilot to recommend a dbt project structure for your specific use case: "I'm building a dbt project for a SaaS company with three business processes: subscription events, feature usage, and support tickets. Recommend a dbt project structure with staging, intermediate, and mart layers, and identify the conformed dimensions." The AI will suggest a full project structure following dbt best practices.