TalkingSchema + dbt Workflow

TalkingSchema and dbt serve different but complementary roles in the analytics engineering pipeline. Understanding exactly where each tool fits — and what each tool does not do — is the starting point for using them together effectively.

TalkingSchema: Designs the data warehouse schema. Answers what tables should exist, how they should be structured, and what the relationships between them are. Generates ERDs, dimensional models, SQL DDL, and ORM exports from plain-language requirements.

dbt: Transforms data inside the warehouse. Answers how raw source data becomes analytics-ready mart data. Manages SQL SELECT model dependencies, runs data quality tests, and generates documentation.

Neither tool replaces the other. A data warehouse project needs both: a designed structure (TalkingSchema) and a transformation layer to populate it (dbt).

note

TalkingSchema does not run dbt, connect to your dbt project, or manage dbt configurations. The workflow described here is code-based: TalkingSchema generates SQL code blocks and schema definitions that you copy into your dbt project. A deeper integration (e.g., pushing models directly to a repository) is on the roadmap.

How the Tools Fit Together

Business Requirements
        ↓
[TalkingSchema — Design]
  • Star schema ERD
  • Fact and dimension table DDL
  • dbt model SQL (via AI chat)
        ↓
[dbt — Transform]
  • staging/ models
  • intermediate/ models
  • marts/ models
        ↓
Analytics-Ready Data Warehouse

The design work (TalkingSchema) must precede the transformation work (dbt) — you cannot write transformation models for tables that have not been designed. In practice, these two activities often happen iteratively: you design an initial mart layer structure in TalkingSchema, write initial dbt models, then return to TalkingSchema as requirements evolve.

Worked Example: E-Commerce Orders

This example walks through the full workflow for an e-commerce company's order analytics use case.

Step 1 — Design the data warehouse schema in TalkingSchema

Start by describing the business domain and desired analytical capability in TalkingSchema's chat interface:

Prompt:

"I'm building a data warehouse for an e-commerce company. The primary business process is order line item revenue analysis. Design a Kimball-style star schema with:

A fact table at the order line item grain

dim_date (conformed, with fiscal year support)

dim_customers (SCD Type 2, track changes to city and customer_segment)

dim_products (include product_category and brand hierarchy)

dim_promotions (nullable — not every order has a promotion)

Use surrogate keys (SERIAL) on all dimension tables. Use BIGSERIAL on the fact table."

TalkingSchema generates the full star schema:

fct_order_line_items

CREATE TABLE fct_order_line_items (
    order_line_sk      BIGSERIAL PRIMARY KEY,
    order_date_sk      INTEGER NOT NULL REFERENCES dim_date(date_sk),
    customer_sk        INTEGER NOT NULL REFERENCES dim_customers(customer_sk),
    product_sk         INTEGER NOT NULL REFERENCES dim_products(product_sk),
    promotion_sk       INTEGER REFERENCES dim_promotions(promotion_sk),
    -- Source keys (degenerate dimensions)
    order_id           VARCHAR(50) NOT NULL,
    line_item_id       VARCHAR(50) NOT NULL,
    -- Measures
    quantity           INTEGER NOT NULL,
    unit_price         NUMERIC(10, 2) NOT NULL,
    discount_amount    NUMERIC(10, 2) NOT NULL DEFAULT 0,
    extended_price     NUMERIC(10, 2) NOT NULL,
    -- Metadata
    _loaded_at         TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

dim_customers (SCD Type 2)

CREATE TABLE dim_customers (
    customer_sk        SERIAL PRIMARY KEY,
    customer_bk        VARCHAR(50) NOT NULL,
    full_name          VARCHAR(255),
    email              VARCHAR(255),
    city               VARCHAR(100),
    country            VARCHAR(100),
    customer_segment   VARCHAR(50),
    effective_date     DATE NOT NULL,
    expiry_date        DATE,
    is_current         BOOLEAN NOT NULL DEFAULT TRUE,
    row_version        INTEGER NOT NULL DEFAULT 1
);

Apply the DDL to your warehouse (Snowflake, BigQuery, Redshift, PostgreSQL) to create the target tables.

Step 2 — Design the source staging layer

Before mart models can run, dbt needs staging models that clean and standardize raw source data. Use TalkingSchema's AI to understand your source schema:

Prompt:

"I have a raw orders table from my Shopify export. It has these columns: id, created_at (string in ISO 8601), customer_id, customer_email, customer_city, product_id, product_title, sku, quantity (string), price (string), discount (string or null), promotion_code (nullable string). Design a clean staging layer schema for this data."

TalkingSchema generates a stg_orders staging table definition with correct data types and a clean column naming convention — before you write a single line of dbt SQL.

Step 3 — Generate dbt staging model SQL

With the source and target schemas understood, generate the dbt model SQL in TalkingSchema's chat:

Prompt:

"Generate a dbt staging model for the raw Shopify orders table described above. Requirements:

Use snake_case column names

Cast all columns explicitly to correct types (dates to DATE, prices to NUMERIC(10,2), quantity to INTEGER)

Add a surrogate key using dbt_utils.generate_surrogate_key(['id', 'line_item_id'])

Add a _loaded_at column with current_timestamp

Handle the NULL discount column with COALESCE to 0

Source name: shopify, identifier: raw_orders"

TalkingSchema returns a complete SQL model in a code block:

-- models/staging/stg_orders.sql
with source as (
    select * from {{ source('shopify', 'raw_orders') }}
),
renamed as (
    select
        id::varchar                                                    as order_id,
        created_at::timestamp                                         as created_at,
        created_at::date                                              as order_date,
        customer_id::varchar                                          as customer_bk,
        customer_email::varchar                                       as customer_email,
        lower(trim(customer_city))::varchar                           as customer_city,
        product_id::varchar                                           as product_bk,
        product_title::varchar                                        as product_name,
        sku::varchar                                                  as product_sku,
        quantity::integer                                             as quantity,
        price::numeric(10, 2)                                         as unit_price,
        coalesce(nullif(discount, '')::numeric(10, 2), 0.00)         as discount_amount,
        (price::numeric(10, 2) * quantity::integer)
            - coalesce(nullif(discount, '')::numeric(10, 2), 0.00)   as extended_price,
        promotion_code::varchar                                       as promotion_code,
        {{ dbt_utils.generate_surrogate_key(['id']) }}                as order_sk,
        current_timestamp                                             as _loaded_at
    from source
)
select * from renamed

Copy this into models/staging/stg_orders.sql in your dbt project.

Step 4 — Generate the mart layer dbt models

With staging models in place, generate the fact table population model:

Prompt:

"Generate a dbt mart model for fct_order_line_items that joins stg_orders to the dimension tables using surrogate key lookups. The fact table grain is one row per order. Include left joins to dim_promotions (nullable). Use ref() for all upstream model references and source() for raw tables."

-- models/marts/fct_order_line_items.sql
with orders as (
    select * from {{ ref('stg_orders') }}
),
dim_date as (
    select * from {{ ref('dim_date') }}
),
dim_customers as (
    select * from {{ ref('dim_customers') }}
    where is_current = true
),
dim_products as (
    select * from {{ ref('dim_products') }}
),
dim_promotions as (
    select * from {{ ref('dim_promotions') }}
)
select
    {{ dbt_utils.generate_surrogate_key(['o.order_id']) }} as order_line_sk,
    d.date_sk          as order_date_sk,
    c.customer_sk,
    p.product_sk,
    promo.promotion_sk,
    o.order_id,
    o.quantity,
    o.unit_price,
    o.discount_amount,
    o.extended_price,
    current_timestamp  as _loaded_at
from orders o
left join dim_date d         on o.order_date = d.calendar_date
left join dim_customers c    on o.customer_bk = c.customer_bk
left join dim_products p     on o.product_bk = p.product_bk
left join dim_promotions promo on o.promotion_code = promo.promotion_code

Step 5 — Generate dbt schema.yml with tests

Prompt:

"Generate a dbt schema.yml for fct_order_line_items with column descriptions and data quality tests. Add not_null on all surrogate key columns, unique on order_line_sk, and accepted_values tests on any status columns."

TalkingSchema returns a YAML file ready for your dbt project with tests configured.

How TalkingSchema Output Maps to dbt Project Structure

dbt project/
├── models/
│   ├── staging/              ← TalkingSchema: stg_ model SQL from AI chat
│   │   ├── stg_orders.sql
│   │   ├── stg_customers.sql
│   │   └── _sources.yml
│   ├── intermediate/         ← TalkingSchema: int_ model SQL (complex joins)
│   │   └── int_orders_with_promotions.sql
│   └── marts/                ← TalkingSchema: fact and dim DDL + mart model SQL
│       ├── fct_order_line_items.sql
│       ├── dim_customers.sql
│       └── schema.yml
├── snapshots/                ← TalkingSchema: SCD Type 2 snapshot config
│   └── snap_customers.sql
└── dbt_project.yml

TalkingSchema contributes:

The DDL for mart tables — the CREATE TABLE statements you apply to your warehouse
The staging model SQL — generated in the chat interface, copied to models/staging/
The mart model SQL — generated in the chat interface, copied to models/marts/
The schema YAML — generated in the chat interface for documentation and testing
The ERD — visual reference of the full star schema throughout development

dbt owns:

The execution environment (dbt run, dbt test, dbt build)
The dependency graph (lineage DAG)
The connection profiles to the warehouse
The snapshot logic for SCD Type 2 population

Common AI Prompts for dbt Workflows in TalkingSchema

Goal	Example prompt
Design mart layer schema	"Design a star schema for [business process]. Grain: [one row per X]. Dimensions: [list]."
Generate staging model	"Generate a dbt staging model for [table]. Cast all columns explicitly. Add surrogate key with dbt_utils."
Generate SCD Type 2 snapshot	"Generate a dbt snapshot configuration for dim_customers tracking changes to city and customer_segment."
Generate schema.yml with tests	"Generate a dbt schema.yml for [model] with not_null, unique, and relationship tests."
Generate intermediate model	"Generate a dbt intermediate model that joins stg_orders to stg_promotions on promotion_code."
Generate mart model with all joins	"Generate fct_order_line_items as a dbt mart model joining to all dimension tables via surrogate key lookups."
Refactor existing schema	"Review the current fct_orders schema. Is the grain consistent? Are there missing dimension joins?"

Important Distinctions

TalkingSchema generates SQL, not a dbt project. The SQL that TalkingSchema generates in the chat interface is a code block. You copy it into your dbt project. TalkingSchema does not write files to disk, push to Git, or run dbt commands.

dbt's AI Assistant is different from TalkingSchema. dbt Cloud includes an AI assistant for writing transformation SQL within the dbt Cloud IDE. This is a different tool for a different purpose: dbt's AI helps you write dbt models faster; TalkingSchema's AI designs the schema structure, dimensional model, and DDL.

TalkingSchema is not a dbt orchestrator. It does not schedule dbt runs, manage environments, or handle CI/CD for your dbt project. For orchestration, use dbt Cloud, Airflow, Prefect, or another orchestration tool.

Frequently Asked Questions

Can TalkingSchema generate dbt models?

Yes. TalkingSchema's AI assistant can generate dbt model SQL — staging models, intermediate models, mart models, snapshot configurations, and schema YAML files — through the chat interface. The generated SQL follows dbt conventions: ref(), source(), dbt_utils macros, snake_case naming, and explicit type casts. The output is a code block you copy into your dbt project.

Does TalkingSchema integrate directly with dbt Cloud?

Not yet. There is no direct API integration with dbt Cloud or dbt Core in the current product. The workflow is copy-paste: generate SQL in TalkingSchema's chat, copy it to your dbt project. A deeper integration is on the roadmap.

What is the best workflow for using TalkingSchema with dbt on a new project?

Start in TalkingSchema: describe the business domain, design the mart layer dimensional model, export the DDL
Create the target tables in your warehouse using the DDL
In TalkingSchema's chat, generate the dbt staging model SQL for each source table
Copy the staging models to your dbt project and run dbt run --select staging
In TalkingSchema's chat, generate the mart model SQL for each fact and dimension table
Copy the mart models, run dbt build, and verify with dbt test
Return to TalkingSchema when requirements change — iterate on the schema and regenerate affected models

Can TalkingSchema help me design a dbt project structure?

Yes. Ask TalkingSchema's AI assistant to recommend a dbt project structure for your specific use case: "I'm building a dbt project for a SaaS company with three business processes: subscription events, feature usage, and support tickets. Recommend a dbt project structure with staging, intermediate, and mart layers, and identify the conformed dimensions." The AI will suggest a full project structure following dbt best practices.

How the Tools Fit Together​

Worked Example: E-Commerce Orders​

Step 1 — Design the data warehouse schema in TalkingSchema​

Step 2 — Design the source staging layer​

Step 3 — Generate dbt staging model SQL​

Step 4 — Generate the mart layer dbt models​

Step 5 — Generate dbt schema.yml with tests​

How TalkingSchema Output Maps to dbt Project Structure​

Common AI Prompts for dbt Workflows in TalkingSchema​

Important Distinctions​

Frequently Asked Questions​

Can TalkingSchema generate dbt models?​

Does TalkingSchema integrate directly with dbt Cloud?​

What is the best workflow for using TalkingSchema with dbt on a new project?​

Can TalkingSchema help me design a dbt project structure?​

How the Tools Fit Together

Worked Example: E-Commerce Orders

Step 1 — Design the data warehouse schema in TalkingSchema

Step 2 — Design the source staging layer

Step 3 — Generate dbt staging model SQL

Step 4 — Generate the mart layer dbt models

Step 5 — Generate dbt schema.yml with tests

How TalkingSchema Output Maps to dbt Project Structure

Common AI Prompts for dbt Workflows in TalkingSchema

Important Distinctions

Frequently Asked Questions

Can TalkingSchema generate dbt models?

Does TalkingSchema integrate directly with dbt Cloud?

What is the best workflow for using TalkingSchema with dbt on a new project?

Can TalkingSchema help me design a dbt project structure?