SQL to dbt guide: Slowly Changing Dimensions with dbt Snapshots

Time travel for your data. Tracking how campaigns and customers evolve over time.

Mar 23, 2026

Hi there! Alejandro here 😊
Suscribe if you like to read about technical data & AI learnings, deep dives!
Enjoy the reading and let me know in the comments what you think about it 👨🏻‍💻

📝 TL;DR

SCD2 tracks how data changes over time instead of just keeping the latest version. Use it when historical context drives business decisions
Two snapshot strategies: timestamp captures all changes (campaign metrics that update daily), check tracks specific column changes (visitor segment transitions)
Snapshots come with real costs: storage growth, query complexity, and maintenance overhead

If you want to get more context on the dbt project use for these articles:
SQL to dbt guide: Your dbt Starter Pack Project
Alejandro Aboy
·
October 4, 2025
Read full story
SQL to dbt guide: How Data Layers Flow With Medallion
Alejandro Aboy
·
November 10, 2025
Read full story
SQL to dbt guide: Hands-on Data Quality & Integrity Worfklows
Alejandro Aboy
·
November 24, 2025
Read full story

Most data teams reach for SCD2 patterns without asking the fundamental question: do you actually need historical tracking?

In the sql-to-dbt-series repo, there are two clear use cases where the answer is yes.

Campaign performance evolution: ROAS, conversion rates, and budget utilization change daily. Marketing managers need performance trends over time, not just today’s snapshot.
Visitor segment transitions: Customers evolve from prospects to VIP buyers. Understanding these lifecycle transitions helps predict behavior and optimize targeting.

If your analysis only cares about current state, save yourself the complexity.

Two Snapshot Strategies

dbt snapshots detect changes in two ways. Your choice depends on what triggers historical tracking.

Timestamp: When Records Change

`snap_campaign_performance.sql` uses the timestamp strategy. Every time the source data updates, dbt captures a new historical record:

Timestamp strategy works when you have a reliable `updated_at` column and want to capture every change. Campaign metrics move daily. This is the right fit.

Thanks for reading The Pipe & The Line! This post is public so feel free to share it.

Check: When Specific Columns Change

snap_visitor_segments.sql uses the check strategy. dbt only creates a new record when specific columns actually change:

Check strategy is the right call when you care about specific attribute transitions, not every data update.

If a visitor’s page view count increases but their segment stays the same, no new snapshot record.

The strategy-to-use-case mapping is the core lesson: timestamp for metrics that change frequently and where every change matters, check for categorical shifts where you’re tracking transitions.

Querying Historical Data

Snapshots create three special columns: dbt_valid_from (when this version became active), dbt_valid_to (when it was superseded, NULL for current), and dbt_updated_at (when dbt processed it).

Current state - get the latest version of each record:

select *
from snapshots.snap_campaign_performance
where dbt_valid_to is null

Point-in-time - what did the data look like on a specific date:

select *
from snapshots.snap_campaign_performance
where ‘2024-01-15’ >= dbt_valid_from
and (’2024-01-15’ < dbt_valid_to or dbt_valid_to is null)

Trend analysis - track changes over time with window functions:

select
    campaign_id,
    campaign_name,
    dbt_valid_from,
    roas,
    roas_tier,
    lag(roas) over (partition by campaign_id order by dbt_valid_from) as previous_roas
from snapshots.snap_campaign_performance
order by campaign_id, dbt_valid_from

These three patterns cover most analytical needs.

Current state for dashboards, point-in-time for compliance, trend analysis for optimization decisions.

The Costs and When to Skip It

SCD2 is not free. Before snapshotting everything, understand the trade-offs:

Storage: A campaign that changes daily creates 365 records per year. Multiply by thousands of campaigns.
Query complexity: Historical queries require date filters and window functions. Not every analyst is comfortable with these.
Processing time: Every `dbt snapshot` run compares current data against all historical records.
Maintenance: Snapshot schemas need monitoring. The repo includes a `snapshots/schema.yml` with column descriptions and `not_null` tests on key identifiers. Treat snapshot documentation the same as any other model.

You probably don’t need SCD2 if your analysis only looks at current state, historical changes happen infrequently, or storage costs outweigh analytical value.

You definitely need SCD2 if compliance requires audit trails, business decisions depend on trend analysis, or customer lifecycle analysis drives strategy.

Mitigation helps: use invalidate_hard_deletes=true to handle deleted source records, consider archiving old snapshots after defined retention periods, and create mart models that pre-calculat common historical analyses.

If you enjoyed the content, hit the like ❤️ button, share, comment, repost, and all those nice things people do when like stuff these days. Glad to know you made it to this part!

The Pipe & The Line

SQL to dbt guide: Your dbt Starter Pack Project

SQL to dbt guide: How Data Layers Flow With Medallion

SQL to dbt guide: Hands-on Data Quality & Integrity Worfklows

Discussion about this post

Ready for more?