dbt

dbt master

Posted by chochang on Tue, Jul 22, 2025

101

what and why

dbt embraces modular SQL (relates to [[Data Model for Modularity]]) and treats analytics and transformation code as data assets.

what: dbt is a transformation workflow to

  • modularize and centralize analytics code
  • collaborate on data models: version, test, and document models and transformation queries

features:

  • most important features
    • compile SQL files with Jinja into pure SQL scripts, then determine the order of execution; no hand-written DML or DDL required
  • other features:
    • document your model and its fields
    • test your model
    • manage packages
    • load seed files (static data)
    • snapshot data for a point in time
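
A toy sketch of the "most important feature" above: resolving `{{ ref(...) }}` calls into plain SQL and deriving a build order from them. This is not dbt's actual implementation. The model names, the `analytics` schema, and the regex-based parsing are assumptions made for illustration; real dbt renders full Jinja and resolves relations from project and profile settings.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: model name -> SQL containing Jinja ref() calls.
MODELS = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "customer_orders": (
        "select c.id, count(o.id) as n_orders\n"
        "from {{ ref('stg_customers') }} c\n"
        "left join {{ ref('stg_orders') }} o on o.customer_id = c.id\n"
        "group by c.id"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> set[str]:
    """Return the model names referenced via {{ ref('...') }}."""
    return set(REF_PATTERN.findall(sql))


def compile_sql(sql: str, schema: str = "analytics") -> str:
    """Replace each ref() call with a schema-qualified relation name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


def execution_order(models: dict[str, str]) -> list[str]:
    """Topologically sort models so that dependencies are built first."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(MODELS)
compiled = compile_sql(MODELS["customer_orders"])
print(order)
print(compiled)
```

The two staging models have no dependencies, so they come first in either order; `customer_orders` always comes last because it references both.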

dbt projects

dbt enforces a high-level folder structure. A dbt project typically has the following items (strictly, only dbt_project.yml is required):

  1. dbt_project.yml file (project configuration files)
  2. models directory
  3. snapshots directory

(2) and (3) contain dbt resources; (1) configures the project.
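
A minimal sketch of that layout, written as a Python script that scaffolds the three items in a temporary directory and then checks for them. The project name `my_project` and profile name `my_profile` are hypothetical placeholders; in practice `dbt init` generates a fuller skeleton.

```python
from pathlib import Path
import tempfile

# Minimal project config; the project and profile names are hypothetical.
DBT_PROJECT_YML = """\
name: my_project
version: "1.0.0"
config-version: 2
profile: my_profile
model-paths: ["models"]
snapshot-paths: ["snapshots"]
"""


def scaffold_project(root: Path) -> Path:
    """Create the three items listed above under `root`."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "dbt_project.yml").write_text(DBT_PROJECT_YML)
    (root / "models").mkdir(exist_ok=True)
    (root / "snapshots").mkdir(exist_ok=True)
    return root


def missing_items(root: Path) -> list[str]:
    """List the expected items that are absent from `root`."""
    expected = ["dbt_project.yml", "models", "snapshots"]
    return [item for item in expected if not (root / item).exists()]


with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(Path(tmp) / "my_project")
    missing = missing_items(project)
    missing_in_empty_dir = missing_items(Path(tmp) / "does_not_exist")

print(missing)
print(missing_in_empty_dir)
```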

resource configs and properties

==reference doc==: Resource configs and properties

what (generally true, in most cases):

  • properties describe resources
  • configurations control how dbt builds these resources in the warehouse

config / properties files include:

  • dbt_project.yml
  • profiles.yml <– for dbt Core users only; contains the information dbt needs to connect to the data platform
    • often one profile per warehouse in use (most organizations have only one profile)
  • properties.yml <– to declare properties for dbt resources

building your DAGs in dbt

refer to [[dbt - build your DAG]]

references

child notes

LIST
FROM #programming_tools/dbt
SORT file.mtime DESC

Organize and structure dbt project + naming convention:

  • [[dbt style guide]]

External resources:

  1. Awesome public dbt projects

official documents: docs.getdbt.com

using dbt during data modeling task:

refactoring SQL for modularity course:

Sample project: Running data pipeline with bigquery and dbt

useful packages

  1. dbt-metalog = https://medium.com/indiciumtech/dbt-metalog-your-metadatas-catalog-for-dbt-32eed2234b0e
  2. dbt-utils = macros that can be (re)used across dbt projects
  3. dbt-expectations = Great Expectations-style data tests for your dbt project

It’s worth mentioning that we didn’t define any airflow operator representing the dbt model on our own, we used the dbt-airflow-factory which automatically translates your dbt project to airflow tasks and supports the gateway between the staging and presentation layer.
→ More at: Build modern data platform in 4 months for volt.io

learn dbt custom macros

to get started with dbt:

dbt + static websites in GCP

dbt tips and tricks