Living Up to the Contract: Data Governance with dbt Model Contracts

Model contracts are a .yml configuration that serves an important purpose: defining rules for the dbt model being produced and halting a dbt pipeline when one of these rules is broken. This stops incomplete data from being loaded into a model with constraints, and stops a model from being changed to a different shape/schema that could break downstream uses of the dbt model.

Find out what exactly they are, why they are distinct from tests, why you might want to use them and how to get them set up.

What is a dbt model contract? What makes them different to tests?

A dbt model contract sets rules on a dbt model that will cause a job to error when the rules are broken. They are configured in .yml files and a guide on how to implement them follows later in the blog.

They are distinct from testing in two key ways:

  1. Rather than testing the content of a model like a data test does, or testing logic with mocked values like a unit test does, model contracts test the shape of the model and whether it conforms to the expectations set.
  2. Whether a model meets a contract is checked before the model is built and data is added to it. A data test, by virtue of testing the content of a model, runs after the model has been built and data has been loaded. (Unit tests are a special case here: because the data is mocked, they run before the model is materialized.)

So to recap, model contracts check the shape of a model rather than its content, and can be used to ensure that if a model produces data that is not in the expected shape, that data is never loaded into the model in the database.

When I refer to model shape, there are 3 key factors: column name, data type and any additional constraints to be applied to a column (for example, that it must be not_null).

When defining a contract on a model, the expectation is that all the columns specified in the contract are present and no other columns are surfaced by the model and that these columns have the expected data type and satisfy any constraint rules that can be enforced on the data platform.

💡 A great feature of model contracts that is sometimes missed: assuming the model returns all and only the expected columns, the order in which they are defined in the model is inconsequential, as they will be reordered in accordance with the contract anyway.

Model contract constraints have varying levels of support depending on the platform. In most cases a constraint can be defined (i.e. specified in the .yml), but it is not necessarily enforced by the platform. Refer to the matrices in the model-contracts documentation to confirm whether a constraint can be defined and/or enforced.
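As a rough sketch of how constraints are defined at column level, the fragment below attaches a not_null and a check constraint to the columns of a hypothetical model. The model name, column names and check expression are assumptions made up for illustration; whether each constraint is actually enforced depends on your data platform.

```yaml
models:
  - name: orders_example  # hypothetical model name
    config:
      contract:
        enforced: true
    columns:
      - name: order_id  # hypothetical column
        data_type: int
        constraints:
          - type: not_null
      - name: order_total  # hypothetical column
        data_type: float
        constraints:
          # check constraints take a SQL expression; support and
          # enforcement vary by platform, so consult the matrices
          - type: check
            expression: "order_total >= 0"
```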

Why use a dbt model contract?

With dbt models being SELECT statements and dbt handling the DDL & DML, it can be easy to change a model and iterate - this is ideal in a development environment but potentially problematic in a production environment where other people and processes might build upon one of your dbt models. If a model is used in a downstream process where a particular shape is required, a contract is an effective measure to ensure that downstream uses of a model are not compromised.

These contracts are increasingly relevant and important with the rise of dbt Mesh as a feature of dbt Cloud where models can be used as sources for other dbt projects.
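A minimal sketch of what this might look like for a model shared across projects, assuming a hypothetical model called dim_customers. The access setting marks the model as one that other projects may reference; depending on your dbt version it may need to be set as a top-level model property rather than under config.

```yaml
models:
  - name: dim_customers  # hypothetical model name
    config:
      # assumption: older dbt versions expect access as a top-level property
      access: public
      contract:
        enforced: true  # consumers in other projects can rely on this shape
    columns:
      - name: customer_id  # hypothetical column
        data_type: int
        constraints:
          - type: not_null
      - name: customer_name  # hypothetical column
        data_type: string
```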

Contracts are also relevant for incremental models when schemas change, but they require the on_schema_change configuration to be set to append_new_columns or fail, to avoid schema drift between the model and the contract.
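For an incremental model this could look something like the fragment below, where the schema-change behaviour is set through the on_schema_change config alongside the contract. The model and column names are hypothetical.

```yaml
models:
  - name: fct_events  # hypothetical model name
    config:
      materialized: incremental
      on_schema_change: append_new_columns  # or: fail
      contract:
        enforced: true
    columns:
      - name: event_id  # hypothetical column
        data_type: int
        constraints:
          - type: not_null
      - name: event_timestamp  # hypothetical column
        data_type: timestamp
```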

How can I implement a dbt model contract?

Model contracts are set in .yml files. It makes sense to add them to a single file that also includes any other metadata associated with the model, such as descriptions and any generic tests being applied to it.

version: 2

models:
  - name: <model-name>
    description: description of the model for documentation purposes
    tests: <model-level tests>
    config:
      contract:
        enforced: true
      # contracts are not supported for materialized views and ephemeral models, and are less supported for views
      materialized: table
    columns:
      - name: id_field
        description: >
          column-level description, primary key for table so includes
          a not_null constraint and a unique data test
        tests:
          - unique
        data_type: int
        constraints:
          - type: not_null
      - name: name_field
        data_type: string
      ...

This contract dictates that the above model can only have id_field and name_field, and ensures that, whatever order the columns appear in within the SQL model, the materialization will always have them in the contract's order. The contract will be broken if one of these fields is removed, a field is added, one of the data types is changed, or id_field has a null value in it (platform dependent).

The rules of the contract are the columns specified and any additional compatible properties (data type and constraints). The contract itself is switched on within the config property of the model, by stating that it is enforced (the same place you might specify a materialization).

Author:
Edward Hayter
Powered by The Information Lab
© 2025 The Information Lab