In Business Intelligence and Data Analytics, data always has to be sourced and cleaned before it is ready for analysis. The classical path is to have a Data Engineer extract data from various sources, transform the raw data and load it into a data warehouse. This leaves Data Analysts with the task of querying tables with SQL scripts to get the data they need for analysis and reporting. The introduction of cloud data warehouses (Snowflake, Redshift etc.) has led to a transition from ETL to ELT. The difference is the presence of an Analytics Engineer, who sits between the Data Engineer and the Data Analyst and takes care of the transformation (T) step in the ELT (extract, load, transform) process.

With this method, data transfer time is reduced since raw data is always available in the data warehouse for transformation. In addition, the warehouse scales in terms of both storage and compute power. Lastly, simple SQL SELECT statements are enough for Data Analysts to get the tables they need for analysis. This is where dbt comes in as a handy tool for the ELT process.
In this write-up, my goal is just to give a high-level introduction to dbt and why it stands out as a data transformation tool. Additional links to detailed explanations of its features have been added for further reading.
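To make the ELT flow described above a little more concrete, here is a minimal sketch in Python. It is not dbt itself: an in-memory SQLite database stands in for a cloud data warehouse, and the table and column names (`raw_orders`, `orders_clean`) are hypothetical. It only illustrates the idea that raw data is loaded first, transformed inside the warehouse with SQL, and then queried with a simple SELECT.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the cloud data warehouse.
warehouse = sqlite3.connect(":memory:")

# Extract + Load: raw records land in the warehouse untouched.
warehouse.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount TEXT)")
warehouse.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, " alice ", "10.50"), (2, "BOB", "4.00"), (3, "alice", "7.25")],
)

# Transform: the cleaning happens inside the warehouse, expressed as SQL.
warehouse.execute(
    """
    CREATE VIEW orders_clean AS
    SELECT id,
           lower(trim(customer)) AS customer,
           CAST(amount AS REAL)  AS amount
    FROM raw_orders
    """
)

# Analysis: a simple SELECT on the cleaned table is all the analyst needs.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM orders_clean "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The transformation step is the part a tool like dbt is meant to manage; the loading and the final query stay outside of it.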
What is dbt?
dbt stands for data build tool. For a basic understanding of what dbt does, I liken it to what happens in a manufacturing company. These companies take raw materials and process them to the point where they are ready for human use. Just like a manufacturing house, dbt connects to data warehouses (such as BigQuery, Snowflake etc.) and transforms raw data to the point where it is ready to be used to generate insights that inform strategic business decisions. The transformation process entails sourcing, modelling, testing, documentation and deployment. The end goal is to make data access simpler, data transformations faster yet robust, and the maintenance of data pipelines easy. We know that once products get to the market, consumers have preferences and often check brand names to assess whether they are getting value for their purchase. This leads to the question: why dbt?
Why dbt?
As a data transformation tool, dbt comes with many advantages, which explain the value it adds to the analytics ecosystem.
- Open-source: dbt comes in two forms, dbt Cloud and dbt Core. dbt Core is the open-source version, run from the command line (CLI), and it is free.
- Uses SQL: dbt models are built with SQL scripts. This makes it easy to use for anyone who has basic knowledge of SQL.
- DAG presence: dbt comes with a lineage graph (directed acyclic graph) which aids in tracking the flow of data transformation from sources through to all models.
- Modularity: dbt breaks down complex SQL scripts into simple chunks and sets relations between them. This makes scripts easier to understand and quicker to troubleshoot.
- Testing: dbt allows testing of models and sources for uniqueness, relationships, nulls and accepted values to enhance data integrity. Custom tests can also be defined by users.
- Documentation: dbt has a dynamic documentation process which saves time and also serves as a reference for anybody working on an existing project.
- Version control: dbt projects are version-controlled with Git, which makes it possible to track project history. Work is usually done on a development branch, since changes made on that branch do not affect the main branch until they are merged.
- Packages: dbt allows the import of packages which could come in the form of macros or models to be used in a project.
- Community support: dbt has a support community which is open to assist when users face roadblocks.
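The modularity, DAG and testing points above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how dbt is implemented: SQLite stands in for the warehouse, the model names (`stg_customers`, `stg_orders`, `customer_orders`) and the sample data are hypothetical, and dependencies are listed by hand rather than inferred from the SQL. Each model is a small SELECT, a topological sort of the dependency graph gives a valid build order, and each test is a query that passes when it returns no failing rows.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical models: name -> (SELECT statement, models it depends on).
models = {
    "stg_customers": ("SELECT id, lower(name) AS name FROM raw_customers", []),
    "stg_orders": ("SELECT id, customer_id, status FROM raw_orders", []),
    "customer_orders": (
        "SELECT c.name, COUNT(o.id) AS order_count "
        "FROM stg_customers c LEFT JOIN stg_orders o ON o.customer_id = c.id "
        "GROUP BY c.name",
        ["stg_customers", "stg_orders"],
    ),
}

# Assumption: an in-memory SQLite database stands in for the warehouse.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE raw_customers (id INTEGER, name TEXT);
    CREATE TABLE raw_orders (id INTEGER, customer_id INTEGER, status TEXT);
    INSERT INTO raw_customers VALUES (1, 'Ama'), (2, 'Kofi');
    INSERT INTO raw_orders VALUES
        (10, 1, 'shipped'), (11, 1, 'placed'), (12, 2, 'shipped');
""")

# Lineage: a topological sort of the DAG yields an order in which every
# model is built only after the models it selects from.
graph = {name: deps for name, (_, deps) in models.items()}
build_order = list(TopologicalSorter(graph).static_order())
for name in build_order:
    db.execute(f"CREATE VIEW {name} AS {models[name][0]}")

# Tests: each query selects the rows that violate a rule,
# so a test passes when its query returns nothing.
tests = {
    "unique": "SELECT id FROM stg_orders GROUP BY id HAVING COUNT(*) > 1",
    "not_null": "SELECT id FROM stg_orders WHERE customer_id IS NULL",
    "accepted_values": (
        "SELECT id FROM stg_orders WHERE status NOT IN ('placed', 'shipped')"
    ),
    "relationships": (
        "SELECT o.id FROM stg_orders o "
        "LEFT JOIN stg_customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL"
    ),
}
results = {name: db.execute(sql).fetchall() == [] for name, sql in tests.items()}

print(build_order)
print(results)
```

In a real dbt project the models would live in their own SQL files and the tests would be declared in configuration; the sketch only mirrors the underlying ideas.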
Given the points above, the role played by dbt in data transformation cannot be overstated. To get hands-on with dbt, I recommend basic SQL knowledge, data modelling and some understanding of Git. There is no need to be scared of these prerequisites, since you can get started with the dbt Fundamentals course, which is also free.