Standard way of executing all dbt models
In dbt, you can design and build your models as you wish, and then materialize them in your database as tables and/or views (we will use Snowflake for this). To run your models, use the 'dbt run' command.
By default, 'dbt run' runs all models in your dependency graph. Such a dependency graph looks like this, for instance:
Looking at our folder structure in dbt, according to the dependency graph above, only the following models will be executed (converted to tables/views in Snowflake):
Run only part of your models
You may not want to run all your models at once, but only some of them. In dbt, this is possible by adding the '-m' option (short for the older '--models') or the '--select' option (short form '-s') to 'dbt run'. In our example, we select only the following 3 models that we want to run:
dbt run -m stg_orders stg_customers fct_orders
We can see that these models have been run successfully in dbt:
dbt tags for more complex selections of your models
In the example above, we saw that we could run 3 models by listing them after 'dbt run' with the '-m' option. But what if we are working on a huge project with 100 models and we only want to run a few of them (for example, 6 models)? Typing all these model names into the command line is tedious and error-prone!
In this case, dbt tags come in handy. dbt tags can be applied to different kinds of resources, including models, snapshots, seeds, sources (as well as their tables and columns) and more. We will not go into detail about all these resources here, but only focus on tags applied to models.
We start with the syntax for selecting models by tag - it looks like this: 'dbt run -m tag:my_tag' or 'dbt run --select tag:my_tag' (note: 'tag' is singular!)
We can configure dbt tags in 2 ways:
- From a central location, in the 'dbt_project.yml' file. This is the default dbt project file, where all other project settings are configured.
- Alternatively, not from a central point but within an individual model. We can use the dbt 'config' block at the top of the model file for this, for example '{{ config(tags=["my_tag"]) }}'.
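As a minimal sketch of the first option, a tag can be set for a whole folder of models in 'dbt_project.yml'. The project name 'my_project', the folder name 'staging' and the tag name 'my_tag' are placeholders chosen for illustration and are not taken from this project:

```yaml
# dbt_project.yml (fragment)
# Assumption: the project is called "my_project" and has a models/staging folder.
models:
  my_project:
    staging:
      # Every model in models/staging receives this tag.
      +tags: ["my_tag"]
```

The key under 'models:' has to match the project name declared in the same file, and the nested keys follow the folder structure below the models directory.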
Additionally, it is possible to give one resource several tags, which is why the tag value is written in square brackets (indicating that it is a list). If you only want to assign a single tag, you can also write it as a plain string without the square brackets.
Now let's add a tag to the models in our project. We will use the models that are in the 'Core' folder (Models/Marts/Core), which contains 6 models, and configure our dbt tag in the dbt_project.yml file. Without tags, we would have to type all 6 model names individually after 'dbt run -m' to execute them; with a tag, we can achieve the same result by selecting the tag instead. We will name our tag 'core_models'. To make this easier to illustrate, we will build the models in a new schema, which we will call 'tag_schema' (like the tag, this new schema can be set in the same dbt_project.yml).
We can see that all 6 models have been run successfully in dbt:
We can also see that the results have been successfully written to our data warehouse, Snowflake, in the new schema: tag_schema (Snowflake stores and displays unquoted names in capital letters by default).
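The configuration described above could look roughly like the following sketch. The tag 'core_models' and the schema 'tag_schema' come from the text; the project name 'my_project' and the lower-case folder names 'marts' and 'core' are assumptions and should be replaced with the actual names used on disk:

```yaml
# dbt_project.yml (fragment)
# Assumption: project name "my_project", models stored in models/marts/core.
models:
  my_project:
    marts:
      core:
        # Tag all models in the Core folder so they can be selected together.
        +tags: ["core_models"]
        # Build these models in a separate schema for illustration.
        +schema: tag_schema
```

With a configuration along these lines, 'dbt run --select tag:core_models' would select all models in that folder. Depending on how the project's 'generate_schema_name' macro is set up, dbt may prefix the custom schema with the target schema name (for example '<target_schema>_tag_schema').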
dbt tags can be very useful when you are dealing with many complex models that you do not want to run all at once. They are also useful if you decide to run your models in groups.
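Finally, it is also possible to select several dbt tags at the same time by listing them separated by spaces (for example 'dbt run --select tag:first_tag tag:second_tag'). You can even leave out certain models within your tag with the '--exclude' option. More information on this can be found here: https://docs.getdbt.com/reference/resource-configs/tags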