When dbt creates table or a view in a database, it creates it as: {{ database }}.{{ schema }}.{{ identifier }}
, e.g. analytics.finance.payments
. Schema
and database
are defined in the profiles.yml
file (CLI) or in Credentials
(cloud), when setting up a project. In this post, we will tailor make our own schema name, hijacking dbt default behaviour to write models to Snowflake.
Credentials
When we define a project within dbt cloud we usually set a database (target.database
), a schema (target.schema
) and target.name
as our development credentials
. Once dbt runs, it will build our models into a schema with the name we defined. For deployment environment, we would have to define another set of credentials (and among them database
and a schema
), while target.name
is defined per job.
Default dbt Behaviour
By default, dbt uses a macro to generate a schema_name
based on target.schema
and a custom_schema_name
. The standard behaviour of dbt is:
- If a
custom_schema_name
is not specified, the schema of the relation is the target schema
({{ target.schema }}
). - If a
custom_schema_name
is specified, by default, the schema of the relation is{{ target.schema }}_{{
.custom_schema_name
}}
If our project includes a macro that is also named generate_schema_name
, dbt will always use that instead of the default macro.
Custom schema name
The custom_schema_name
configuration attribute can be applied to a specific model by using a config block within a model: {{ config(schema='marketing') }}
. Alternatively, we can apply it to a subdirectory of models by specifying it in the dbt_project.yml
file:
# models in `models/marketing/ will be rendered to the "*_marketing" schema
models:
my_project:
marketing:
+schema: marketing
A Tailored schema name
In order to change and create our own tailored schema_name we should include a macro that is also named generate_schema_name
. I usually create another folder under macros and include the alternative macro there (macros/configs/generate_schema_name.sql
) :
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ target.name }}_{{ default_schema }}
{%- else -%}
{{ target.name }}_{{ default_schema }}_{{ custom_schema_name }}
{%- endif -%}
{%- endmacro %}
Simply speaking, if target.name = dev
, target.schema = analytics_abenbinyamin
and no custom_schema_name
is defined, then the generated schema will be dev_analytics_abenbinyamin
.
On the other hand, in production, where target.name = prod
, target.schema = analytics
and no custom_schema_name
is defined, then the generated schema will be prod_analytics
.
If custom_schema_name
is defined (in a model or in dbt_project.yml file, as mentioned above) then we’ll get dev_analytics_abenbinyamin_marketing
for development environment or prod_analytics_marketing
for production.
This method enables me to tweak and create meaningful schema names according to the needs of the project.
Thank you for reading this blog and feel free to check out my other blogs.
Contact us, and we’ll help you right away.