When dbt creates table or a view in a database, it creates it as: {{ database }}.{{ schema }}.{{ identifier }}, e.g. analytics.finance.payments. Schema and database are defined in the profiles.yml file (CLI) or in Credentials (cloud), when setting up a project. In this post, we will tailor make our own schema name, hijacking dbt default behaviour to write models to Snowflake.

Credentials

When we define a project within dbt cloud we usually set a database (target.database), a schema (target.schema) and target.name as our development credentials. Once dbt runs, it will build our models into a schema with the name we defined. For deployment environment, we would have to define another set of credentials (and among them database and a schema), while target.name is defined per job.

Default dbt Behaviour

By default, dbt uses a macro to generate a schema_name based on target.schema and a custom_schema_name. The standard behaviour of dbt is:

  • If a custom_schema_name is not specified, the schema of the relation is the target schema
    ({{ target.schema }}).
  • If a custom_schema_name is specified, by default, the schema of the relation is
    {{ target.schema }}_{{ custom_schema_name }}.

If our project includes a macro that is also named generate_schema_name, dbt will always use that instead of the default macro.

Custom schema name

The custom_schema_name configuration attribute can be applied to a specific model by using a config block within a model: {{ config(schema='marketing') }}. Alternatively, we can apply it to a subdirectory of models by specifying it in the dbt_project.yml file:

# models in `models/marketing/ will be rendered to the "*_marketing" schema
models:
  my_project:
    marketing:
      +schema: marketing

A Tailored schema name

In order to change and create our own tailored schema_name we should include a macro that is also named generate_schema_name. I usually create another folder under macros and include the alternative macro there (macros/configs/generate_schema_name.sql) :

{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}

        {{ target.name }}_{{ default_schema }}

    {%- else -%}

        {{ target.name }}_{{ default_schema }}_{{ custom_schema_name }}

    {%- endif -%}

{%- endmacro %}

Simply speaking, if target.name = dev, target.schema = analytics_abenbinyamin and no custom_schema_name is defined, then the generated schema will be dev_analytics_abenbinyamin.

On the other hand, in production, where target.name = prod, target.schema = analytics and no custom_schema_name is defined, then the generated schema will be prod_analytics.

If custom_schema_name is defined (in a model or in dbt_project.yml file, as mentioned above) then we’ll get dev_analytics_abenbinyamin_marketing for development environment or prod_analytics_marketing for production.

This method enables me to tweak and create meaningful schema names according to the needs of the project.

Thank you for reading this blog and feel free to check out my other blogs.

Contact us, and we'll help you right away.