Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2819] [Bug] default__alter_relation_add_remove_columns macro does not use quoting with case sensitive Snowflake relation #250

Open
2 tasks done
crystalro0 opened this issue Jul 12, 2023 · 4 comments · May be fixed by #256
Labels
bug Something isn't working

Comments

@crystalro0
Copy link

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

While working with DBT incremental config: on_schema_change='append_new_columns'
The append new columns flag is not able to capture the correct case-sensitive column name and add it to the incremental table causing the run to fail.

Expected Behavior

Be able to apply to add the new columns to an incremental table with the 'append_new_columns' config.

Steps To Reproduce

  1. Environment should have an existing incremental materialization while on_schema_change='append_new_columns'.
  2. Add a case-sensitive column name like "Schema_Test" in the source table and rerun the incremental table. The error will be reproduced.

Relevant log output

invalid identifier '"Schema_Test"'

Environment

- OS:
- Python:
- dbt: 1.5

Which database adapter are you using with dbt?

snowflake

Additional Context

suggested the following as a workaround:

     {% for column in add_columns %}
        add column "{{ column.name }}" {{ column.data_type }}{{ ',' if not loop.last }}
      {% endfor %}{{ ',' if remove_columns | length > 0 }}
            
     {% for column in remove_columns %}
       drop column "{{ column.name }}" {{ ',' if not loop.last }}
     {% endfor %}

saved in project macro folder.

@crystalro0 crystalro0 added bug Something isn't working triage labels Jul 12, 2023
@github-actions github-actions bot changed the title [Bug] default__alter_relation_add_remove_columns macro does not use quoting with case sensitive Snowflake relation [CT-2819] [Bug] default__alter_relation_add_remove_columns macro does not use quoting with case sensitive Snowflake relation Jul 12, 2023
@dbeatty10 dbeatty10 self-assigned this Jul 12, 2023
@dbeatty10
Copy link
Contributor

Thanks for reporting this @crystalro0 🤩

Adding an override to alter_relation_add_remove_columns like you mentioned worked for me as well.

Reprex

models/src_artists.sql

{{
    config(
        materialized='table',
    )
}}

{% if var("version", 0) == 0 %}

    select {{ dbt.current_timestamp() }} as inserted_at, 'taylor' as name

{% else %}

    -- add a non-zero version to the end of the command to get a different version:
    -- --vars "{'version': 1}"
    select {{ dbt.current_timestamp() }} as inserted_at, 'taylor' as Name, 'eras' as "Tour"

{% endif %}

models/dim_artists.sql

{{
    config(
        materialized='incremental',
        on_schema_change='append_new_columns',
    )
}}

select  * from {{ ref("src_artists") }}

Add a case-sensitive column to an incremental model and look at the results afterwards

dbt run --full-refresh
dbt show --inline "select * from {{ ref('dim_artists') }}"
dbt run --vars "{'version': 1}"
dbt show --inline "select * from {{ ref('dim_artists') }}"

Root cause

Ultimately, I believe the root cause is that the current implementation does not match the quoting being applied in this trace of code (listed in reverse order):

Solution

If we update the implementation of default__alter_relation_add_remove_columns, I'd suggest using adapter.quote like elsewhere in that file to achieve the same effect:

            {% for column in add_columns %}
                add column {{ adapter.quote(column.name) }} {{ column.data_type }}{{ ',' if not loop.last }}
            {% endfor %}{{ ',' if remove_columns | length > 0 }}
                    
            {% for column in remove_columns %}
            drop column {{ adapter.quote(column.name) }} {{ ',' if not loop.last }}
            {% endfor %}

Acceptance criteria

  • Add applicable test cases within dbt-core and inherit within dbt-snowflake (confirming that an error is raised in dbt-snowflake)
  • Solve the reported issue, optionally using an implementation like above

@dbeatty10 dbeatty10 removed the triage label Jul 12, 2023
@dbeatty10 dbeatty10 removed their assignment Jul 12, 2023
@alexnikitchuk
Copy link

we face the same issue with postgres adapter. Any chance it will be fixed?

@colin-rogers-dbt
Copy link
Contributor

let's also validate against dbt-postgres

@jeremyyeo
Copy link

Same deal on BQ even if you set the quote config on the newly added column:

-- models/stg_customers.sql
{{
    config(
        materialized='incremental',
        unique_key='id',
        on_schema_change='sync_all_columns'
    )
}}

select 1 id, 'alice' as first_name
# models/schema.yml
models:
  - name: stg_customers
    columns:
      - name: id
      - name: first_name

Do a first build without issues.

$ dbt --debug build --full-refresh
00:17:25  On model.my_dbt_project.stg_customers: /* {"app": "dbt", "dbt_version": "1.8.5", "profile_name": "all", "target_name": "bq", "node_id": "model.my_dbt_project.stg_customers"} */

    create or replace table `cse-sandbox-319708`.`dbt_jyeo`.`stg_customers`

    OPTIONS()
    as (

select 1 id, 'alice' as first_name
    );
  
00:17:26  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:49d34f81-2697-4f4e-9928-d6bbf353f337&page=queryresults
00:17:29  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '1d114b72-c90f-42be-9dfc-173e7831f4c2', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12bae6b50>]}
00:17:29  1 of 1 OK created sql incremental model dbt_jyeo.stg_customers ................. [CREATE TABLE (1.0 rows, 0 processed) in 7.09s]

Add a new column but using a reserved SQL keyword:

-- models/stg_customers.sql
{{
    config(
        materialized='incremental',
        unique_key='id',
        on_schema_change='sync_all_columns'
    )
}}

select 1 id, 'alice' as first_name, 'premium' as `group`
# models/schema.yml
models:
  - name: stg_customers
    columns:
      - name: id
      - name: first_name
      - name: group
        quote: true
$ dbt --debug build
00:18:50  On model.my_dbt_project.stg_customers: /* {"app": "dbt", "dbt_version": "1.8.5", "profile_name": "all", "target_name": "bq", "node_id": "model.my_dbt_project.stg_customers"} */

    create or replace table `cse-sandbox-319708`.`dbt_jyeo`.`stg_customers__dbt_tmp`
      
    OPTIONS(
      expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 12 hour)
    )
    as (
      
select 1 id, 'alice' as first_name, 'premium' as `group`
    );
  
00:18:53  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:5fdf3b2d-82e2-46f6-8ae7-ece16515cb4a&page=queryresults
00:18:59  
    In `cse-sandbox-319708`.`dbt_jyeo`.`stg_customers`:
        Schema changed: True
        Source columns not in target: [<BigQueryColumn group (STRING, NULLABLE)>]
        Target columns not in source: []
        New column types: []
  
00:18:59  On model.my_dbt_project.stg_customers: /* {"app": "dbt", "dbt_version": "1.8.5", "profile_name": "all", "target_name": "bq", "node_id": "model.my_dbt_project.stg_customers"} */

    alter table `cse-sandbox-319708`.`dbt_jyeo`.`stg_customers`
               add column group STRING

00:19:00  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:28b881d0-a878-452b-9ba1-3f9bbc404e67&page=queryresults
00:19:00  BigQuery adapter: Retry attempt 1 of 1 after error: BadRequest('Syntax error: Unexpected keyword GROUP at [6:27]; reason: invalidQuery, location: query, message: Syntax error: Unexpected keyword GROUP at [6:27]')
00:19:01  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:2c470b5b-5562-4550-ad66-9e473e1725ee&page=queryresults
00:19:01  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:2c470b5b-5562-4550-ad66-9e473e1725ee&page=queryresults
00:19:01  Database Error in model stg_customers (models/stg_customers.sql)
  Syntax error: Unexpected keyword GROUP at [6:27]
00:19:01  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '9c085ee1-adb6-4667-9365-7e77f2c75256', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x149853710>]}
00:19:01  1 of 1 ERROR creating sql incremental model dbt_jyeo.stg_customers ............. [ERROR in 11.46s]

^ In this scenario, the user has even gone the extra mile to set the quote config on the column - but similar not respected when being added to the existing table.


This issue should be renamed - something like:

Across all adapters, dbt does not quote column names when being added to an existing incremental model via on_schema_change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
7 participants