Add Suggestions for Snowflake Iceberg Configurations Doc #6365

Draft
wants to merge 1 commit into
base: current
23 changes: 12 additions & 11 deletions website/docs/reference/resource-configs/snowflake-configs.md
@@ -13,13 +13,15 @@ To-do:

## Iceberg table format <Lifecycle status="beta"/>

The dbt-snowflake adapter supports the Iceberg table format. It is available for three of the Snowflake materializations:
The dbt-snowflake adapter supports the Iceberg table format. Prior to developing an Iceberg table with dbt, you need to [configure an external volume in Snowflake](https://docs.snowflake.com/en/user-guide/tables-iceberg-configure-external-volume).

In dbt, Iceberg is available for three of the Snowflake materializations:

- [Table](/docs/build/materializations#table)
- [Incremental](/docs/build/materializations#incremental)
- [Dynamic](#dynamic-tables)
- [Dynamic](#dynamic-tables)

For now, to create Iceberg tables, you must implement a [behavior flag](/reference/global-configs/behavior-changes) due to performance impact related to using Iceberg tables. Snowflake does not support `is_iceberg` on the `Show Objects` query, which dbt depends on for metadata.
Author

I removed the part about SHOW OBJECTS as it felt a little niche / tangential, but let me know if you disagree!

Collaborator

Just saw this pop up - I disagree with this :) This is a dbt metadata performance impact rather than from using Iceberg tables. (also this helps snowflake customers advocate for us)

For now, to create Iceberg tables, you must implement a [behavior flag](/reference/global-configs/behavior-changes) due to performance impact related to using Iceberg tables.
amychen1776 marked this conversation as resolved.

To use Iceberg, set the `enable_iceberg_materializations` flag to `True` in your dbt_project.yml:

@@ -34,15 +36,14 @@ flags:

</File>
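
The hunk above collapses the `dbt_project.yml` snippet itself. As a minimal sketch, assuming the standard top-level `flags:` block named in the hunk header, the setting would look roughly like this:

```yaml
# dbt_project.yml
# Sketch only: opts in to Iceberg materializations via the behavior flag named above.
flags:
  enable_iceberg_materializations: True
```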


The following configurations are supported.
For more information, check out the Snowflake reference for [`CREATE ICEBERG TABLE` (Snowflake as the catalog)](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake).

| Field | Type | Required | Description | Sample input | Note |
| Parameter | Type | Required | Description | Sample input | Note |
| --------------------- | ------ | -------- | -------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. |
| External volume | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake#:~:text=Snowflake%20Table%20Structures.-,external_volume) |
| Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |
| `table_format` | String | Yes | Configures the object's table format. | `iceberg` | `iceberg` is the only accepted value. |
| `external_volume` | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/user-guide/tables-iceberg-configure-external-volume#set-a-default-external-volume-at-the-account-database-or-schema-level) |
| `base_location_subpath` | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you **do not specify this**. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |
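
As a rough illustration of how the parameters in the table fit together, the sketch below sets them for a folder of models in `dbt_project.yml`. The project name `my_project` and the folder `marketing` are hypothetical, and it assumes these configs can be applied at the project level with the usual `+` prefix, like other model configs. Treat it as a starting point rather than a confirmed pattern.

```yaml
# dbt_project.yml
# Hypothetical project and folder names; parameter names come from the table above.
models:
  my_project:
    marketing:
      +materialized: table
      +table_format: iceberg              # the only accepted value
      +external_volume: s3_iceberg_snow   # can be omitted if a default volume is already set
      # base_location_subpath is intentionally left unset, per the recommendation above
```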

### Example configuration

@@ -54,7 +55,7 @@ To configure an Iceberg table materialization in dbt, refer to the example confi

{{
config(
materialized = "table",
materialized="table",
table_format="iceberg",
external_volume="s3_iceberg_snow",
)
@@ -79,14 +80,14 @@ A theoretical (but not recommended) use case is re-using an `EXTERNAL VOLUME` wh

#### Rationale

dbt manages `base_location` on behalf of users to enforce best practices. With Snowflake-managed Iceberg format tables, the user owns and maintains the data storage of the tables in an external storage solution (the declared `external volume`). The `base_ location` parameter declares where to write the data within the external volume. The Snowflake Iceberg catalog keeps track of your Iceberg table regardless of where the data lives within the `external volume` declared and the `base_location` provided. However, Snowflake permits passing anything into the `base_location` field, including an empty string, even reusing the same path across multiple tables. This behavior could result in future technical debt because it will limit the ability to:
dbt manages `base_location` on behalf of users to enforce best practices. With Snowflake-managed Iceberg format tables, the user owns and maintains the data storage of the tables in an external storage solution (the declared `external volume`). The `base_location` parameter declares where to write the data within the external volume. The Snowflake Iceberg catalog keeps track of your Iceberg table regardless of where the data lives within the `external volume` declared and the `base_location` provided. However, Snowflake permits passing anything into the `base_location` field, including an empty string, even reusing the same path across multiple tables. This behavior could result in future technical debt because it will limit the ability to:

- Navigate the underlying object store (S3/Azure blob)
- Read Iceberg tables via an object-store integration
- Grant schema-specific access to tables via object store
- Use a crawler pointed at the tables within the external volume to build a new catalog with another tool

To maintain best practices, we enforce an input. Currently, we do not support overriding the default `base location` input but will consider it based on user feedback.
To maintain best practices, we enforce an input. Currently, we do not support overriding the default `base_location` input but will consider it based on user feedback.

In summary, dbt-snowflake does not support arbitrary definition of `base_location` for Iceberg tables. Instead, dbt, by default, writes your tables within a `_dbt/{SCHEMA_NAME}/{TABLE_NAME}` prefix to ensure easier object-store observability and auditability.
