Skip to content

Commit

Permalink
Check for period over period changes in column sum
Browse files Browse the repository at this point in the history
  • Loading branch information
m-d-bowerman committed Nov 9, 2023
1 parent 0c61978 commit d616254
Show file tree
Hide file tree
Showing 2 changed files with 91 additions and 0 deletions.
22 changes: 22 additions & 0 deletions docs/reference/data_checks.md
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,28 @@ Example:
{{ not_null(["submission_date", "os"], "submission_date = @submission_date") }}
```

### period_over_period_change ([source](../../tests/checks/period_over_period_change.jinja))

Usage:

```
Arguments:
metric_columns: List[str] - A list of columns to calculate the period-over-period percent change.
period_column: str - The column that defines the periods over which changes in the metric columns will be summed.
threshold: float - The value in (0, 1) under which the percent change in metrics is expected to fall.
dimensions: Optional[List[str]] - A list of columns to define the dimensions over which the metrics should be summed.
date_where: Optional[str] - A condition that will be injected into the `WHERE` clause of the check that defines which date to base the comparison on. Metrics in the period that satisfies this date condition will be the value for the percent change calcuation based on the previous period's value. For example, "created_date = @submission_date" so that the check is only executed when a new row is created.
field_where: Optional[str] - A condition that will be injected into the `WHERE` clause of the check. For example, "partner_code = 'Google'" so that the check is only executed for a specific partner.
```

Example:

```sql
#fail
{{ period_over_period_change(["impressions", "clicks"], "submission_date", .25, ["country_code", "form_factor"], "submission_date = @submission_date", "country_code IN ('US','DE')") }}
```

Please keep in mind the below checks can be combined and specified in the same `checks.sql` file. For example:

```sql
Expand Down
69 changes: 69 additions & 0 deletions tests/checks/period_over_period_change.jinja
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
{% macro period_over_period_change(columns, period_column, threshold, dimensions, date_where, field_where) %}
{% if columns is string %}
{% set columns = [columns] %}
{% endif %}
{% if dimensions is string %}
{% set dimensions = [dimensions] %}
{% endif %}
WITH cte AS (
SELECT
{{ period_column }},
{% for dim in dimensions %}
{{ dim }},
{% endfor %}
{% for col in columns %}
SUM(COALESCE({{ col }}, 0)) AS {{ col }},
{% endfor %}
FROM
`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`
{% if field_where %}
WHERE {{ field_where }}
{% endif %}
GROUP BY
{% for dim in dimensions %}
{{ dim }},
{% endfor %}
{{ period_column }}
),
changes AS (
SELECT [
{% for col in columns %}
IF(
COUNTIF(
SAFE_DIVIDE(
{{ col }},
LAG({{ col }})
OVER(
{% if dimensions %}
PARTITION BY
{% for dim in dimensions %}
{{ dim }}
{{ ", " if not loop.last else "" }}
{% endfor %}
{% endif %}
ORDER BY {{ period_column }} ASC
)
) > {{ threshold }}
) > 0,
"{{ col }}",
NULL
){% if not loop.last -%},{% endif -%}
{% endfor %}
] AS checks
FROM cte
WHERE {{ date_where }}
),
change_checks AS (
SELECT ARRAY_AGG(u IGNORE NULLS) AS checks FROM changes, UNNEST(checks) AS u
)
SELECT IF(
(SELECT ARRAY_LENGTH(checks) FROM change_checks) > 0,
ERROR(
CONCAT(
"Columns with period over period changes over {{ threshold }}: ",
(SELECT ARRAY_TO_STRING(checks, ", ") FROM range_checks)
)
),
NULL
);
{% endmacro %}

1 comment on commit d616254

@dataops-ci-bot
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration report for "Check for period over period changes in column sum"

sql.diff

Click to expand!
Only in /tmp/workspace/generated-sql/sql/: glam-fenix-dev
Only in /tmp/workspace/generated-sql/sql/: moz-fx-cjms-nonprod-9a36
Only in /tmp/workspace/generated-sql/sql/: moz-fx-cjms-prod-f3c7
Only in /tmp/workspace/generated-sql/sql/: moz-fx-data-bq-performance
Only in /tmp/workspace/generated-sql/sql/: moz-fx-data-experiments
Only in /tmp/workspace/generated-sql/sql/: moz-fx-data-marketing-prod
Only in /tmp/workspace/generated-sql/sql/: moz-fx-data-shared-prod
Only in /tmp/workspace/generated-sql/sql/: mozfun
Only in /tmp/workspace/main-generated-sql/sql/: sql

Please sign in to comment.