Skip to content

Commit

Permalink
Migrate Off Circle CI / To Github Actions + dagger.io (#923)
Browse files Browse the repository at this point in the history
* Add Github action for integration test

* Update tox

* Fetch spark from https link

* Use Spark version 3.1.2

* Seperate running Spark session and thrift

* Use Spark 3.1.2 and Hadoop 3.2

* Reset tox.ini

* Remove base pythons in tox.ini

* Fix reference to Docker compose file

* Remove timeout

* Remove artifact steps

* Bump Spark and Hadoop versions

* Reset Spark and Hadoop version

* Update comment

* Add changie

* add databricks and PR execution protections

* use single quotes

* remove `_target` suffix

* add comment to test

* specify container user as root

* formatting

* remove python setup for pre-existing container

* download simba

* fix curl call

* fix curl call

* fix curl call

* fix curl call

* fix curl call

* fix curl call

* fix db test naming

* confirm ODBC driver installed

* add odbc driver env var

* add odbc driver env var

* specify platform

* check odbc driver integrity

* add dbt user env var

* add dbt user env var

* fix host_name env var

* try removing architecture arg

* swap back to pull_request_target

* try running on host instead of container

* Update .github/workflows/integration.yml

Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>

* try running odbcinst -j

* remove bash

* add sudo

* add sudo

* update odbc.ini

* install libsasl2-modules-gssapi-mit

* install libsasl2-modules-gssapi-mit

* set -e on odbc install

* set -e on odbc install

* set -e on odbc install

* sudo echo odbc.inst

* remove postgres components

* remove release related items

* remove irrelevant output

* move long bash script into its own file

* update integration.yml to align with other adapters

* revert name change

* revert name change

* combine databricks and spark tests

* combine databricks and spark tests

* Add dagger

* remove platform

* add dagger setup

* add dagger setup

* set env vars

* install requirements

* install requirements

* add DEFAULT_ENV_VARS and test_path arg

* remove circle ci

* formatting

* update changie

* Update .changes/unreleased/Under the Hood-20230929-161218.yaml

Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>

* formatting fixes and simplify env_var handling

* remove tox, update CONTRIBUTING.md and cleanup GHA workflows

* remove tox, update CONTRIBUTING.md and cleanup GHA workflows

* install test reqs in main.yml

* install test reqs in main.yml

* formatting

* remove tox from dev-requirements.txt and Makefile

* clarify spark crt instantiation

* add comments on python-version

---------

Co-authored-by: Cor Zuurmond <jczuurmond@protonmail.com>
Co-authored-by: Florian Eiden <florian.eiden@fleid.fr>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Mike Alfare <13974384+mikealfare@users.noreply.github.com>
Co-authored-by: Mike Alfare <mike.alfare@dbtlabs.com>
  • Loading branch information
6 people authored Jan 10, 2024
1 parent 5210d0a commit f9f75e9
Show file tree
Hide file tree
Showing 20 changed files with 408 additions and 242 deletions.
6 changes: 6 additions & 0 deletions .changes/unreleased/Under the Hood-20230929-161218.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
kind: Under the Hood
body: Add GitHub action for integration testing and use dagger-io to run tests. Remove CircleCI workflow.
time: 2023-09-29T16:12:18.968755+02:00
custom:
Author: JCZuurmond, colin-rogers-dbt
Issue: "719"
136 changes: 0 additions & 136 deletions .circleci/config.yml

This file was deleted.

17 changes: 17 additions & 0 deletions .github/scripts/update_dbt_core_branch.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
#!/bin/bash -e
set -e

git_branch=$1
target_req_file="dev-requirements.txt"
core_req_sed_pattern="s|dbt-core.git.*#egg=dbt-core|dbt-core.git@${git_branch}#egg=dbt-core|g"
tests_req_sed_pattern="s|dbt-core.git.*#egg=dbt-tests|dbt-core.git@${git_branch}#egg=dbt-tests|g"
if [[ "$OSTYPE" == darwin* ]]; then
# mac ships with a different version of sed that requires a delimiter arg
sed -i "" "$core_req_sed_pattern" $target_req_file
sed -i "" "$tests_req_sed_pattern" $target_req_file
else
sed -i "$core_req_sed_pattern" $target_req_file
sed -i "$tests_req_sed_pattern" $target_req_file
fi
core_version=$(curl "https://github.com/raw/dbt-labs/dbt-core/${git_branch}/core/dbt/version.py" | grep "__version__ = *"|cut -d'=' -f2)
bumpversion --allow-dirty --new-version "$core_version" major
112 changes: 112 additions & 0 deletions .github/workflows/integration.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
# **what?**
# Runs integration tests.

# **why?**
# Ensure code runs as expected.

# **when?**
# This will run for all PRs, when code is pushed to a release
# branch, and when manually triggered.

name: Adapter Integration Tests

on:
push:
branches:
- "main"
- "*.latest"

pull_request_target:
paths-ignore:
- ".changes/**"
- ".flake8"
- ".gitignore"
- "**.md"

workflow_dispatch:
inputs:
dbt-core-branch:
description: "branch of dbt-core to use in dev-requirements.txt"
required: false
type: string

# explicitly turn off permissions for `GITHUB_TOKEN`
permissions: read-all

# will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
concurrency:
group: ${{ github.workflow }}-${{ github.event_name }}-${{ contains(github.event_name, 'pull_request_target') && github.event.pull_request.head.ref || github.sha }}
cancel-in-progress: true

defaults:
run:
shell: bash

jobs:

test:
name: ${{ matrix.test }}
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
test:
- "apache_spark"
- "spark_session"
- "databricks_sql_endpoint"
- "databricks_cluster"
- "databricks_http_cluster"

env:
DBT_INVOCATION_ENV: github-actions
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
DBT_DATABRICKS_CLUSTER_NAME: ${{ secrets.DBT_DATABRICKS_CLUSTER_NAME }}
DBT_DATABRICKS_HOST_NAME: ${{ secrets.DBT_DATABRICKS_HOST_NAME }}
DBT_DATABRICKS_ENDPOINT: ${{ secrets.DBT_DATABRICKS_ENDPOINT }}
DBT_DATABRICKS_TOKEN: ${{ secrets.DBT_DATABRICKS_TOKEN }}
DBT_DATABRICKS_USER: ${{ secrets.DBT_DATABRICKS_USERNAME }}
DBT_TEST_USER_1: "buildbot+dbt_test_user_1@dbtlabs.com"
DBT_TEST_USER_2: "buildbot+dbt_test_user_2@dbtlabs.com"
DBT_TEST_USER_3: "buildbot+dbt_test_user_3@dbtlabs.com"

steps:
- name: Check out the repository
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

# explicitly checkout the branch for the PR,
# this is necessary for the `pull_request` event
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

# the python version used here is not what is used in the tests themselves
- name: Set up Python for dagger
uses: actions/setup-python@v4
with:
python-version: "3.11"

- name: Install python dependencies
run: |
python -m pip install --user --upgrade pip
python -m pip --version
python -m pip install -r dagger/requirements.txt
- name: Update dev_requirements.txt
if: inputs.dbt-core-branch != ''
run: |
pip install bumpversion
./.github/scripts/update_dbt_core_branch.sh ${{ inputs.dbt-core-branch }}
- name: Run tests for ${{ matrix.test }}
run: python dagger/run_dbt_spark_tests.py --profile ${{ matrix.test }}
15 changes: 6 additions & 9 deletions .github/workflows/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@ on:
branches:
- "main"
- "*.latest"
- "releases/*"
pull_request:
workflow_dispatch:

Expand Down Expand Up @@ -81,10 +80,6 @@ jobs:
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]

env:
TOXENV: "unit"
PYTEST_ADDOPTS: "-v --color=yes --csv unit_results.csv"

steps:
- name: Check out the repository
uses: actions/checkout@v3
Expand All @@ -100,10 +95,12 @@ jobs:
sudo apt-get install libsasl2-dev
python -m pip install --user --upgrade pip
python -m pip --version
python -m pip install tox
tox --version
- name: Run tox
run: tox
python -m pip install -r requirements.txt
python -m pip install -r dev-requirements.txt
python -m pip install -e .
- name: Run unit tests
run: python -m pytest --color=yes --csv unit_results.csv -v tests/unit

- name: Get current date
if: always()
Expand Down
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,5 @@ test.env
.hive-metastore/
.spark-warehouse/
dbt-integration-tests
/.tool-versions
/.hypothesis/*
24 changes: 20 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,11 +65,27 @@ $EDITOR test.env
### Test commands
There are a few methods for running tests locally.

#### `tox`
`tox` takes care of managing Python virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel, for example you can run unit tests for Python 3.8, Python 3.9, and `flake8` checks in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py38`. The configuration of these tests are located in `tox.ini`.
#### dagger
To run functional tests we rely on [dagger](https://dagger.io/). This launches a virtual container or containers to test against.

#### `pytest`
Finally, you can also run a specific test or group of tests using `pytest` directly. With a Python virtualenv active and dev dependencies installed you can do things like:
```sh
pip install -r dagger/requirements.txt
python dagger/run_dbt_spark_tests.py --profile databricks_sql_endpoint --test-path tests/functional/adapter/test_basic.py::TestSimpleMaterializationsSpark::test_base
```

`--profile`: required, this is the kind of spark connection to test against

_options_:
- "apache_spark"
- "spark_session"
- "databricks_sql_endpoint"
- "databricks_cluster"
- "databricks_http_cluster"

`--test-path`: optional, this is the path to the test file you want to run. If not specified, all tests will be run.

#### pytest
Finally, you can also run a specific test or group of tests using `pytest` directly (if you have all the dependencies set up on your machine). With a Python virtualenv active and dev dependencies installed you can do things like:

```sh
# run all functional tests
Expand Down
7 changes: 4 additions & 3 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
.PHONY: dev
dev: ## Installs adapter in develop mode along with development dependencies
@\
pip install -e . -r requirements.txt -r dev-requirements.txt && pre-commit install
pip install -e . -r requirements.txt -r dev-requirements.txt -r dagger/requirements.txt && pre-commit install

.PHONY: dev-uninstall
dev-uninstall: ## Uninstalls all packages while maintaining the virtual environment
Expand Down Expand Up @@ -40,12 +40,13 @@ linecheck: ## Checks for all Python lines 100 characters or more
.PHONY: unit
unit: ## Runs unit tests with py38.
@\
tox -e py38
python -m pytest tests/unit

.PHONY: test
test: ## Runs unit tests with py38 and code checks against staged changes.
@\
tox -p -e py38; \
python -m pytest tests/unit; \
python dagger/run_dbt_spark_tests.py --profile spark_session \
pre-commit run black-check --hook-stage manual | grep -v "INFO"; \
pre-commit run flake8-check --hook-stage manual | grep -v "INFO"; \
pre-commit run mypy-check --hook-stage manual | grep -v "INFO"
Expand Down
3 changes: 0 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,6 @@
<a href="https://github.com/dbt-labs/dbt-spark/actions/workflows/main.yml">
<img src="https://github.com/dbt-labs/dbt-spark/actions/workflows/main.yml/badge.svg?event=push" alt="Unit Tests Badge"/>
</a>
<a href="https://circleci.com/gh/dbt-labs/dbt-spark/?branch=main">
<img src="https://circleci.com/gh/dbt-labs/dbt-spark/tree/main.svg?style=shield" alt="Integration Tests Badge"/>
</a>
</p>

**[dbt](https://www.getdbt.com/)** enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Expand Down
2 changes: 2 additions & 0 deletions dagger/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
dagger-io~=0.8.0
python-dotenv
Loading

0 comments on commit f9f75e9

Please sign in to comment.