Ferlab-Ste-Justine/unic-dag

UNIC dag

Set up Python virtual environment

Create venv :

python -m venv venv

Activate venv :

source venv/bin/activate

Install requirements :

pip install -r requirements

Run DAG with config

In the Airflow UI, choose a DAG and select Trigger DAG w/ config. In the JSON config, you can set the following parameters to override their default values:

{
  "branch": "master",
  "version": "latest"
}
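As a rough illustration of how such an override behaves, the sketch below layers a trigger config on top of the defaults shown above. The names `DEFAULT_CONF` and `resolve_conf` are hypothetical and are not taken from this repository; the DAGs themselves may resolve their parameters differently.

```python
# Hypothetical defaults mirroring the JSON config shown above.
DEFAULT_CONF = {"branch": "master", "version": "latest"}


def resolve_conf(trigger_conf=None):
    """Return the defaults with any keys from the trigger config applied on top."""
    return {**DEFAULT_CONF, **(trigger_conf or {})}


# Only "version" is overridden; "branch" keeps its default value.
print(resolve_conf({"version": "1.0.0"}))
```

Any key left out of the trigger config keeps its default value in this sketch.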

Release a version

To release a version of an enriched dataset and publish it to researchers, run the enriched DAG with a config specifying the version number.

{
  "version": "1.0.0"
}

The version number must follow the format "x.x.x", where each x is a number.
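A version string can be checked against this format before triggering the DAG. The following is a minimal sketch, assuming each x may be one or more ASCII digits; the helper name `is_valid_version` is hypothetical and is not part of this repository.

```python
import re

# Assumption: each "x" is one or more ASCII digits, e.g. "1.0.0" or "10.2.33".
VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version(version: str) -> bool:
    """Return True if the whole string has the form x.x.x."""
    return VERSION_PATTERN.fullmatch(version) is not None


for candidate in ("1.0.0", "10.2.33", "1.0", "latest"):
    print(candidate, is_valid_version(candidate))
```

Under this check, "latest" is rejected, so a release config has to state an explicit version number.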

Run Airflow locally with Docker

Create .env file :

cp .env.sample .env

Deploy stack :

docker-compose up

Log in to the Airflow UI :

  • URL : http://localhost:50080
  • Username : airflow
  • Password : airflow

Create Airflow variables (Airflow UI => Admin => Variables) :

  • dags_path : /opt/airflow/dags
  • base_url (optional) : http://localhost:50080

For faster variable creation, upload the variables.json file on the Variables page.
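Airflow's variable import reads a flat JSON object that maps variable names to values. The sketch below renders the two variables listed above in that shape; it is an illustration only, and the variables.json file shipped with the repository may contain different or additional entries.

```python
import json

# The two variables listed above; base_url is optional.
variables = {
    "dags_path": "/opt/airflow/dags",
    "base_url": "http://localhost:50080",
}


def render_variables(values):
    """Serialize variables as a flat JSON object of name/value pairs."""
    return json.dumps(values, indent=2)


print(render_variables(variables))
```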

Test one task

docker-compose exec airflow-scheduler airflow tasks test <dag> <task> 2022-01-01
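When the same command is run repeatedly for different tasks, it can be assembled programmatically. This is a minimal sketch; `task_test_command`, `my_dag` and `my_task` are hypothetical placeholders for illustration.

```python
import shlex


def task_test_command(dag_id, task_id, logical_date="2022-01-01"):
    """Build the argument list for testing one task in the scheduler container."""
    return [
        "docker-compose", "exec", "airflow-scheduler",
        "airflow", "tasks", "test", dag_id, task_id, logical_date,
    ]


# Print the command; pass the list to subprocess.run(..., check=True) to execute it.
print(shlex.join(task_test_command("my_dag", "my_task")))
```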

Optional : Set up MinIO

Log in to the MinIO console :

  • URL : http://localhost:59001
  • Username : minioadmin
  • Password : minioadmin

Create Airflow variable (Airflow UI => Admin => Variables) :

  • s3_conn_id : minio

Create Airflow connection (Airflow UI => Admin => Connections) :

  • Connection Id : minio
  • Connection Type : Amazon S3
  • Extra :
{
  "host": "http://minio:9000",
  "aws_access_key_id": "minioadmin",
  "aws_secret_access_key": "minioadmin"
}
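To check the same credentials outside Airflow, the Extra fields can be mapped onto the keyword arguments of an S3 client. This is a sketch under assumptions: `s3_client_kwargs` is a hypothetical helper, it does not reflect how Airflow reads the connection internally, and the host name `minio` is assumed to resolve only inside the Docker Compose network.

```python
import json

# The Extra JSON from the connection above.
EXTRA = """
{
  "host": "http://minio:9000",
  "aws_access_key_id": "minioadmin",
  "aws_secret_access_key": "minioadmin"
}
"""


def s3_client_kwargs(extra_json):
    """Map the connection Extra fields to boto3 client keyword arguments."""
    extra = json.loads(extra_json)
    return {
        "endpoint_url": extra["host"],
        "aws_access_key_id": extra["aws_access_key_id"],
        "aws_secret_access_key": extra["aws_secret_access_key"],
    }


# With boto3 installed: client = boto3.client("s3", **s3_client_kwargs(EXTRA))
print(s3_client_kwargs(EXTRA)["endpoint_url"])
```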

Optional : Set up Slack

Create Airflow variable (Airflow UI => Admin => Variables) :

  • slack_hook_url : https://hooks.slack.com/services/...
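Slack incoming webhooks accept a JSON body with a `text` field sent by HTTP POST. The sketch below builds such a request for a quick manual check of the hook URL; `build_slack_request` is a hypothetical helper, and the notification code used by the DAGs may be structured differently.

```python
import json
import urllib.request


def build_slack_request(hook_url, text):
    """Build a POST request carrying a plain-text Slack message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        hook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send, pass the value of slack_hook_url as hook_url:
# urllib.request.urlopen(build_slack_request(hook_url, "Test message from Airflow"))
```

Building the request separately from sending it keeps the payload easy to inspect without calling Slack.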