Demonstration of various OSS technologies to construct an ETL pipeline

sayantikabanik/ETL-CLI-from-scratch

ETL/CLI-from-scratch

Demonstration of various OSS technologies to construct an ETL/CLI pipeline

Dependency management using conda

Steps to create environment

conda env create --file environment.yml
conda activate etl
conda list
conda info
conda deactivate

Airflow

Creating user and running the webserver

1(a). Set the user credentials using flask fab and follow the prompts on the command line

FLASK_APP=airflow.www.app flask fab create-admin

1(b). Alternatively, create the user with Airflow's users create command

airflow users create \
    --username admin \
    --password your_password \
    --firstname your_first_name \
    --lastname your_last_name \
    --role Admin \
    --email your_email@some.com
2. Run the command below to confirm that the user was created

airflow users list

3. Initialise the Airflow database (if the metadata database has not been initialised yet, run this before creating the user in step 1)

airflow db init

4. Set up a MySQL database and secure it with a password - macOS setup instructions

5. Make changes in airflow.cfg

6. Start the Airflow webserver and scheduler

airflow webserver
airflow scheduler

Doit

def task_compile():
    # Recompile main.o only when main.c or defs.h changes
    return {'actions': ["cc -c main.c"],
            'file_dep': ["main.c", "defs.h"],
            'targets': ["main.o"],
            'doc': 'nice message'
            }
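
The same task shape can describe an ETL step. The following is a minimal sketch of a dodo.py, not code from this repository: the file names raw.csv and clean.json and the transform helper are hypothetical placeholders. It uses a Python callable as the action instead of a shell command.

```python
# dodo.py - illustrative sketch; file names and the helper are assumptions
import csv
import json

RAW = "raw.csv"        # hypothetical input file
CLEAN = "clean.json"   # hypothetical output file


def transform(source, dest):
    """Read CSV rows from `source` and write them to `dest` as a JSON list."""
    with open(source, newline="") as src:
        rows = list(csv.DictReader(src))
    with open(dest, "w") as out:
        json.dump(rows, out)


def task_transform():
    """Convert the raw CSV into JSON."""
    return {
        # A (callable, args) tuple makes doit call transform(RAW, CLEAN)
        'actions': [(transform, [RAW, CLEAN])],
        # The task is re-run only when the input file changes
        'file_dep': [RAW],
        'targets': [CLEAN],
        'doc': 'convert raw.csv to clean.json',
    }
```

With this file in the working directory, running doit should execute the task, and doit list should show it together with its doc string.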

Typer

Command usage - python <filename.py> <args*>
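
As an illustration of that usage pattern, here is a minimal sketch of a Typer command. It is not taken from this repository: the file name cli.py, the extract command, and its source and limit parameters are all hypothetical.

```python
# cli.py - illustrative sketch; command name and parameters are assumptions
import typer

app = typer.Typer()


def describe_run(source: str, limit: int) -> str:
    """Build the message the command prints (kept separate so it is easy to test)."""
    return f"Extracting up to {limit} rows from {source}"


@app.command()
def extract(source: str, limit: int = 10):
    """Pretend to extract rows from SOURCE."""
    # `source` has no default, so Typer exposes it as a required positional argument;
    # `limit` has a default, so it becomes the optional --limit option.
    typer.echo(describe_run(source, limit))
```

To run it from the shell, the file would end with a call to app(), typically under an if __name__ == "__main__": guard. Because the app defines a single command, no subcommand name is needed: python cli.py data.csv --limit 5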
