ETL Project with Data Fusion, Airflow, and BigQuery

This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.

Refer to the YouTube videos for this project:

Part 1: YouTube | Part 2: YouTube

Overview

The project aims to perform the following tasks:

  1. Data Extraction: Extract data using Python.
  2. Data Masking: Apply data masking and encoding techniques to sensitive information in Cloud Data Fusion before loading it into BigQuery.
  3. Data Loading: Load the transformed data into Google BigQuery tables.
  4. Orchestration: Automate the complete data pipeline using Airflow (Cloud Composer).
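
To make step 2 more concrete, here is a minimal plain-Python sketch of what masking and encoding a record might look like. It is only an illustration: in this project these transformations are applied in Cloud Data Fusion, not in Python, and the field names (`ssn`, `email`, `address`) and helper functions below are hypothetical, not taken from the repository.

```python
import base64
import hashlib


def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace all but the last `visible` characters with `fill`."""
    hidden = max(len(value) - max(visible, 0), 0)
    return fill * hidden + value[hidden:]


def hash_value(value: str) -> str:
    """One-way masking: hex-encoded SHA-256 digest of the value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def encode_value(value: str) -> str:
    """Reversible Base64 encoding (obfuscation only, not encryption)."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def mask_record(record: dict) -> dict:
    """Return a copy of `record` with the hypothetical sensitive fields protected."""
    masked = dict(record)  # copy so the input record is left unchanged
    masked["ssn"] = mask_keep_last(record["ssn"])
    masked["email"] = hash_value(record["email"])
    masked["address"] = encode_value(record["address"])
    return masked


if __name__ == "__main__":
    sample = {
        "name": "Jane Doe",
        "ssn": "123-45-6789",
        "email": "jane@example.com",
        "address": "1 Main St",
    }
    print(mask_record(sample))
```

Partial masking keeps a value recognizable to a human, hashing keeps it joinable without revealing it, and Base64 only changes the representation, so which technique suits which column depends on how the data is used downstream in BigQuery.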

image

Architecture

[Image: architecture diagram]