mkung8889/etl_project

ETL: Extract, Transform, Load Project

Extract: read the data, often from multiple sources/formats.

Transform: clean and structure the data to suit business needs.

Load: load the data into a database for storage that can be used for future analysis or business use.

Project Repo

Objective:

Each of us chose two data sources in order to analyze the annual consumption of different energy sources internationally.

Discussion Questions & Answers:

  1. Data sources:
  1. UN Data Total Electricity Consumption

  2. EIA Electricity Consumption By State

  3. EIA Natural Gas Consumption

  4. UN Data Natural Gas Consumption

  5. Enerdata World Oil Consumption

  6. EIA International Coal Consumption

All extracted data were in CSV format.

  1. Decisions you made to do cleanup (transform) and join (transform)
  • Wrote a function to get the list of countries from the UN database; it creates a country ID for each unique country, which is used as the join key.

  • Renamed columns, since SQL column names cannot start with a digit

  • Dropped rows that contained ‘NA’ data

  • Converted strings to numeric values
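The cleanup steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the sample CSV, the `y_` column prefix, and the helper names `build_country_ids` and `clean_rows` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical sample standing in for one extracted CSV file;
# the real files' column layout is not shown in this README.
RAW_CSV = """Country,2014,2015
Canada,"1,200",1250
France,NA,900
Japan,2000,2100
"""


def build_country_ids(countries):
    """Assign a sequential ID to each unique country, in first-seen order."""
    ids = {}
    for name in countries:
        if name not in ids:
            ids[name] = len(ids) + 1
    return ids


def clean_rows(raw_text):
    """Rename year columns, drop rows containing 'NA', convert strings to floats."""
    reader = csv.DictReader(io.StringIO(raw_text))
    cleaned = []
    for row in reader:
        # Drop the whole row if any cell is the literal string 'NA'.
        if any(value.strip() == "NA" for value in row.values()):
            continue
        out = {}
        for key, value in row.items():
            if key[0].isdigit():
                # Prefix year columns so the name no longer starts with a digit,
                # and strip thousands separators before converting to a number.
                out["y_" + key] = float(value.replace(",", ""))
            else:
                out[key.lower()] = value
        cleaned.append(out)
    return cleaned


rows = clean_rows(RAW_CSV)
country_ids = build_country_ids(row["country"] for row in rows)
for row in rows:
    row["country_id"] = country_ids[row["country"]]
```

The same steps map onto a dataframe library if one is used in the notebook; the standard library is used here only to keep the sketch self-contained.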

  1. How you decided on database tech to store, and schema to store

All of our data were in CSV format, which is already tabular, so we went with a SQL database to store it.
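As an illustration of how tabular CSV data could be stored relationally, here is a small sketch. The README does not state which SQL engine or schema was used, so the in-memory SQLite database, the table names `country` and `electricity`, and the sample values below are assumptions.

```python
import sqlite3

# In-memory SQLite database (an assumption; the project's engine is not stated).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (
    country_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE electricity (
    country_id  INTEGER NOT NULL REFERENCES country(country_id),
    year        INTEGER NOT NULL,
    consumption REAL    NOT NULL,
    PRIMARY KEY (country_id, year)
);
""")

# Hypothetical rows standing in for the cleaned CSV data.
countries = [(1, "Canada"), (2, "Japan")]
electricity = [
    (1, 2014, 1200.0), (1, 2015, 1250.0),
    (2, 2014, 2000.0), (2, 2015, 2100.0),
]
conn.executemany("INSERT INTO country VALUES (?, ?)", countries)
conn.executemany("INSERT INTO electricity VALUES (?, ?, ?)", electricity)
conn.commit()

# Join on the shared country ID to get readable rows back.
joined = conn.execute("""
SELECT c.name, e.year, e.consumption
FROM electricity AS e
JOIN country AS c ON c.country_id = e.country_id
ORDER BY c.name, e.year
""").fetchall()
```

Keeping countries in their own table means each energy-source table only needs the country ID, which is what makes the join across sources straightforward.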

  1. Potential analysis to do on the newly formed dataset

Compare energy consumption by country and year; create bar charts to see the trend of increasing energy usage; compare the different energy sources used and how they affect total energy consumption. Analyze why certain countries may consume more energy than others, which will require other data sets (country population, for example).
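One of these comparisons, the contribution of each energy source to a country's total per year, might look like the sketch below. The records are made-up values, the function names are hypothetical, and the example assumes all sources have already been converted to a common unit, which the real data sets may not share.

```python
from collections import defaultdict

# Hypothetical records: (country, year, source, consumption in a common unit).
records = [
    ("Canada", 2014, "electricity", 100.0),
    ("Canada", 2014, "natural_gas", 300.0),
    ("Canada", 2015, "electricity", 120.0),
    ("Canada", 2015, "natural_gas", 280.0),
]


def totals_by_country_year(rows):
    """Sum consumption over all sources for each (country, year) pair."""
    totals = defaultdict(float)
    for country, year, _source, amount in rows:
        totals[(country, year)] += amount
    return dict(totals)


def source_shares(rows):
    """Fraction of each (country, year) total contributed by each source."""
    totals = totals_by_country_year(rows)
    return {
        (country, year, source): amount / totals[(country, year)]
        for country, year, source, amount in rows
    }


totals = totals_by_country_year(records)
shares = source_shares(records)
```

The resulting dictionaries can be fed directly into a plotting library to draw the bar charts mentioned above.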

  1. Challenges you overcame:
  • finding data that could be used for the project (covering years comparable to what everyone else was finding)

  • finding data broken down by country instead of by area/continent

  • dropping unnecessary data and renaming columns in order to load the data into the SQL database

  • converting values from string to numeric data type

  • creating tables in the SQL database from the Jupyter notebook

  • learning how to use lambda
