Toronto Overnight Shelter Data Project (R)


Project Overview

This repo contains the R project for modeling overnight service occupancy in homeless shelters in the city of Toronto. Note that this R project is used alongside the DBT project, which houses the data models for the project.

Objectives

  • Data Extraction: Extract shelter data from the Open Data Toronto API and weather forecast data from the AccuWeather API.
  • Data Loading: Load raw data into BigQuery for further transformation using DBT.
  • Machine Learning: Extract transformed data from BigQuery. Use H2O AutoML to create machine learning models that predict overnight shelter occupancy for the next 5 days based on multiple features.
  • Data Reporting: Report on predictions using R-related technologies and packages like Quarto, datatable, etc.
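
The objectives above can be sketched as a handful of R functions. This is a minimal illustration under stated assumptions, not the code in this repo: it assumes the `opendatatoronto`, `bigrquery`, and `h2o` packages are installed and that BigQuery credentials are already configured. Every function name, argument, and default below is a hypothetical placeholder.

```r
# Minimal pipeline sketch. All function names, IDs, and column names are
# hypothetical placeholders, not taken from this repository.

# 1. Extraction: download a resource of a Toronto Open Data package.
#    Takes the first listed resource; a real script would likely filter
#    the resource list by name or format before downloading.
extract_shelter_data <- function(package_id) {
  resources <- opendatatoronto::list_package_resources(package_id)
  opendatatoronto::get_resource(resources[1, ])
}

# 2. Loading: write a raw data frame to a BigQuery table, replacing old rows.
load_raw_to_bigquery <- function(df, project, dataset, table) {
  target <- bigrquery::bq_table(project, dataset, table)
  bigrquery::bq_table_upload(
    target,
    values = df,
    create_disposition = "CREATE_IF_NEEDED",
    write_disposition = "WRITE_TRUNCATE"
  )
}

# 3. Machine learning: read the transformed table and run H2O AutoML.
train_occupancy_model <- function(project, sql, target_col,
                                  max_runtime_secs = 300) {
  result <- bigrquery::bq_project_query(project, sql)
  training_df <- bigrquery::bq_table_download(result)

  h2o::h2o.init()
  training_frame <- h2o::as.h2o(training_df)
  predictors <- setdiff(names(training_df), target_col)

  automl <- h2o::h2o.automl(
    x = predictors,
    y = target_col,
    training_frame = training_frame,
    max_runtime_secs = max_runtime_secs,
    seed = 1
  )
  automl@leader  # best model on the AutoML leaderboard
}

# 4. Prediction: score a data frame of future dates (e.g. the next 5 days).
predict_occupancy <- function(model, future_df) {
  predictions <- h2o::h2o.predict(model, h2o::as.h2o(future_df))
  cbind(future_df, as.data.frame(predictions))
}
```

Keeping each stage in its own function means a scheduled script can call them in order, and each stage can be rerun on its own during development.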

Data Sources

  • Shelter Data: Toronto Open Data

  • Weather Data: NOAA data in BigQuery for historical weather, and the AccuWeather API for weather forecasts.
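
As a rough illustration of the forecast source, the sketch below builds a request for AccuWeather's 5-day daily forecast endpoint and reads the response with `jsonlite`. The endpoint path and query parameters reflect AccuWeather's public API as understood here and should be checked against its documentation. The function names, the `ACCUWEATHER_API_KEY` environment variable, and the location key argument are assumptions, not taken from this repo.

```r
# Sketch only: names and the environment variable are hypothetical.

# Build the request URL for a 5-day daily forecast at one location.
accuweather_5day_url <- function(location_key, api_key,
                                 base_url = "https://dataservice.accuweather.com") {
  sprintf(
    "%s/forecasts/v1/daily/5day/%s?apikey=%s&metric=true",
    base_url,
    utils::URLencode(as.character(location_key), reserved = TRUE),
    utils::URLencode(as.character(api_key), reserved = TRUE)
  )
}

# Fetch the forecast and return the daily entries from the JSON response.
fetch_weather_forecast <- function(location_key,
                                   api_key = Sys.getenv("ACCUWEATHER_API_KEY")) {
  response <- jsonlite::fromJSON(accuweather_5day_url(location_key, api_key))
  response$DailyForecasts
}
```

Reading the API key from an environment variable keeps the secret out of version control.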

Technology Stack

  • DBT: For data transformation and modeling.

  • Google BigQuery: As the data warehouse for storing and querying the dataset.

  • Posit: For data manipulation using multiple packages like tidyverse, data.table, DT, DBI, h2o.

  • H2O AutoML: For building machine learning models.

  • Quarto: For reporting.

Repository Structure

  • /scripts_dev: Contains development scripts for analysis.

  • /scripts_prod: Contains production scripts that run on a schedule.

  • /functions: Contains functions to modularize code.

  • /app: Contains scripts for the Shiny app.
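
One way a script in /scripts_dev or /scripts_prod could pick up the helpers in /functions is to source every file in that folder at startup. The helper below is a hypothetical sketch of that pattern, not code from this repo, and it assumes the script runs from the project root.

```r
# Hypothetical helper: source every .R file in a directory so that the
# functions it defines become available to the calling script.
source_functions <- function(dir = "functions") {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (file in files) {
    source(file, local = FALSE)  # evaluate in the global environment
  }
  invisible(files)  # return the sourced paths without printing them
}
```

A script would then start with `source_functions()` before calling any shared helper.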

Resources

Coming soon...