This is a GitHub repository for the Stat 154 Final Project involving the US Accidents Dataset.

ritvik-iyer/accidents

US Traffic Accidents

This is the final project for UC Berkeley's Stat 154: Modern Statistical Prediction and Machine Learning.

You can access our project write-up detailing our data exploration and modeling process here. The models produced in this project have been submitted to the class Kaggle competition hosted here.

Project Status: Completed

Project Intro/Objective

The purpose of this project is to predict the severity of traffic accidents in the United States, using real-time traffic, location, and weather data from around 3 million accidents spanning 49 states. Our objective was to design a binary classifier for severe accident detection.
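As a rough illustration of the binary framing, the sketch below collapses an ordinal severity level into a severe / not-severe label. The threshold value and the function name are assumptions made for this example; this README does not state how "severe" was defined or how the project encoded its target.

```python
# Hypothetical cutoff: levels at or above this value count as "severe".
# The README does not specify the actual definition used in the project.
SEVERE_THRESHOLD = 3


def to_binary_severity(levels, threshold=SEVERE_THRESHOLD):
    """Map ordinal severity levels to 1 (severe) or 0 (not severe)."""
    return [1 if level >= threshold else 0 for level in levels]


# Toy input, not real data from the accidents dataset.
example_levels = [1, 2, 3, 4, 2]
binary_labels = to_binary_severity(example_levels)
print(binary_labels)
```

Any classifier that outputs two classes can then be trained against labels of this form.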

Methods Used

  • Data Visualization
  • Feature Engineering
  • Machine Learning
  • Sampling Methods
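The list above does not say which sampling methods were used. One common option for an imbalanced binary target is random undersampling of the majority class, sketched here with only the Python standard library. The function name, the toy rows, and the fixed seed are all assumptions for illustration, not the project's code.

```python
import random


def undersample_majority(rows, labels, seed=0):
    """Randomly drop rows from the larger class until both classes have equal size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    # Sample n indices from each class, then restore the original row order.
    keep = sorted(rng.sample(pos, n) + rng.sample(neg, n))
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy example: 2 severe rows and 8 non-severe rows (made-up data).
rows = [{"id": i} for i in range(10)]
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
balanced_rows, balanced_labels = undersample_majority(rows, labels)
print(sum(balanced_labels), len(balanced_labels))
```

Undersampling discards data, so with a rare positive class it is usually applied to the training split only, leaving the validation and test sets at their original class balance.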

Technologies and Packages

  • R
    • Packages: tidyverse, dplyr, ggplot2, caret, glmnet, lubridate, e1071, MASS
  • Python
    • Packages: pandas, numpy, scikit-learn, seaborn, matplotlib, pytorch
  • Jupyter Notebook

Getting Started

  1. Clone this repo (for help see this tutorial).
  2. The datasets used and created during this project can be accessed here. Place the training, validation, and test sets in the top-level ~/accidents directory.
  3. Data processing and transformation scripts are kept here.
  4. The models are kept here. To run a model, navigate to the ~/accidents/models directory and run the model of choice.

Contributing Members

  • Devan Jaganath
  • Ritvik Iyer
  • Ryan Chien
