synthanic_competition

This is the solution for the Synthanic competition on Kaggle.

The goal is to perform EDA and build a model for a binary classification task on a synthetic dataset based on the real Titanic dataset. Its statistical properties are similar to those of the original (and well-known) Titanic dataset.

Accuracy score on test set:

The solution notebook contains:

Data quality assessment and missing data imputation.
Thorough data exploration with many plots, observations, a summary, and feature engineering.
A modeling block where I compared three algorithms (Logistic Regression, KNN, and Random Forest) and tuned the models with RandomizedSearchCV, cross-validation, feature selection, data scaling, and encoding.
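
The modeling step listed above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the toy DataFrame, the column subset (`Age`, `Fare`, `Sex`, `Embarked`), the imputation strategies, and the hyperparameter ranges are all assumptions chosen for the example, and feature selection is left out for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in data with Titanic-style column names (NOT the competition data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Age": rng.normal(35, 12, n),
    "Fare": rng.exponential(30, n),
    "Sex": rng.choice(["male", "female"], n),
    "Embarked": rng.choice(["S", "C", "Q"], n),
})
X.loc[rng.choice(n, 20, replace=False), "Age"] = np.nan  # inject missing ages
# Hypothetical target: mostly determined by Sex, with about 20% label noise.
y = ((X["Sex"] == "female") ^ (rng.random(n) < 0.2)).astype(int)

# Imputation + scaling for numeric columns, imputation + one-hot for categorical ones.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", categorical, ["Sex", "Embarked"]),
])

# The three algorithms named in the README, each with an assumed search space.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000),
               {"model__C": [0.01, 0.1, 1, 10]}),
    "knn": (KNeighborsClassifier(),
            {"model__n_neighbors": [3, 5, 9, 15]}),
    "forest": (RandomForestClassifier(random_state=0),
               {"model__n_estimators": [50, 100],
                "model__max_depth": [3, 5, None]}),
}

results = {}
for name, (model, params) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    # Randomized search with 5-fold cross-validation, scored by accuracy.
    search = RandomizedSearchCV(pipe, params, n_iter=4, cv=5,
                                scoring="accuracy", random_state=0)
    search.fit(X, y)
    results[name] = search.best_score_

for name, score in results.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Wrapping preprocessing and the model in one `Pipeline` means the imputer, scaler, and encoder are refit on each training fold, so the cross-validated accuracy is not inflated by statistics leaking from the validation fold.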

MeSugar/synth_competition
