miksut/data_science

Data Science Concepts

This repository hosts implementations of prominent Data Science concepts. The implementations come in the form of Jupyter Notebooks and can be found in the folder src/. The folder contains the following Notebooks:

  • artificial_neural_network.ipynb: Composing a neural network that performs the MNIST classification task. The workflow begins by finding a suitable network architecture and the associated hyperparameters. Then, an exhaustive hyperparameter search is performed to find the optimal configuration. The optimally configured network is subsequently fine-tuned in an attempt to increase its robustness. Finally, the network is evaluated on the test dataset. The implementation relies on the deep learning framework Keras and the machine learning platform TensorFlow.

  • generative_vs_discriminative_models.ipynb: Comparison of discriminative and generative learning as typified by logistic regression and naive Bayes. The comparison is based on the paper by Andrew Ng and Michael Jordan.

  • regression.ipynb: Implementation of a linear regression model using the least squares method. The notebook also covers regularization, polynomial basis expansion, and cross-validation.

  • transfer_learning.ipynb: Extending a convolutional neural network that has been pretrained on ImageNet with a collection of additional layers (i.e., the custom model) to solve the classification task on the CIFAR-10 dataset. The workflow includes data augmentation, training the custom model on the dataset until convergence, fine-tuning the overall model, and evaluating it on the test dataset. The implementation relies on the deep learning framework Keras and the machine learning platform TensorFlow.
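To give a feel for the concepts named in regression.ipynb, here is a minimal sketch of least squares with polynomial basis expansion, L2 regularization, and k-fold cross-validation. It is not taken from the notebook: the function names (polynomial_features, fit_ridge, cross_val_mse), the synthetic data, and the choice of NumPy are all assumptions made for illustration.

```python
import numpy as np


def polynomial_features(x, degree):
    """Expand a 1-D input into the columns [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)


def fit_ridge(X, y, lam=0.0):
    """Solve (X^T X + lam * I) w = X^T y; lam = 0 gives ordinary least squares."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept term unregularized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def cross_val_mse(x, y, degree, lam, k=5, seed=0):
    """Mean validation MSE over k shuffled folds (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_ridge(polynomial_features(x[train], degree), y[train], lam)
        pred = polynomial_features(x[val], degree) @ w
        errors.append(np.mean((pred - y[val]) ** 2))
    return float(np.mean(errors))


# Synthetic data (an assumption for this sketch): a noisy quadratic.
rng = np.random.default_rng(42)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=x.shape)

# Ordinary least squares fit with a degree-2 basis.
w = fit_ridge(polynomial_features(x, 2), y)

# Compare candidate polynomial degrees by cross-validated error.
scores = {d: cross_val_mse(x, y, d, lam=1e-3) for d in (1, 2, 5)}
best_degree = min(scores, key=scores.get)
```

In this sketch, the degree with the lowest cross-validated error is selected. The same loop could be used to compare values of lam instead of degrees.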

The original code is the result of a collaboration with two fellow UZH students (julwil and cdeiac) and is linked to a lecture offered by the Data Systems and Theory Group in the Department of Informatics at the University of Zurich, Switzerland. This repository contains a slightly revised version of the original code.