All course assignments, exams and projects done during the year of study at Data ScienceTech Institute (DSTI).

mrsebai/DSTI-coursework

DSTI-coursework

This repo contains all course assignments, exams, and projects produced during the year of study at Data ScienceTech Institute.

C-Programming Unit

We were assigned a fun little project: coding a small formal computation engine based on Reverse Polish Notation. The goal is to read a mathematical expression in infix notation from the user and convert it to postfix notation (also known as Reverse Polish Notation). The conversion from infix to postfix notation is implemented using the well-known shunting-yard algorithm.

Our implementation is written in plain C, uses a linked-list data structure, and has the following features:

  • support for the power operator ^ alongside +, -, *, / and parentheses
  • support for multi-letter variables, which may appear more than once in an expression
  • full support for the unary minus sign, without ambiguity with the binary subtraction operator

Some examples:

  • input in infix notation: 3 + 4 × 2 ÷ ( 1 − 5 ) ^ 2 ^ 3
  • output of the same expression in Reverse Polish Notation (postfix notation): 3 4 2 × 1 5 − 2 3 ^ ^ ÷ +
  • input with repeated symbolic variable names: -a +2 *a +5*(alpha - beta +b *c +a)/alpha -(-beta)^c
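
To illustrate the infix-to-postfix conversion, here is a minimal sketch of the shunting-yard algorithm in C. It is not the repository's implementation: it uses fixed-size arrays instead of a linked list, accepts ASCII `*` and `/` rather than × and ÷, and does not handle the unary minus sign. The function name `infix_to_postfix` and the helper names are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Precedence of a binary operator; 0 means "not an operator". */
static int precedence(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    case '^':           return 3;
    default:            return 0;
    }
}

/* In this sketch only '^' is right-associative. */
static int is_right_assoc(char op) { return op == '^'; }

/* Append one character to out; returns 0 if the buffer is full. */
static int put(char *out, size_t cap, size_t *len, char c) {
    if (*len + 1 >= cap) return 0;
    out[(*len)++] = c;
    out[*len] = '\0';
    return 1;
}

/* Append an operator token, preceded by a space if out is not empty. */
static int put_op(char *out, size_t cap, size_t *len, char op) {
    if (*len > 0 && !put(out, cap, len, ' ')) return 0;
    return put(out, cap, len, op);
}

/*
 * Convert an infix expression to space-separated postfix.
 * Operands are runs of letters and digits. Unary minus is NOT handled.
 * Returns 0 on success, -1 on error (bad character, unbalanced
 * parentheses, or a buffer that is too small).
 */
int infix_to_postfix(const char *in, char *out, size_t cap) {
    char stack[256];            /* operator stack */
    size_t top = 0, len = 0;

    if (cap == 0) return -1;
    out[0] = '\0';

    for (size_t i = 0; in[i] != '\0'; ) {
        char c = in[i];
        if (isspace((unsigned char)c)) { i++; continue; }

        if (isalnum((unsigned char)c)) {
            /* Operand: copy the whole run straight to the output. */
            if (len > 0 && !put(out, cap, &len, ' ')) return -1;
            while (isalnum((unsigned char)in[i])) {
                if (!put(out, cap, &len, in[i])) return -1;
                i++;
            }
            continue;
        }

        if (c == '(') {
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else if (c == ')') {
            /* Pop operators until the matching '(' is found. */
            while (top > 0 && stack[top - 1] != '(') {
                if (!put_op(out, cap, &len, stack[top - 1])) return -1;
                top--;
            }
            if (top == 0) return -1;   /* unmatched ')' */
            top--;                     /* discard the '(' */
        } else if (precedence(c) > 0) {
            /* Pop operators that bind tighter, or equally tight when
               the incoming operator is left-associative. */
            while (top > 0 && stack[top - 1] != '(' &&
                   (precedence(stack[top - 1]) > precedence(c) ||
                    (precedence(stack[top - 1]) == precedence(c) &&
                     !is_right_assoc(c)))) {
                if (!put_op(out, cap, &len, stack[top - 1])) return -1;
                top--;
            }
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else {
            return -1;                 /* unknown character */
        }
        i++;
    }

    /* Flush the remaining operators. */
    while (top > 0) {
        if (stack[top - 1] == '(') return -1;   /* unmatched '(' */
        if (!put_op(out, cap, &len, stack[top - 1])) return -1;
        top--;
    }
    return 0;
}
```

Calling `infix_to_postfix("3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3", buf, sizeof buf)` with a sufficiently large `buf` fills it with `3 4 2 * 1 5 - 2 3 ^ ^ / +`, matching the first example above. Treating `^` as right-associative is what keeps `2 ^ 3` grouped together before the outer power is applied.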

Implementation:

SQL Unit

Using the WideWorldImporters sample database for SQL Server, the 5-hour exam consists of 5 queries whose result tables must match the provided ones exactly.

Foundation and Advanced Statistics for Machine Learning Unit

Two intensive learning units focusing on classical statistical modeling methods. The foundation part concentrates on discrete and continuous probability distributions, density functions, univariate linear regression and hypothesis testing.

The advanced part covers multivariate linear models, analysis of variance (ANOVA), model regularization (Lasso and Ridge), model selection, decision trees, bagging and boosting, random forests, and dimensionality reduction using principal component analysis (PCA).

Survival Analysis Unit

The report, written in R, analyzes two birth-spacing datasets in the context of Event History Analysis, a field that borrows many techniques from clinical survival analysis to study sociological and historical phenomena. Both datasets are provided by the Medical Birth Registry of Norway. The first dataset describes the spacing between first and second births. The second dataset, in the same spirit, describes the spacing between second and third births. The two datasets do not share exactly the same covariates. We try to identify which factors influence the observed birth spacing.

In the first section, we import and clean the two datasets and introduce their covariates. In the second section, we perform an exploratory analysis using the Kaplan–Meier estimator, which guides the rest of the report. In the third section, the log-rank test for groups helps us identify the first influential categorical factors. In the fourth section, Cox proportional hazards modeling helps us select the significant covariates and quantify their effect sizes. The conclusion lists our findings.
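
For readers unfamiliar with the Kaplan–Meier estimator mentioned above, the following is a minimal sketch of the product-limit computation in C. The report itself is written in R, so this is only an illustration and not the report's code. The function name `kaplan_meier` and the array layout are assumptions made for this example, and the input is assumed to be sorted by time.

```c
#include <stddef.h>

/*
 * Kaplan-Meier product-limit estimate of the survival function.
 *
 * time[]  : observed durations, sorted in ascending order
 * event[] : 1 if the event was observed at that time, 0 if censored
 * n       : number of subjects
 *
 * For every distinct time with at least one observed event, the
 * function writes the time to out_time[] and the survival estimate
 * S(t) to out_surv[]. Both output arrays need room for n entries.
 * Returns the number of steps written.
 */
size_t kaplan_meier(const double *time, const int *event, size_t n,
                    double *out_time, double *out_surv) {
    size_t steps = 0;
    double surv = 1.0;
    size_t i = 0;

    while (i < n) {
        size_t at_risk = n - i;   /* subjects still under observation */
        size_t events = 0;
        size_t j = i;

        /* Group all subjects that share the same time. */
        while (j < n && time[j] == time[i]) {
            if (event[j]) events++;
            j++;
        }

        /* S(t) only drops at times where an event was observed. */
        if (events > 0) {
            surv *= 1.0 - (double)events / (double)at_risk;
            out_time[steps] = time[i];
            out_surv[steps] = surv;
            steps++;
        }
        i = j;
    }
    return steps;
}
```

With the made-up durations {1, 2, 2, 3, 4} and event flags {1, 1, 0, 1, 0}, the estimate drops to about 0.8 at t = 1, 0.6 at t = 2 and 0.3 at t = 3. Censored subjects never cause a drop, but they do leave the risk set, which is why the last step is larger than the first.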

Metaheuristic Optimization Unit

A learning module focusing on metaheuristic optimization algorithms. The algorithms studied include simulated annealing, genetic algorithms, ant colony optimization and particle swarm optimization, among others. In the report, given the Paris Metro map, we solved a variation of the Travelling Salesman Problem by expressing it as a Hamiltonian path problem. The slides present our approach, the R packages used and the results obtained.

Semantic Web Unit

The Semantic Web learning unit covers the data formats and technologies that enable the Semantic Web, such as RDF, Notation3 (N3) and Turtle, how to query such data using SPARQL, and how to extend the vocabulary using OWL ontologies.

Agent-Based Modeling Unit

A fun project for learning multi-agent environment modeling using the NetLogo language.

Intro To Deep Learning Unit

Foundations of modern deep learning. In the module assignment, a pretrained Mask R-CNN model is used in inference mode. This project set the stage for a more ambitious deep dive into deep learning modeling: a semantic segmentation project using TensorFlow 2.x during my internship at Airbus.
