YouTube Trending - MLOps

What is this repo about ❔

This repo shows how to deploy and manage machine learning models in production.

Steps covered:

  1. Define our problem and perform EDA
  2. Develop an ETL pipeline
  3. Train a model
  4. Deploy the model to cloud
  5. Develop and deploy a retraining pipeline
  6. Monitor the model performance

The focus is on the tools and ML best practices, in particular dockerizing the two key pipelines (retraining and inference) and deploying them to AWS. The problem itself - predicting YouTube views from just the channel name and video category - is rather trivial and would usually be more complex in the real world. However, the methods of managing the ML lifecycle are very relevant and can be used to deploy real-world projects.

Inference endpoint available at: mlprojectsbyjen.com

-----------------------------------------------------

📖 Table of contents

  1. ➤ Inference pipeline
  2. ➤ Retraining pipeline
  3. ➤ Repo structure

-----------------------------------------------------

📝 Inference pipeline

Capabilities

The inference pipeline consists of two components: a web endpoint and a prediction API. The web endpoint is responsible for the user interface. The prediction API is responsible for accepting requests from the web endpoint and responding with the predictions made by the ML model. The components are separated using Elastic Load Balancers (ELB). Each component is wrapped in a Docker container, deployed using Elastic Container Service (ECS), and placed in an Auto Scaling Group (ASG), allowing for quick scalability. All the services are spread across 3 Availability Zones (AZs), ensuring high availability.
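
To make the prediction API side concrete, here is a minimal sketch of what the App tier service could look like with FastAPI. The endpoint path, feature names, and model file are assumptions for illustration, not the repo's actual code:

```python
# prediction_api.py - illustrative sketch of the App tier service
import pickle

import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Hypothetical: the trained model is loaded once at container startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)


class PredictionRequest(BaseModel):
    channel_name: str
    video_category: str


@app.post("/predict")
def predict(request: PredictionRequest):
    # Build a single-row frame matching the (assumed) training feature names
    features = pd.DataFrame(
        [{"channel_name": request.channel_name, "video_category": request.video_category}]
    )
    predicted_views = model.predict(features)[0]
    return {"predicted_views": float(predicted_views)}
```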

AWS infrastructure

The architecture follows a simple 2-tier design. Traffic flows from users to the external Application Load Balancer (ALB), which distributes it across Elastic Container Service (ECS) Tasks. When the user presses Predict on the web app, a request is sent to the internal ALB. The App tier Tasks compute the ML prediction and return it to the Web tier, where the results are displayed back to the user.
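
And the Web tier side of that flow, sketched as a Streamlit page that forwards the form inputs to the internal ALB. The environment variable name, endpoint path, and category list are placeholders for illustration:

```python
# web_app.py - illustrative sketch of the Web tier Streamlit page
import os

import requests
import streamlit as st

# Hypothetical env var holding the internal ALB's DNS name
INTERNAL_ALB_URL = os.environ.get("INTERNAL_ALB_URL", "http://internal-alb.local")

st.title("YouTube Trending - view prediction")
channel_name = st.text_input("Channel name")
video_category = st.selectbox("Video category", ["Music", "Gaming", "Education"])

if st.button("Predict"):
    # The request stays inside the VPC: the internal ALB routes it to an App tier Task
    response = requests.post(
        f"{INTERNAL_ALB_URL}/predict",
        json={"channel_name": channel_name, "video_category": video_category},
        timeout=10,
    )
    st.metric("Predicted views", f"{response.json()['predicted_views']:,.0f}")
```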



* Why is the App tier public? Because NAT Gateways are expensive for a small project such as this one - around $40 per month per AZ. There are no security concerns, so making the App tier public seems most reasonable.

** In reality there are 3 AZs configured

*** Depending on when you are reading this, the endpoint mlprojectsbyjen.com might actually use a monolith deployment instead of a 2-tier architecture. It doesn't scale as well, but it allows fewer Tasks to be running, which cuts costs.

Tools

The app itself uses standard ML Python libraries: Pandas, scikit-learn, XGBoost, FastAPI and Streamlit. Neptune AI is used for experiment tracking and as a model registry.
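
As an illustration of the model-registry side, a sketch of pulling a registered model binary from Neptune at startup. The model version ID, project name, and field path below are placeholders, not the repo's actual values:

```python
# fetch_model.py - sketch of downloading a registered model from Neptune
import neptune

# Hypothetical model version ID; in practice this would come from config or SSM
model_version = neptune.init_model_version(
    with_id="YTMLOPS-MOD-1",          # placeholder model version ID
    project="jen/youtube-trending",   # placeholder workspace/project name
)
# Assumes the serialized model was logged under the "model/binary" field
model_version["model/binary"].download("model.pkl")
model_version.stop()
```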

AWS Service choices:

  • Compute - ECS for ease of deployment
  • Storage - S3 for scalability and AWS integrations
  • Feature Store - DynamoDB for quick read access (see the read sketch after this list)
  • Scaling and High Availability - ALB and ASG as they are the recommended standard in AWS
  • Access and security - IAM Roles for AWS access and SSM Parameter Store for distributing keys for external services such as Neptune AI
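
A short sketch of how the feature store and parameter store choices look in code with boto3. The table name, key schema, and parameter name are made up for illustration:

```python
# aws_access.py - sketch of DynamoDB feature reads and SSM key retrieval
import boto3

dynamodb = boto3.resource("dynamodb")
ssm = boto3.client("ssm")


def get_channel_features(channel_name: str) -> dict:
    """Fetch precomputed features for a channel (hypothetical table and key names)."""
    table = dynamodb.Table("youtube-channel-features")
    response = table.get_item(Key={"channel_name": channel_name})
    return response.get("Item", {})


def get_neptune_api_token() -> str:
    """Read the Neptune AI API token from SSM Parameter Store (hypothetical name)."""
    response = ssm.get_parameter(
        Name="/youtube-mlops/neptune-api-token", WithDecryption=True
    )
    return response["Parameter"]["Value"]
```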

-----------------------------------------------------

📝 Retraining pipeline

Capabilities

In progress...

AWS infrastructure

In progress...

Tools

In progress...

-----------------------------------------------------

📝 Repo structure

In progress...
