Online serving (near real-time)

Online inference is more challenging than batch inference because of the latency constraints it puts on our systems.

Online inference means responding to an end user's request with a prediction at low latency.

- What to optimize: latency
- End user: usually interacts with the model directly, exposed through an API
- Validation: offline, and online via A/B testing (see the sketch below)
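
To make the latency and A/B-testing points above concrete, here is a minimal Python sketch (not part of the workshop material): a deterministic, hash-based assignment of users to a control or treatment model, and a wrapper that measures per-request prediction latency. The function names, the 10% treatment share, and the scikit-learn-style `predict` interface are illustrative assumptions.

```python
# Sketch: hash-based A/B assignment plus per-request latency measurement.
# The 10% treatment share and the predict() interface are assumptions.

import hashlib
import time


def assign_variant(user_id: str, treatment_share: float = 0.10) -> str:
    """Deterministically route a user to 'control' or 'treatment'."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "treatment" if bucket < treatment_share * 100 else "control"


def predict_with_latency(model, features):
    """Run one prediction and report how long it took, in milliseconds."""
    start = time.perf_counter()
    prediction = model.predict([features])[0]  # scikit-learn-style model assumed
    latency_ms = (time.perf_counter() - start) * 1000
    return prediction, latency_ms
```

With a deterministic assignment like this, the same user always sees the same model variant, which keeps the A/B comparison clean; the measured latencies can be logged and aggregated (e.g. p95/p99) against the service's latency budget.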


Where to start

- Learn MLOps general concepts:

- Next, learn how to build and run pipelines for online serving:

  - on Azure cloud:

  - on AWS:

  - overall:

Next step: Advanced workshop: Azure Online Serving (near real-time)

This workshop is a work in progress (WIP).

It will cover a real-life use case of deploying a machine learning model to Azure Functions with the Python runtime, including troubleshooting.
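
As a preview of what such a deployment can look like, below is a minimal sketch of an HTTP-triggered prediction endpoint using the Azure Functions Python v2 programming model. The model file name (model.pkl), the pickle-based loading, and the expected request payload {"features": [...]} are assumptions for illustration; the workshop's actual use case and troubleshooting steps may differ.

```python
# function_app.py: a minimal HTTP-triggered prediction endpoint
# (Azure Functions Python v2 programming model).

import json
import pathlib
import pickle

import azure.functions as func

app = func.FunctionApp()

# Load the model once per worker process, not on every request, to keep
# per-request latency low. model.pkl is assumed to be deployed next to this file.
_model_path = pathlib.Path(__file__).parent / "model.pkl"
with open(_model_path, "rb") as f:
    model = pickle.load(f)


@app.route(route="predict", auth_level=func.AuthLevel.FUNCTION)
def predict(req: func.HttpRequest) -> func.HttpResponse:
    # Expect a JSON body like {"features": [1.0, 2.0, 3.0]}.
    try:
        features = req.get_json()["features"]
    except (ValueError, KeyError):
        return func.HttpResponse(
            "Expected a JSON body with a 'features' list.", status_code=400
        )

    prediction = model.predict([features])[0]
    # float() assumes a numeric prediction (e.g. a regression score).
    return func.HttpResponse(
        json.dumps({"prediction": float(prediction)}),
        mimetype="application/json",
    )
```

The endpoint can be run locally with the Azure Functions Core Tools (`func start`) and called with an HTTP POST to the /api/predict route before deploying it to Azure.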
