Releases: SeldonIO/MLServer

1.6.1

10 Sep 16:06

Overview

Features

MLServer now offers an option to use a pre-existing Python environment by specifying a path to the environment to be used - by @idlefella in #1891
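
A minimal sketch of what this could look like in model-settings.json; the environment_path field name under parameters is an assumption based on the MLServer docs, so check #1891 for the definitive setting:

```json
{
  "name": "my-model",
  "implementation": "models.MyCustomRuntime",
  "parameters": {
    "environment_path": "/opt/envs/my-existing-env"
  }
}
```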

Releases

MLServer released a CatBoost runtime, which allows serving CatBoost models with MLServer - by @sakoush in #1839
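
As a sketch, serving a saved CatBoost model could look like the following; the mlserver_catboost.CatboostModel implementation name and the model.cbm URI are assumptions following the conventions of the other runtimes (see #1839 for the definitive details):

```json
{
  "name": "my-catboost-model",
  "implementation": "mlserver_catboost.CatboostModel",
  "parameters": {
    "uri": "./model.cbm"
  }
}
```

With this file in place, installing the mlserver-catboost package and running mlserver start . should serve the model over the Open Inference Protocol.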

Fixes

What's Changed

New Contributors

Full Changelog: 1.6.0...1.6.1

1.6.0

26 Jun 14:07

Overview

Upgrades

MLServer supports Pydantic V2.

Features

MLServer supports streaming data to and from your models.

Streaming support is available for both the REST and gRPC servers:

  • For the REST server, support is limited to server streaming: the client sends a single request to the server, and the server responds with a stream of data.
  • For the gRPC server, support covers both client and server streaming: the client sends a stream of data to the server, and the server responds with a stream of data.

See our docs and example for more details.
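
As a rough sketch of the server-streaming flow, a custom runtime can override predict_stream to consume an async iterator of requests and yield a stream of responses. The signature below follows the streaming docs; the word-splitting logic is purely illustrative:

```python
from typing import AsyncIterator

from mlserver import MLModel
from mlserver.codecs import StringCodec
from mlserver.types import InferenceRequest, InferenceResponse


class StreamingEcho(MLModel):
    async def predict_stream(
        self, payloads: AsyncIterator[InferenceRequest]
    ) -> AsyncIterator[InferenceResponse]:
        async for payload in payloads:
            # Decode the incoming text, then stream one response per word.
            [text] = StringCodec.decode_input(payload.inputs[0])
            for word in text.split():
                yield InferenceResponse(
                    model_name=self.name,
                    outputs=[StringCodec.encode_output("output", [word])],
                )
```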

What's Changed

New Contributors

Full Changelog: 1.5.0...1.6.0

1.5.0

05 Mar 14:40

What's Changed

  • Update CHANGELOG by @github-actions in #1592
  • build: Migrate away from Node v16 actions by @jesse-c in #1596
  • build: Bump version and improve release doc by @jesse-c in #1602
  • build: Upgrade stale packages (fastapi, starlette, tensorflow, torch) by @sakoush in #1603
  • fix(ci): tests and security workflow fixes by @sakoush in #1608
  • Re-generate License Info by @github-actions in #1612
  • fix(ci): Missing quote in CI test for all_runtimes by @sakoush in #1617
  • build(docker): Bump dependencies by @jesse-c in #1618
  • docs: List supported Python versions by @jesse-c in #1591
  • fix(ci): Have separate smaller tasks for release by @sakoush in #1619

Notes

  • We have removed support for Python 3.8; see #1603 for more info. Docker images for MLServer already use Python 3.10.

Full Changelog: 1.4.0...1.5.0

1.4.0

28 Feb 15:39

What's Changed

New Contributors

Full Changelog: 1.3.5...1.4.0

1.3.5

10 Jul 10:28

What's Changed

New Contributors

Full Changelog: 1.3.4...1.3.5

1.3.4

21 Jun 16:21

What's Changed

New Contributors

Full Changelog: 1.3.3...1.3.4

1.3.3

05 Jun 10:05

What's Changed

New Contributors

Full Changelog: 1.3.2...1.3.3

1.3.2

10 May 13:47

What's Changed

New Contributors

Full Changelog: 1.3.1...1.3.2

1.3.1

27 Apr 10:54

What's Changed

  • Move OpenAPI schemas into Python package (#1095)

1.3.0

27 Apr 08:00

WARNING ⚠️: Version 1.3.0 has been yanked from PyPI due to a packaging issue. This is now resolved in versions >= 1.3.1.

What's Changed

Custom Model Environments

More often than not, your custom runtimes will depend on external third-party dependencies which are not included in the main MLServer package - or on different versions of the same package (e.g. scikit-learn==1.1.0 vs scikit-learn==1.2.0). In these cases, to load your custom runtime, MLServer will need access to these dependencies.

In MLServer 1.3.0, it is now possible to load this custom set of dependencies by providing them through an environment tarball, whose path can be specified in your model-settings.json file. This custom environment will get provisioned on the fly after loading a model - alongside the default environment and any other custom environments.
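
A minimal sketch of what this looks like in model-settings.json, assuming an environment packed into a tarball (e.g. with conda-pack); the environment_tarball field is the one described in the MLServer docs:

```json
{
  "name": "my-model",
  "implementation": "models.MyCustomRuntime",
  "parameters": {
    "environment_tarball": "./environment.tar.gz"
  }
}
```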

Under the hood, each of these environments will run their own separate pool of workers.


Custom Metrics

The MLServer framework now includes a simple interface that allows you to register and keep track of any custom metrics:

  • [mlserver.register()](https://mlserver.readthedocs.io/en/latest/reference/api/metrics.html#mlserver.register): Register a new metric.
  • [mlserver.log()](https://mlserver.readthedocs.io/en/latest/reference/api/metrics.html#mlserver.log): Log a new set of metric / value pairs.

Custom metrics will generally be registered in the [load()](https://mlserver.readthedocs.io/en/latest/reference/api/model.html#mlserver.MLModel.load) method and then used in the [predict()](https://mlserver.readthedocs.io/en/latest/reference/api/model.html#mlserver.MLModel.predict) method of your custom runtime. These metrics can then be polled and queried via Prometheus.
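
A minimal sketch of both calls in a custom runtime, based on the documented interface (the metric name and value are made up for illustration):

```python
import mlserver
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Register the custom metric once, at load time.
        mlserver.register("my_custom_metric", "This is a custom metric example")
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Log a new value for the metric on every inference call.
        mlserver.log(my_custom_metric=34)
        return InferenceResponse(model_name=self.name, outputs=[])
```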


OpenAPI

MLServer 1.3.0 now includes an autogenerated Swagger UI which can be used to interact dynamically with the Open Inference Protocol.

The autogenerated Swagger UI can be accessed under the /v2/docs endpoint.

![Swagger UI](https://mlserver.readthedocs.io/en/latest/_images/swagger-ui.png)

Alongside the general API documentation, MLServer now also exposes a set of API docs tailored to individual models, showing the specific endpoints available for each one.

The model-specific autogenerated Swagger UI can be accessed under the following endpoints:

  • /v2/models/{model_name}/docs
  • /v2/models/{model_name}/versions/{model_version}/docs
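
For example, with a local instance listening on MLServer's default HTTP port (8080) and a model named my-model, the two UIs would live at http://localhost:8080/v2/docs and http://localhost:8080/v2/models/my-model/docs.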

HuggingFace Improvements

MLServer now includes improved codec support for all the main types that can be returned by HuggingFace models - ensuring that the values returned via the Open Inference Protocol are more semantic and meaningful.

Massive thanks to @pepesi for taking the lead on improving the HuggingFace runtime!

Support for Custom Model Repositories

Internally, MLServer leverages a Model Repository implementation which is used to discover the different models (and their versions) available to load. The latest version of MLServer will now allow you to swap this for your own model repository implementation - letting you integrate against your own model repository workflows.

This is exposed via the model_repository_implementation flag of your settings.json configuration file.
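
A minimal sketch of the flag in settings.json, assuming a custom class importable from your Python path:

```json
{
  "model_repository_implementation": "my_repository.CustomModelRepository"
}
```

Here, my_repository.CustomModelRepository would subclass MLServer's model repository base class and override its model-discovery methods (e.g. listing and finding model settings) to pull from your own storage or registry; the exact base-class module path and method set are best checked against the mlserver source.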

Thanks to @jgallardorama (aka @jgallardorama-itx) for his effort contributing this feature!

Batch and Worker Queue Metrics

MLServer 1.3.0 introduces a new set of metrics to increase visibility around two of its internal queues: the adaptive batching queue and the parallel inference worker queue.

Many thanks to @alvarorsant for taking the time to implement this highly requested feature!

Image Size Optimisations

The latest version of MLServer includes a few optimisations around image size, which reduce the size of the official set of images by around 60% - making them more convenient to use and integrate within your workloads. In the case of the full seldonio/mlserver:1.3.0 image (including all runtimes and dependencies), this means going from 10GB down to ~3GB.

Python API Documentation

Alongside its built-in inference runtimes, MLServer also exposes a Python framework that you can use to extend MLServer and write your own codecs and inference runtimes. The MLServer official docs now include a reference page documenting the main components of this framework in more detail.
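
As a flavour of that framework, a bare-bones custom runtime subclasses MLModel and implements load() and predict(); the toy doubling "model" below is purely illustrative:

```python
from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load weights / artefacts here; a toy stand-in is used instead.
        self._model = lambda x: x * 2
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # self.decode() applies a sensible default codec to the input.
        input_data = self.decode(payload.inputs[0])
        result = self._model(input_data)
        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output("output", result)],
        )
```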

New Contributors