
FedML 0.8.4

@FedML-AI-admin released this 20 Jun 17:47
· 2409 commits to master since this release
23fbb84

What's Changed

New Features in 0.8.4

At FedML, our mission is to remove the friction and pain points of moving your ML & AI models from R&D into production-scale distributed and federated training & serving via our no-code MLOps platform.
We are happy to announce release 0.8.4, which is filled with new capabilities, bug fixes, and enhancements. A key announcement is the launch of FedLLM, which simplifies and reduces the cost of training and serving large language models. You can read more about it on our blog post.

New Features

  • [CoreEngine] Added local HTTP APIs to disable, enable, and query the status of the agent:

curl -X POST http://localhost:40800/fedml/api/v2/disableAgent -d '{}'
curl -X POST http://localhost:40800/fedml/api/v2/enableAgent -d '{}'
curl -X POST http://localhost:40800/fedml/api/v2/queryAgentStatus -d '{}'
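The same agent API calls above can be made from Python. A minimal sketch using only the standard library (the helper names are illustrative, not part of FedML's SDK):

```python
import json
import urllib.request

BASE_URL = "http://localhost:40800/fedml/api/v2"

def agent_request(action: str) -> urllib.request.Request:
    # Build a POST request with an empty JSON body, mirroring the curl calls above.
    return urllib.request.Request(
        f"{BASE_URL}/{action}",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def query_agent_status() -> dict:
    # Requires a running local FedML agent on port 40800.
    with urllib.request.urlopen(agent_request("queryAgentStatus")) as resp:
        return json.loads(resp.read())
```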

Bug Fixes

  • [CoreEngine] Created distinct device ids when running multiple Docker containers to simulate multiple clients or silos on one machine. The device id is now the product id plus a random id.

  • [CoreEngine] Fixed a device assignment issue in get_torch_device in distributed training mode.

  • [Serving] Fixed the exceptions that occurred when recovering at startup after upgrading.

  • [CoreEngine] Fixed the device id issue when running in Docker on macOS.

  • [App] Fixed an issue in the FedProx + SAGE graph regression and graph classification apps.

  • [App] Fixed an issue with the heart disease app failing when running in MLOps.

  • [App] Fixed an issue with the heart disease app’s performance curve.

  • [App/Android] Enhanced Android starting/stopping mechanism and fixed the following issues:

    ◦ Fixed the status display after stopping a run.
    ◦ When a run is stopped during an unfinished round, the MNN process now remains in the IDLE state (previously it went OFFLINE).
    ◦ When a run is stopped after a round completes, training now stops as expected.
    ◦ Corrected the Python server TAG in the logs, so the server is now easy to find in log output.
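The [CoreEngine] device-id fix above composes the device id from the product id plus a random id. A minimal sketch of that scheme (the function name and separator are illustrative assumptions, not FedML's actual implementation):

```python
import secrets

def make_device_id(product_id: str) -> str:
    # Illustrative sketch: append a random hex suffix so multiple Docker
    # containers on one host each get a distinct device id.
    return f"{product_id}-{secrets.token_hex(4)}"
```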

Enhancements

  • [Serving] Added a post-deployment check that tests the inference backend and verifies its response once model deployment finishes.

  • [CoreEngine/Serving] Set the GPU option based on CUDA availability when running the inference backend, and optimized the MQTT connection check.

  • [CoreEngine] Stored model caches in the user’s home directory when running federated learning.

  • [CoreEngine] Added the device id to the monitor message when processing inference requests.

  • [CoreEngine] Reported runner exceptions, and ignored the exception raised when the bootstrap section is missing from fedml_config.yaml.
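The model-cache enhancement above moves caches under the user’s home directory. A hypothetical sketch of such a path helper (the subdirectory layout is an assumption, not FedML’s actual structure):

```python
import os

def model_cache_dir(run_id: str) -> str:
    # Illustrative sketch: resolve a per-run cache path under the user's
    # home directory, so caches survive container restarts.
    return os.path.join(os.path.expanduser("~"), ".fedml", "model_cache", run_id)
```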