
v0.8.0

@NOBLES5E released this 26 Sep 13:21
· 361 commits to master since this release

[0.8.0] - 2021-09-26

Bug Fixes

CI

  • Only run publish once on git tag

Core

  • Fix compressed buffer failing to be scattered to an odd number of ranks
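The scatter fix above concerns splitting a contiguous buffer across a rank count that does not divide the buffer length evenly. A minimal sketch of the usual remedy, in plain Python (pad to a multiple of the rank count, then split into equal chunks; the function name is illustrative, not Bagua's actual API):

```python
def pad_and_split(buf, num_ranks):
    """Pad `buf` with zeros so its length is a multiple of `num_ranks`,
    then split it into equal per-rank chunks."""
    chunk = -(-len(buf) // num_ranks)  # ceiling division
    padded = buf + [0] * (chunk * num_ranks - len(buf))
    return [padded[i * chunk:(i + 1) * chunk] for i in range(num_ranks)]

# A 10-element buffer scattered to 3 (odd) ranks: each rank gets 4 elements,
# with the last chunk zero-padded.
chunks = pad_and_split(list(range(10)), 3)
```

Related: the "Always mark bagua padding tensor as ready" entry below reflects the same padding idea applied to tensors.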

Other

  • Fix CI PyPI versioning
  • Remove `__init__.py` and Python version, use Cargo version
  • Move `bagua_install_library` import into the install-library function
  • Merge `bagua_install_library` and `setup.py`, remove NCCL <= 2.6 support
  • Fix alltoall_v parameter (#17)
  • Fix reduce and allgather Python interface
  • Fix incorrect decompress pointer and a typo in the error message
  • Fix Python GIL deadlock when getting the data pointer
  • Fix benchmark script requirements
  • Fix alltoall_v parameter types (#27)
  • Always mark bagua padding tensor as ready
  • Make compress/decompress of BaguaTensor method string consistent (#33)
  • Fix scatter and reduce_scatter implementation (#40)
  • Fix subtraction overflow error for decentralized op (#39)
  • Fix QADAM params (#17)
  • Fix assert precision (#18)
  • Replace mutex with atomic bool for async op and add Aluminum submodule update (#67)
  • Fix duplicated dependency downloading during installation (#77)
  • Fix async algorithm aborting and hanging (#78, #81)
  • Fix qadam algorithm call (#20)
  • Fix missing symbols in the zip library (#24)
  • Fix random autotune server hang (#206)
  • Fix Bagua-Net library path mismatch; make --enable_bagua_net argument style consistent with other args (#218)
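The subtraction-overflow fix (#39) is the classic unsigned-counter pitfall: in Rust (Bagua's core language), subtracting a larger unsigned value from a smaller one wraps around to a huge number instead of going negative. A hedged Python illustration of the failure mode and the guarded form, using 32-bit wraparound arithmetic (function names are illustrative, not Bagua's code):

```python
MASK32 = 0xFFFFFFFF

def wrapping_sub_u32(a, b):
    """Unsigned 32-bit subtraction with wraparound, as Rust's
    u32::wrapping_sub would compute it."""
    return (a - b) & MASK32

def saturating_sub_u32(a, b):
    """Guarded form: clamp the result to zero instead of wrapping."""
    return a - b if a >= b else 0

# A step counter that briefly lags a peer's step:
local_step, peer_step = 5, 7
wrapped = wrapping_sub_u32(local_step, peer_step)   # 4294967294, a bogus huge value
safe = saturating_sub_u32(local_step, peer_step)    # 0
```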

Python

  • Fix random autotune-service hang
  • Handle conflicts caused by sklearn upgrade (#225)
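The sklearn conflict (#225) is an instance of the common pattern where a dependency upgrade moves or removes a symbol. The usual remedy is a compatibility shim that prefers the new import path and falls back to the old one; sketched here with a stdlib example rather than sklearn's actual moved symbols, which are detailed in the PR:

```python
# Compatibility shim: prefer the new location, fall back to the old one.
try:
    from collections.abc import Sequence   # new location (Python >= 3.3)
except ImportError:
    from collections import Sequence       # legacy fallback

def is_sequence(x):
    """True for list/tuple-like objects (and strings, which are Sequences)."""
    return isinstance(x, Sequence)
```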

Features

CI

  • Only publish to PyPI for master commits

Other

  • Add async model average algorithm (#110)
  • Add cached dataset wrapper (#148)
  • Support sync batchnorm (#151)
  • Add --enable-bagua-net option in launcher (#183)
  • Add PyTorch examples for MNIST, ImageNet, and SQuAD training (#1)
  • Add requirements.txt, only download dataset on local rank 0 (#2)
  • Add python packaging related files
  • Add __version__ variable
  • Install NCCL deps in bagua core and add generated __version__ variable
  • Add version.py placeholder to prevent file not found error
  • Initial support for Python op (#2)
  • Add 5 min timeout for buckets' comm op (#5)
  • Replace NCCL with Aluminum (#7)
  • Add synthetic benchmark script (#5)
  • Add elastic training example (#7)
  • Support alltoall_v (vector alltoall) (#14)
  • Add reduce and allgather python interface
  • Support reduce and allgather ops with a Reduction op enum
  • Support creating BaguaTensor by passing torch tensor directly (#19)
  • Add a compatibility mode for getting PyTorch tensor info via the Python interpreter
  • Improve debug logs to include tensor info when executing ops
  • Add native low precision decentralized operator (#26)
  • Add scatter, gather, and scatter_reduce communication primitives, plus inplace versions of all of them (#37)
  • Make full precision decentralized op stateless (#36)
  • Add communication_primitives example (#12)
  • Use NCCL 2.10 avg op for all algorithms using averaging (#46, #45)
  • Add opentelemetry to report tensor ready order (#42)
  • Add deterministic flag (#15)
  • Add native async model average algorithm (#41)
  • Add examples for async model average algorithm (#14)
  • Support packet splitting and multi-stream parallel transmission (#5)
  • Support NCCL Net v3 and remove the dependency on NCCL in the installation environment (#17)
  • Add sync interval param to async examples (#19)
  • Support Tokio backend (#21)
  • Support bagua-net (#89)
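Several entries above concern alltoall_v (#14, with parameter fixes #17 and #27), the vector variant of all-to-all in which each rank sends a differently sized slice to every peer. A pure-Python simulation of its semantics (this is an illustration, not Bagua's actual API; in practice the split sizes become the per-rank send/recv counts passed to the collective):

```python
def alltoall_v(send_bufs, split_sizes):
    """Simulate a vector all-to-all among n ranks.

    send_bufs[i]      -- flat list that rank i sends
    split_sizes[i][j] -- number of elements rank i sends to rank j
    Returns recv_bufs, where recv_bufs[j] is the concatenation of the
    slices every rank addressed to rank j, in sender-rank order.
    """
    n = len(send_bufs)
    recv_bufs = [[] for _ in range(n)]
    for i in range(n):
        offset = 0
        for j in range(n):
            size = split_sizes[i][j]
            recv_bufs[j].extend(send_bufs[i][offset:offset + size])
            offset += size
    return recv_bufs

# 2 ranks: rank 0 sends 1 element to itself and 2 to rank 1;
# rank 1 sends 2 elements to rank 0 and 1 to itself.
out = alltoall_v([[10, 20, 30], [40, 50, 60]],
                 [[1, 2], [2, 1]])
```

The key difference from plain all-to-all is that `split_sizes` may be ragged, which is why the parameter types needed fixing twice in the entries above.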