
v0.8.0

@NOBLES5E released this 26 Sep 13:21
· 361 commits to master since this release

[0.8.0] - 2021-09-26

Bug Fixes

CI

  • Only run publish once on git tag

Core

  • Fix compressed buffer failing to be scattered to an odd number of ranks
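The scatter fix above concerns splitting a contiguous buffer across a rank count that does not divide the buffer length evenly. A minimal sketch of the usual remedy, in plain Python (pad to a multiple of the rank count, then split into equal chunks; the function name is illustrative, not Bagua's actual API):

```python
def pad_and_split(buf, num_ranks):
    """Pad `buf` with zeros so its length is a multiple of `num_ranks`,
    then split it into equal per-rank chunks."""
    chunk = -(-len(buf) // num_ranks)  # ceiling division
    padded = buf + [0] * (chunk * num_ranks - len(buf))
    return [padded[i * chunk:(i + 1) * chunk] for i in range(num_ranks)]

# A 10-element buffer scattered to 3 (odd) ranks: each rank gets 4 elements,
# with the last chunk zero-padded.
chunks = pad_and_split(list(range(10)), 3)
```

Related: the "Always mark bagua padding tensor as ready" entry below reflects the same padding idea applied to tensors.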

Other

  • Fix CI PyPI versioning
  • Remove `__init__.py` and Python version, use Cargo version
  • Move `bagua_install_library` import into the install-library function
  • Merge `bagua_install_library` and `setup.py`, remove NCCL <= 2.6 support
  • Fix alltoall_v parameter (#17)
  • Fix reduce and allgather Python interface
  • Fix incorrect decompress pointer and a typo in the error message
  • Fix Python GIL deadlock when getting the data pointer
  • Fix benchmark script requirements
  • Fix alltoall_v parameter types (#27)
  • Always mark bagua padding tensor as ready
  • Make compress/decompress of BaguaTensor method string consistent (#33)
  • Fix scatter and reduce_scatter implementation (#40)
  • Fix subtraction overflow error for decentralized op (#39)
  • Fix QADAM params (#17)
  • Fix assert precision (#18)
  • Replace mutex with atomic bool for async op and add Aluminum submodule update (#67)
  • Fix duplicated dependency downloading during installation (#77)
  • Fix async algorithm aborting and hanging (#78, #81)
  • Fix qadam algorithm call (#20)
  • Fix missing symbols in the zip library (#24)
  • Fix random autotune server hang (#206)
  • Fix Bagua-Net library path mismatch; make --enable_bagua_net argument style consistent with other args (#218)
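The subtraction-overflow fix (#39) is the classic unsigned-counter pitfall: in Rust (Bagua's core language), subtracting a larger unsigned value from a smaller one wraps around to a huge number instead of going negative. A hedged Python illustration of the failure mode and the guarded form, using 32-bit wraparound arithmetic (function names are illustrative, not Bagua's code):

```python
MASK32 = 0xFFFFFFFF

def wrapping_sub_u32(a, b):
    """Unsigned 32-bit subtraction with wraparound, as Rust's
    u32::wrapping_sub would compute it."""
    return (a - b) & MASK32

def saturating_sub_u32(a, b):
    """Guarded form: clamp the result to zero instead of wrapping."""
    return a - b if a >= b else 0

# A step counter that briefly lags a peer's step:
local_step, peer_step = 5, 7
wrapped = wrapping_sub_u32(local_step, peer_step)   # 4294967294, a bogus huge value
safe = saturating_sub_u32(local_step, peer_step)    # 0
```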

Python

  • Fix random autotune-service hang
  • Handle conflicts caused by sklearn upgrade (#225)
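The sklearn conflict (#225) is an instance of the common pattern where a dependency upgrade moves or removes a symbol. The usual remedy is a compatibility shim that prefers the new import path and falls back to the old one; sketched here with a stdlib example rather than sklearn's actual moved symbols, which are detailed in the PR:

```python
# Compatibility shim: prefer the new location, fall back to the old one.
try:
    from collections.abc import Sequence   # new location (Python >= 3.3)
except ImportError:
    from collections import Sequence       # legacy fallback

def is_sequence(x):
    """True for list/tuple-like objects (and strings, which are Sequences)."""
    return isinstance(x, Sequence)
```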

Features

CI

  • Only publish to PyPI for master commits

Other

  • Add async model average algorithm (#110)
  • Add cached dataset wrapper (#148)
  • Support sync batchnorm (#151)
  • Add --enable-bagua-net option in launcher (#183)
  • Add PyTorch examples for MNIST, ImageNet, and SQuAD training (#1)
  • Add requirements.txt, only download dataset on local rank 0 (#2)
  • Add python packaging related files
  • Add __version__ variable
  • Install NCCL deps in bagua core and add generated __version__ variable
  • Add version.py placeholder to prevent file not found error
  • Initial support for Python op (#2)
  • Add 5 min timeout for buckets' comm op (#5)
  • Replace NCCL with Aluminum (#7)
  • Add synthetic benchmark script (#5)
  • Add elastic training example (#7)
  • Support alltoall_v (vector alltoall) (#14)
  • Add reduce and allgather python interface
  • Support reduce and allgather ops with a Reduction op enum
  • Support creating BaguaTensor by passing torch tensor directly (#19)
  • Add a compatibility mode for getting PyTorch tensor info via the Python interpreter
  • Improve debug logs to include tensor info when executing ops
  • Add native low precision decentralized operator (#26)
  • Add scatter, gather, and scatter_reduce communication primitives, plus inplace versions of all of them (#37)
  • Make full precision decentralized op stateless (#36)
  • Add communication_primitives example (#12)
  • Use NCCL 2.10 avg op for all algorithms using averaging (#46, #45)
  • Add opentelemetry to report tensor ready order (#42)
  • Add deterministic flag (#15)
  • Add native async model average algorithm (#41)
  • Add examples for async model average algorithm (#14)
  • Support packet splitting and multi-stream parallel transmission (#5)
  • Support NCCL Net v3 and remove the dependency on NCCL in the installation environment (#17)
  • Add sync interval param to async examples (#19)
  • Support Tokio backend (#21)
  • Support bagua-net (#89)
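Several entries above concern alltoall_v (#14, with parameter fixes #17 and #27), the vector variant of all-to-all in which each rank sends a differently sized slice to every peer. A pure-Python simulation of its semantics (this is an illustration, not Bagua's actual API; in practice the split sizes become the per-rank send/recv counts passed to the collective):

```python
def alltoall_v(send_bufs, split_sizes):
    """Simulate a vector all-to-all among n ranks.

    send_bufs[i]      -- flat list that rank i sends
    split_sizes[i][j] -- number of elements rank i sends to rank j
    Returns recv_bufs, where recv_bufs[j] is the concatenation of the
    slices every rank addressed to rank j, in sender-rank order.
    """
    n = len(send_bufs)
    recv_bufs = [[] for _ in range(n)]
    for i in range(n):
        offset = 0
        for j in range(n):
            size = split_sizes[i][j]
            recv_bufs[j].extend(send_bufs[i][offset:offset + size])
            offset += size
    return recv_bufs

# 2 ranks: rank 0 sends 1 element to itself and 2 to rank 1;
# rank 1 sends 2 elements to rank 0 and 1 to itself.
out = alltoall_v([[10, 20, 30], [40, 50, 60]],
                 [[1, 2], [2, 1]])
```

The key difference from plain all-to-all is that `split_sizes` may be ragged, which is why the parameter types needed fixing twice in the entries above.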