Releases · BaguaSys/bagua · GitHub

08 Jul 10:07

NOBLES5E

v0.6.3

Features

support different ssh port on different nodes (#93) 6810245
support multiple models in one training script (#113) 312bcc0 (#107) 0aec789

Fixes

autotune service defaults to a fixed random seed (#117) a58c2de

Others

sort q_adam variables for better performance (#102) f277549
improve autotune speed metrics measurement for better accuracy (#86) e4ee5ee
install.sh upgrades existing bagua package bc69890
install.sh will not install Rust if it already exists on the system 67e1efe


02 Jul 10:13

NOBLES5E

v0.6.2

Fixes

fix QAdam gradient not being a BaguaTensor during the first stage 1d4dc82


02 Jul 03:19

NOBLES5E

v0.6.1

Features

add QAdam algorithm (#92) 0dafd24
broadcast model parameters on every algorithm reset e5b36dc
wrap python op in communication stream context by default 51eb656
add append op methods to python BaguaBucket class (#87) 84d8cbc

Fixes

BaguaBucket.tensors should only contain original passed in tensors c4ff05f
fix append python op callable reference 04019cc
fix BaguaBucket.clear_ops() return value 8cb9f54


01 Jul 07:20

NOBLES5E

v0.6.0

⚠ BREAKING CHANGE

End users should now call the model.with_bagua(...) API to use Bagua for communication. Algorithm developers can use bagua.torch_api.algorithms.Algorithm to develop new algorithms. Installation now requires bagua-core >=0.3.
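The bagua-core >=0.3 requirement can be checked before launching a job. The following is a minimal sketch, not part of Bagua itself: the helper names `parse_release` and `bagua_core_ok` are hypothetical, and it assumes the dependency is installed under the distribution name `bagua-core`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn '0.3.1' or '0.3.0rc1' into a tuple of leading integers, e.g. (0, 3, 1)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as 'rc1' or 'post1'
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def bagua_core_ok(minimum=(0, 3)):
    """Return True if an installed bagua-core meets the minimum release.

    Assumption: the distribution is named 'bagua-core'. Returns False
    when the package is not installed at all.
    """
    try:
        installed = version("bagua-core")
    except PackageNotFoundError:
        return False
    return parse_release(installed) >= minimum


if __name__ == "__main__":
    print("bagua-core >= 0.3 available:", bagua_core_ok())
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "0.10" sorting before "0.3".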

Features

add algorithm import in bagua.torch_api ee73edc
support reduction op and reduce ac8632c
auto installation support centos (#50) 073a59e

Fixes

fix algorithm pre forward hook not returned e6c7c8d
the environment variable LOCAL_SIZE has been renamed to LOCAL_WORLD_SIZE (#51) 801b25a
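Scripts that read the per-node process count directly from the environment may need to handle both the old and the new variable name during migration. A minimal sketch of such a fallback follows. The helper `local_world_size` and its default of 1 are illustrative assumptions, not part of the Bagua API.

```python
import os


def local_world_size(env=None, default=1):
    """Read the per-node process count, preferring the new variable name.

    LOCAL_WORLD_SIZE is checked first. LOCAL_SIZE, the name used before
    the rename, is used only as a fallback. `default` is a hypothetical
    choice for single-process runs where neither variable is set.
    """
    env = os.environ if env is None else env
    for name in ("LOCAL_WORLD_SIZE", "LOCAL_SIZE"):
        value = env.get(name)
        if value:
            return int(value)
    return default


if __name__ == "__main__":
    print("processes on this node:", local_world_size())
```

Passing a plain dictionary as `env` makes the lookup easy to exercise without touching the real process environment.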


25 Jun 09:21

NOBLES5E

v0.5.0

⚠ BREAKING CHANGE

contrib: load balancing dataloader and fused optimizer are now in the bagua.torch_api.contrib module
baguaelastic/distributed/launch.py has moved to bagua/distributed/run.py

Features

add dependency installation script for ubuntu (#41) 4d820ab
Elastic training (#31) 1a5964c
add broadcast_buffer in bagua_init (#29) e761cc6
support bagua-core 0.2 (#26) f1d2bfa

Fixes

autotune: fix bucket size switch not effective (#48) 30b490a
remove logging in load balancing dataloader to avoid deadlock (#35) e900383
torch_api.distributed: fix cyclic dependency (#16) 0314e24
fix setup.py for low version setuptools (#14) 7d315c0
fix baguaelastic launch script b069cd4


17 Jun 12:23

NOBLES5E

v0.4.0

Initial public release of Bagua 🎆
