DataGossip is an extension for asynchronous distributed data parallel machine learning that improves the training on imbalanced partitions.

HPI-Information-Systems/DataGossip

DataGossip

DataGossip [paper] is an extension for asynchronous distributed data parallel machine learning that improves the training on imbalanced partitions.

Installation

Requires conda:

$ conda env create -f environment.yml
$ conda activate datagossip
$ python setup.py install

Experiment Reproducibility

Download and transform the datasets on your main machine:

$ python prepare_datasets.py

Then, run the following script on each cluster node to start the training. Make sure to set the correct rank and size on every node!

$ python experiments/train.py --rank=<rank> --size=<size> --main_address=<main_address> 
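Because the command above has to be repeated on every node with a different rank, it can help to generate the per-node command lines up front. The following is a minimal sketch, not part of the repository: the helper name `train_commands` and the example address are hypothetical, and it assumes that ranks run from 0 to size-1 and that size is the total number of nodes.

```python
import shlex


def train_commands(size, main_address, script="experiments/train.py"):
    """Return one launch command per node.

    Assumption (not confirmed by the README): ranks run from 0 to size-1
    and `size` is the total number of participating nodes.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    commands = []
    for rank in range(size):
        args = [
            "python",
            script,
            f"--rank={rank}",
            f"--size={size}",
            f"--main_address={main_address}",
        ]
        # shlex.join quotes arguments where needed for a POSIX shell.
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # 192.168.0.10 is a placeholder address for the main node.
    for command in train_commands(size=4, main_address="192.168.0.10"):
        print(command)
```

Each printed line is meant to be run on exactly one node, so that no rank is used twice.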

Afterwards, you can find the results of the experiment on the machine with rank=0 in the files experiments.pkl and evaluations.pkl. Both files hold pandas DataFrames.
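To take a first look at the results, the two files can be loaded with `pandas.read_pickle`. This is a minimal sketch: the helper names `result_paths` and `load_results` are hypothetical, it assumes the files sit in the current working directory, and it makes no assumption about the columns of the DataFrames.

```python
from pathlib import Path

# File names as given in the README.
RESULT_FILES = ("experiments.pkl", "evaluations.pkl")


def result_paths(directory="."):
    """Map 'experiments' / 'evaluations' to their expected file paths."""
    return {Path(name).stem: Path(directory) / name for name in RESULT_FILES}


def load_results(directory="."):
    """Load both result files as pandas DataFrames."""
    # Imported here so that result_paths() works without pandas installed.
    import pandas as pd

    return {key: pd.read_pickle(path) for key, path in result_paths(directory).items()}


if __name__ == "__main__":
    if all(path.exists() for path in result_paths().values()):
        for name, frame in load_results().items():
            # The column layout is not documented here, so only print an overview.
            print(name, frame.shape)
            print(frame.head())
    else:
        print("Result files not found in the current directory.")
```

Only unpickle files that you produced yourself, since loading a pickle can execute arbitrary code.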

Reference

Please consider citing:

@inproceedings{wenig2022datagossip,
  title={DataGossip: A Data Exchange Extension for Distributed Machine Learning Algorithms},
  author={Wenig, Phillip and Papenbrock, Thorsten},
  booktitle={Proceedings of the International Conference on Extending Database Technology (EDBT)},
  year={2022},
  pages={373--377},
  doi={10.48786/edbt.2022.24},
  url={http://dx.doi.org/10.48786/edbt.2022.24},
}
