This is the official code repository for the paper GCondenser: Benchmarking Graph Condensation. GCondenser is a graph condensation (GC) toolkit 🚀 that benchmarks existing GC methods, accelerates the development of new GC methods, and can be used directly in downstream applications such as graph continual learning. The master branch contains the DGL implementation, and the pyg branch contains the PyG implementation.

NOTE: If you prefer PyG, please check out our PyG branch. Both DGL and PyG are supported.

Benchmark Pipeline

GCondenser standardises the graph condensation paradigm, consisting of condensation, validation and evaluation as shown in the following figure.

Get Started

GCondenser mainly depends on the following packages:

dgl  # If you prefer torch_geometric, please check out our pyg branch.
torch
pytorch-lightning
ogb  # fetch OGB datasets
hydra  # configuration management
hydra_colorlog  # hydra plugin for improved logging
rootutils
rich

Some packages are optional if you would like to use some advanced features:

wandb  # wandb logger
hydra-optuna-sweeper  # hyperparameter search using optuna

Usage of `GCondenser`

The main configuration file is ./configs/train.yaml, which includes settings for the dataset, condenser, trainer, and logger. GCondenser uses Hydra to configure and run experiments. The files in the ./configs/experiment/ folder set the dataset and condenser information. For example, to quickly run an experiment, use the following command:

python graph_condenser/train.py experiment=arxiv_gcond

If you would like to change the default hyperparameters, you can either directly modify the configuration file or pass them via the CLI. For example, to change the learning rate for updating the condensed graph's features to 0.01, run:

python graph_condenser/train.py experiment=arxiv_gcond condenser.opt_feat.lr=1e-2
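The override syntax can be read as a dotted path into the nested configuration. The sketch below is plain Python and is not how Hydra is implemented; the helper `apply_override` and the sample `config` dictionary (including its starting value) are hypothetical and only illustrate how `condenser.opt_feat.lr=1e-2` reaches the nested key.

```python
import ast


def apply_override(config: dict, override: str) -> dict:
    """Apply one 'a.b.c=value' override to a nested dict in place."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Walk down the nested dict, creating levels that do not exist yet.
        node = node.setdefault(name, {})
    try:
        # Interpret numbers and other literals, e.g. "1e-2" becomes 0.01.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        # Anything that is not a Python literal stays a plain string.
        value = raw_value
    node[leaf] = value
    return config


# Hypothetical starting value; the real default lives in the YAML files.
config = {"condenser": {"opt_feat": {"lr": 0.001}}}
apply_override(config, "condenser.opt_feat.lr=1e-2")
print(config["condenser"]["opt_feat"]["lr"])
```

In practice Hydra performs this merge itself, so the command line shown above is all that is needed.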

For more information, please check the Hydra documentation.

Supported Datasets

For a list of supported datasets, please refer to our supported datasets documentation. We are continuously adding more public datasets.

Condenser

To use GCondenser effectively, you may need to look up the following parameters of the Condenser class.

Item	Description	Config Key
NPC	`GCondenser` provides node-per-class (NPC) initialisation methods, including `original` and `balanced`.	condenser.label_distribution
initialisation	Node features of the condensed graph can be initialised by `noise`, `random` or `kCenter`.	condenser.init_method
train model	The backbone model used to condense the original graph.	condenser.gnn
validate model	The model trained on the condensed graph in the validation step.	condenser.validator
test model	The model trained on the condensed graph in the test step.	condenser.tester
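To make the NPC row more concrete, here is a minimal sketch of one plausible reading of the two options: `original` keeps the class proportions of the original graph, and `balanced` gives every class the same share of the budget. This interpretation, the function `nodes_per_class`, and its rounding rules are assumptions made for illustration; the repository's own logic may differ.

```python
from collections import Counter


def nodes_per_class(labels, budget, method="original"):
    """Return {class: number of condensed nodes} for a node budget.

    Hypothetical helper, not part of GCondenser.
    """
    counts = Counter(labels)
    classes = sorted(counts)
    if method == "balanced":
        # Split the budget evenly; the first classes absorb any remainder.
        base, extra = divmod(budget, len(classes))
        return {c: base + (1 if i < extra else 0) for i, c in enumerate(classes)}
    if method == "original":
        # Follow the original class proportions, with at least one node per
        # class. Because of rounding, the total may not equal the budget.
        total = len(labels)
        return {c: max(1, budget * counts[c] // total) for c in classes}
    raise ValueError(f"unknown method: {method}")


labels = [0] * 6 + [1] * 3 + [2] * 1
print(nodes_per_class(labels, budget=10, method="original"))
print(nodes_per_class(labels, budget=10, method="balanced"))
```

With this toy label list, `original` mirrors the 6/3/1 split while `balanced` spreads the ten nodes as evenly as possible.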

Add a New Graph Condenser

You can easily add new graph condensers by creating a new class that inherits from graph_condenser.models.condenser.Condenser. In this new class, you will need to implement a training_step() method to define how the condensed graph should be updated each epoch. Please check out our step-by-step guide for adding a new method.
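The shape of such a subclass can be sketched without the library. In the toy below, the base class `Condenser` is a hypothetical stand-in and is not graph_condenser.models.condenser.Condenser; its constructor arguments, the scalar "features", and the mean-matching loss are all assumptions chosen to keep the example self-contained. Only the idea of overriding `training_step()` to update the condensed graph comes from the text above; check the step-by-step guide for the real signature.

```python
from abc import ABC, abstractmethod


class Condenser(ABC):
    """Hypothetical stand-in for the real base class (assumption)."""

    def __init__(self, original_feats, num_condensed, lr):
        self.original_feats = original_feats          # toy scalar features
        self.condensed_feats = [0.0] * num_condensed  # learnable values
        self.lr = lr

    @abstractmethod
    def training_step(self):
        """Update the condensed graph once and return the loss."""


class MeanMatchingCondenser(Condenser):
    """Toy method: match the mean feature of the original data."""

    def training_step(self):
        target = sum(self.original_feats) / len(self.original_feats)
        n = len(self.condensed_feats)
        diff = sum(self.condensed_feats) / n - target
        loss = diff ** 2
        # Gradient of the loss with respect to each condensed feature.
        grad = 2 * diff / n
        self.condensed_feats = [x - self.lr * grad for x in self.condensed_feats]
        return loss


condenser = MeanMatchingCondenser([1.0, 2.0, 3.0, 6.0], num_condensed=2, lr=0.5)
losses = [condenser.training_step() for _ in range(50)]
print(losses[0], losses[-1])
```

A real condenser would update node features and an adjacency structure with an optimiser instead of this hand-written gradient step.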

Hyperparameter Sweep by Optuna

Create a file in the ./configs/hparams_search/ directory. For example, there is a file named adj_feat_optuna.yaml. To run a hyperparameter sweep, execute the following command:

python graph_condenser/train.py experiment=arxiv_gcond hparams_search=adj_feat_optuna

For more information, please refer to the Optuna Sweeper Plugin for Hydra.

Reproducible results of the paper

To replicate the performance of the GCond method on the ogbn-arxiv dataset with the first budget using the SGC backbone model, run the following script with the appropriate flags:

bash scripts/experiment.sh -d arxiv -b 1 -m sgc -c gcond

This script initiates an Optuna sweep process to find the optimal learning rates for the adjacency matrix and features.

Cite

If you find this repo useful, please cite:

@article{GCondenser,
  author    = {Yilun Liu and
               Ruihong Qiu and
               Zi Huang},
  title     = {GCondenser: Benchmarking Graph Condensation},
  journal   = {CoRR},
  volume    = {abs/2405.14246},
  year      = {2024}
}

Acknowledgement

We are deeply grateful to the following repositories, which have been immensely helpful in the development of this benchmark:
