ibbsc

This is the repo for my BSc thesis Information Bottleneck for Deep Learning: Theory and Experiments. Relevant information about the code can be found below. Briefly, it allows you to train fully connected feedforward networks (FNNs) with various activation functions and to estimate the mutual information between the hidden layers and the input and output using either an adaptive or a uniform binning strategy. There are also functions for plotting relevant quantities, such as the information plane.
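For context on what the binning estimators compute: the hidden activations T are discretized into bins, and I(T; X) and I(T; Y) are then estimated from the empirical joint counts. The sketch below is a minimal, self-contained illustration of that idea with uniform bins (NumPy only); the function names and the toy data are made up for the example and it is not the code used in this repo.

import numpy as np

def discrete_mutual_information(labels_a, labels_b):
    """Empirical I(A; B) in bits for two discrete label arrays of equal length."""
    n = len(labels_a)
    joint, pa, pb = {}, {}, {}
    for a, b in zip(labels_a, labels_b):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        pa[a] = pa.get(a, 0) + 1
        pb[b] = pb.get(b, 0) + 1
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * np.log2(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi

def information_plane_point(activations, x_ids, y_labels, num_bins=30):
    """Estimate I(T; X) and I(T; Y) for one hidden layer via uniform binning.

    activations : (n_samples, n_units) hidden-layer outputs
    x_ids       : (n_samples,) unique id per input sample (X treated as discrete)
    y_labels    : (n_samples,) class labels
    """
    # Uniform bin edges over the observed activation range.
    edges = np.linspace(activations.min(), activations.max(), num_bins + 1)
    binned = np.digitize(activations, edges)
    # Each distinct row of binned unit values is one discrete state of T.
    t_states = np.array([hash(row.tobytes()) for row in binned])
    return (discrete_mutual_information(t_states, x_ids),
            discrete_mutual_information(t_states, y_labels))

# Tiny usage example with random data.
rng = np.random.default_rng(0)
acts = np.tanh(rng.normal(size=(512, 5)))   # fake tanh-layer activations
xs = np.arange(512)                         # every input sample is unique
ys = rng.integers(0, 2, size=512)           # binary labels
print(information_plane_point(acts, xs, ys, num_bins=30))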

Setup

The following setup guide has been tested on Manjaro Linux 20.0.3 Lysia.
I recommend installing Miniconda3 on your system so you have Conda. If you have Conda installed, you can use the YAML file, ibbsc.yml, to set up an environment with the requirements by running:

conda env create -f ibbsc.yml

You can activate the environment with:

conda activate ibbsc

Usage

The section below assumes that python points to the correct interpreter (it will inside the Conda environment).

Usage:  
    cd ibbsc
    python main.py [Parameters]
  
Parameters:  
    -h    --help                        Prints help similar to this.
    -a    --activation                  Sets the activation function for all hidden layers except the last. Can be tanh, relu, elu or relu6. See `ibbsc/models.py` for the list of supported functions.
    -bs   --batch_size                  Batch size for training.   
    -d    --data                        Path to the data file used for training.
    -lr   --learning_rate               Learning rate for Adam optimizer.
    -sp   --save_path                   Path to the folder for saving data.
    -e    --epochs                      Number of epochs to train on per run.  
    -num  --num_runs                    Number of times to run the network.
    -mi   --mi_methods                  List of method(s) for estimating the mutual information.
    -g    --try_gpu                     Try to train on a GPU. 
    -nb   --num_bins                    List of number of bins to use for discretization. 
    -ls   --layer_sizes                 List of the layer sizes of the network.
    -pr   --plot_results                Generate plots of the information plane from the data just generated and save them in --save_path.
    -sm   --save_max_vals               Save max values for each layer at each epoch.
    -ste  --save_train_error            Save training error as a function of the epochs for each run.
    -smi  --save_mutual_information     Save mutual information after each epoch.
    -sf   --start_from                  Start running the experiment from a specific run number.
    -ts   --test_size                   Size of the test data.

For the default parameter values, see ibbsc/default_params.py.
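To make the -ls and -a parameters concrete: a layer-size list such as [12, 10, 7, 5, 4, 3, 2] describes the input width, the hidden widths, and the output width of the network. The sketch below shows one way such a list could be turned into a fully connected network, assuming a PyTorch-style setup; it is only an illustration, not the actual code in ibbsc/models.py.

import torch.nn as nn

ACTIVATIONS = {"tanh": nn.Tanh, "relu": nn.ReLU, "elu": nn.ELU, "relu6": nn.ReLU6}

def build_fnn(layer_sizes, activation="tanh"):
    """Fully connected network with `activation` on every layer but the last."""
    act = ACTIVATIONS[activation]
    layers = []
    for i in range(len(layer_sizes) - 1):
        layers.append(nn.Linear(layer_sizes[i], layer_sizes[i + 1]))
        if i < len(layer_sizes) - 2:
            # No activation after the final layer here; e.g. softmax can be applied in the loss.
            layers.append(act())
    return nn.Sequential(*layers)

model = build_fnn([12, 10, 7, 5, 4, 3, 2], activation="tanh")
print(model)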

Examples

Running one of these examples takes roughly 15-25 hours (depending on the example) on a decent desktop.

To generate the information plane averaged over 40 runs for a network with layer sizes [12,10,7,5,4,3,2] (the default network), the tanh activation function, and the mutual information estimated using:

  1. The adaptive binning strategy:
cd ibbsc
python main.py -a=tanh -bs=256 -lr=0.0004 -sp="../data/saved_data" -num=40 -mi="[adaptive]" -nb="[30]" -pr=1
  2. The adaptive binning and the fixed binning strategy:
cd ibbsc
python main.py -a=tanh -bs=256 -lr=0.0004 -sp="../data/saved_data" -num=40 -mi="[fixed,adaptive]" -nb="[30]" -pr=1
  3. The adaptive and the fixed binning strategies with multiple numbers of bins:
cd ibbsc
python main.py -a=tanh -bs=256 -lr=0.0004 -sp="../data/saved_data" -num=40 -mi="[fixed,adaptive]" -nb="[30,100]" -pr=1
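For reference on the -mi and -nb options used above: a "fixed" binning strategy typically spreads the bins over a predetermined activation range (e.g. [-1, 1] for tanh), while an "adaptive" strategy rescales the bin edges to the activations actually observed in each layer (presumably the reason --save_max_vals exists). The short sketch below illustrates the difference; the exact rules used in the repo may differ.

import numpy as np

def fixed_bin_edges(num_bins, low=-1.0, high=1.0):
    # Uniform bins over a fixed range, e.g. [-1, 1] for a tanh layer.
    return np.linspace(low, high, num_bins + 1)

def adaptive_bin_edges(activations, num_bins):
    # Uniform bins stretched to the range actually observed for this layer/epoch.
    return np.linspace(activations.min(), activations.max(), num_bins + 1)

rng = np.random.default_rng(0)
acts = np.tanh(rng.normal(scale=0.1, size=(1000, 7)))  # early-training tanh activations cluster near 0
fixed_used = len(np.unique(np.digitize(acts, fixed_bin_edges(30))))
adaptive_used = len(np.unique(np.digitize(acts, adaptive_bin_edges(acts, 30))))
print(fixed_used, adaptive_used)  # far fewer of the fixed bins are occupied than of the adaptive bins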
