The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression

This is the TensorFlow implementation of the paper The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression (ICML 2021, [http://proceedings.mlr.press/v139/list21a/list21a.pdf]). The Earth Mover's Pinball Loss (EMPL) is a loss function for deep-learning-based histogram regression that incorporates cross-bin information and yields distributions over plausible histograms, expressed in terms of quantiles of the cumulative histogram in each bin. The EMPL compares two (normalised) histograms through their cumulative histograms, bin by bin, at a quantile level τ of interest; the exact definition is given in the paper.

For the particular case of the median (τ = 0.5), the EMPL reduces to the Earth Mover's Distance (or 1-Wasserstein distance) between two 1D histograms (e.g., Ramdas, Trillos & Cuturi 2017). The EMPL is therefore an asymmetric generalisation of the Earth Mover's Distance that enables the regression of arbitrary quantiles of the cumulative histogram in each bin (conditional on some input) by harnessing the idea of the pinball loss (e.g., Koenker & Bassett 1978).
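The idea described above (a pinball loss applied to the cumulative histograms in each bin) can be illustrated with a minimal, framework-free sketch. This is not the repository's TensorFlow implementation: the function names `pinball` and `empl_sketch`, the choice to sum rather than average over bins, and the example histograms are all assumptions made for illustration, and the normalisation used in the paper may differ.

```python
from itertools import accumulate


def pinball(residual, tau):
    """Pinball loss for residual = target - prediction at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def empl_sketch(pred_hist, true_hist, tau):
    """Sum of bin-wise pinball losses between the two cumulative histograms.

    Assumptions: both inputs are normalised histograms over the same bins;
    summing (rather than averaging) over bins is an illustrative choice.
    """
    if len(pred_hist) != len(true_hist):
        raise ValueError("histograms must have the same number of bins")
    pred_cum = list(accumulate(pred_hist))
    true_cum = list(accumulate(true_hist))
    return sum(pinball(t - p, tau) for p, t in zip(pred_cum, true_cum))


# Hypothetical example histograms (three bins each).
pred = [0.5, 0.5, 0.0]
true = [0.0, 0.5, 0.5]
print(empl_sketch(pred, true, tau=0.5))  # symmetric case
print(empl_sketch(pred, true, tau=0.9))  # asymmetric case
```

In this sketch, the value at τ = 0.5 is half the summed absolute difference of the cumulative histograms, i.e. proportional to the Earth Mover's Distance for equally spaced bins. For τ ≠ 0.5, over- and under-estimating the cumulative histogram are penalised differently, which is what makes the loss asymmetric.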

Author: Florian List (Sydney Institute for Astronomy, School of Physics, A28, The University of Sydney, NSW 2006, Australia).

For any queries, please contact me at florian dot list at sydney dot edu dot au.

Overview

Toy example (histograms generated by drawing numbered balls from an urn)

  • Toy_example.py: trains / loads the neural network for the toy example and generates the plots in the manuscript.
  • Simulate_urn_draws.py: simulates drawing from the urn and compares the numerical results with the analytical solution.
  • EMD_for_single_draw.py: computes the expected EMD between the median / mean and the outcome for a single draw.
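To give a feel for the kind of data the toy example works with, the following sketch draws numbered balls from an urn and turns the outcome into a normalised histogram and its cumulative histogram. The urn size, the number of draws, drawing with replacement, and the function name `draw_histogram` are assumptions for illustration; the scripts in this repository may set up the experiment differently.

```python
import random
from itertools import accumulate


def draw_histogram(n_balls, n_draws, rng):
    """Draw n_draws balls (with replacement) from an urn numbered 1..n_balls
    and return the normalised histogram of the drawn numbers."""
    counts = [0] * n_balls
    for _ in range(n_draws):
        counts[rng.randint(1, n_balls) - 1] += 1
    return [c / n_draws for c in counts]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hist = draw_histogram(n_balls=10, n_draws=50, rng=rng)
cum_hist = list(accumulate(hist))  # cumulative histogram, ends at ~1.0
print(hist)
print(cum_hist)
```

Repeating the draw many times gives a distribution of cumulative histogram values in each bin, which is the quantity whose quantiles the EMPL is designed to regress.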

Bimodal example (distribution of cumulative histograms in each bin is bimodal)

  • Bimodal_example.py: trains / loads the neural network for the bimodal example.
  • Bimodal_example_make_plots.py: generates the plots for the bimodal example.

Bundesliga example (histograms of the league table position after every week)

  • Bundesliga_example.py: trains / loads the neural network for the Bundesliga example and generates the plots.
  • make_bundesliga_table.py: generates the training and testing datasets from the match results in Bundesliga_Results.csv.
  • make_leave_one_out_hists.py: computes the bootstrapping uncertainties by "replaying" the seasons.

NOTE: The file Bundesliga_Results.csv must be downloaded from Kaggle; it contains the Bundesliga match results from the 1993/94 to the 2017/18 season.
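As a rough illustration of what a "histogram of the league table position" means, the sketch below bins one team's table position after each match week into a normalised histogram over the 18 possible positions. The function name `position_histogram` and the example positions are hypothetical; the actual preprocessing in make_bundesliga_table.py may differ.

```python
def position_histogram(weekly_positions, n_teams=18):
    """Normalised histogram of one team's table position over the weeks.

    weekly_positions: table position (1 = top) after each match week.
    n_teams: number of teams in the league (18 in the Bundesliga).
    """
    if not weekly_positions:
        raise ValueError("need at least one weekly position")
    counts = [0] * n_teams
    for pos in weekly_positions:
        if not 1 <= pos <= n_teams:
            raise ValueError(f"position {pos} outside 1..{n_teams}")
        counts[pos - 1] += 1
    total = len(weekly_positions)
    return [c / total for c in counts]


# Hypothetical positions of one team after the first six match weeks.
example_positions = [5, 3, 3, 2, 1, 1]
print(position_histogram(example_positions))
```

Each bin holds the fraction of match weeks the team spent in that table position, so the histogram sums to one and can be compared with a predicted histogram via a loss such as the EMPL.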

Astrophysical example (estimating brightness histograms from γ-ray photon-count maps)

The astrophysical example can be found in this repository.

Citation

If you find this code or the paper useful, please consider citing

@inproceedings{List2021,
  author = {List, Florian},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  title = {{The Earth Mover’s Pinball Loss: Quantiles for Histogram-Valued Regression}},
  url = {https://arxiv.org/pdf/2106.02051.pdf},
  year = {2021},
  archiveprefix = {arXiv},
  arxivid = {2106.02051}
}
