
BR rerank

Cheng Li edited this page Jun 15, 2020 · 34 revisions

This page provides the code and data associated with the paper

Learning to Calibrate and Rerank Multi-label Predictions.

Cheng Li, Virgil Pavlu, Javed Aslam, Bingyu Wang, and Kechen Qin.

In ECML-PKDD, 2019.


The BR-Rerank method proposed in the paper is a two-stage multi-label classification algorithm:

  1. Use binary relevance (BR) to estimate the probability of each label independently, and generate the top-K set prediction candidates with the highest BR scores.
  2. Extract features that capture label dependencies from each set candidate, and apply a second calibrator model to rescore and rerank the candidates.
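
Step 1 can be illustrated with a small Python sketch. It is not the pyramid implementation: it assumes the BR score of a set is the product of the independent label marginals, and it enumerates the best sets with a heap-based best-first search, a swapped-in technique chosen for brevity. The function name top_k_sets and the example marginals are hypothetical.

```python
import heapq
from math import prod


def top_k_sets(marginals, k):
    """Return up to k (label_set, br_score) pairs, highest BR score first.

    marginals maps each label to its independent BR probability.
    The score of a set is assumed to be the product of p for included
    labels and (1 - p) for excluded labels.
    """
    labels = list(marginals)
    # The single best set keeps a label iff its marginal is >= 0.5.
    mode = {l for l in labels if marginals[l] >= 0.5}
    mode_score = prod(max(p, 1 - p) for p in marginals.values())
    # Flipping one label's decision multiplies the score by a ratio <= 1.
    ratio = {l: min(p, 1 - p) / max(p, 1 - p) for l, p in marginals.items()}
    # Cheapest flips (largest ratios) come first.
    order = sorted(labels, key=lambda l: ratio[l], reverse=True)

    def score(flips):
        return mode_score * prod(ratio[order[i]] for i in flips)

    results = [(frozenset(mode), mode_score)]
    # Heap entries are (-score, indices of flipped labels in `order`).
    heap = [(-score((0,)), (0,))] if order else []
    while heap and len(results) < k:
        neg_score, flips = heapq.heappop(heap)
        flipped = {order[i] for i in flips}
        results.append((frozenset(mode ^ flipped), -neg_score))
        last = flips[-1]
        if last + 1 < len(order):
            # Two children: also flip the next label, or flip it instead of the last one.
            for child in (flips + (last + 1,), flips[:-1] + (last + 1,)):
                heapq.heappush(heap, (-score(child), child))
    return results


# Made-up marginals, for illustration only.
example = {"person": 0.9, "baseball bat": 0.8, "baseball glove": 0.4}
for label_set, br_score in top_k_sets(example, 5):
    print(sorted(label_set), round(br_score, 3))
```

Each popped candidate generates at most two children whose scores cannot exceed its own, so sets come off the heap in non-increasing score order and every set is produced exactly once.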

Compared to standard BR, BR-rerank provides:

  • Higher classification accuracy
  • Better calibrated prediction confidence scores

The picture above illustrates how BR-rerank makes predictions on the input test image. The "marginal" column shows the individual label probabilities estimated by BR. Note that the label "baseball glove" has a probability below the 0.5 threshold, and therefore will not be included in BR's predictions. The "set prediction candidates" column shows the top-5 set prediction candidates with the highest BR scores generated by dynamic programming based on BR marginals. The "set prediction features" column shows, for each set candidate, its BR score, its binary encoding, its cardinality and its prior probability. The "reranker score" column shows the calibrated BR-rerank confidence score for each set prediction candidate. For this image, BR predicts the incorrect set {"person", "baseball bat"} with confidence 0.58. BR-rerank predicts the correct set {"person", "baseball bat", "baseball glove"} with confidence 0.17.
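
The four set prediction features listed above can be assembled into one feature vector per candidate, roughly as in the sketch below. It is only an illustration: the function name set_features is hypothetical, and the prior probability is assumed here to be the fraction of training examples whose label set exactly matches the candidate, which may differ from how the paper or pyramid computes it.

```python
from collections import Counter


def set_features(candidate, br_score, label_order, train_label_sets):
    """Build [BR score, binary encoding..., cardinality, prior] for one candidate.

    candidate: iterable of labels in the candidate set
    br_score: the candidate's score from the first (BR) stage
    label_order: fixed list of all labels, defining the encoding positions
    train_label_sets: label sets observed in training (assumed prior source)
    """
    candidate = frozenset(candidate)
    # Empirical frequency of this exact label set in the training data.
    counts = Counter(frozenset(s) for s in train_label_sets)
    prior = counts[candidate] / len(train_label_sets)
    # One 0/1 entry per label, in a fixed order.
    encoding = [1 if label in candidate else 0 for label in label_order]
    return [br_score] + encoding + [len(candidate), prior]


# Toy data, for illustration only.
labels = ["person", "baseball bat", "baseball glove"]
train = [{"person", "baseball bat"}, {"person"}, {"person", "baseball bat"}, set()]
print(set_features({"person", "baseball bat"}, 0.432, labels, train))
```

A second-stage model would take vectors like this as input and output the reranker score for each candidate.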

Download code and data

Pre-compiled code can be downloaded from the pyramid package release page. Datasets and properties files can be downloaded here. After downloading, please unzip all the files.

Reproduce the results reported in the paper

To reproduce the calibration results on the RCV1 dataset (reported in Table 2 of the paper), first edit the first two lines of the calibration_exps/rcv1.properties file: set dataPath and outputDir to the appropriate absolute paths on your local computer. Then run pyramid with this properties file:

./pyramid-0.12.9/pyramid calibration_exps/rcv1.properties

(Note that you may need to change the version number if you are using a different version of pyramid.)

This trains BR and the GB calibrator, and then reports the calibration performance.

You can do the same for the other datasets using their properties files. The hyperparameters in each properties file are already set to the values tuned on the validation set.


To reproduce the reranking classification results on the RCV1 dataset (reported in Table 4 of the paper), first edit the first two lines of the rerank_exps/rcv1.properties file: set dataPath and outputDir to the appropriate absolute paths on your local computer. Then run pyramid with this properties file:

./pyramid-0.12.9/pyramid rerank_exps/rcv1.properties

This trains BR and the GB calibrator, and then reports the reranking performance.

You can do the same for the other datasets using their properties files.


To increase the memory allocation so that the code can run on larger datasets, open pyramid-0.12.9/pyramid with a text editor and change the -Xmx value. For instance, change -Xmx10g to -Xmx100g to allocate 100 GB of memory.

Source Code

The source code associated with BR-Rerank can be found here. Note that BR is implemented as a CBM with 1 component.

Questions

Feel free to contact me (chengli.email@gmail.com) if you have any questions about the paper or the code.
