provide re-ranking runs from datasets #262

seanmacavaney · 2022-01-11T11:07:28Z

It's becoming increasingly common for datasets to include a standard initial result set for re-ranking experiments. ir_datasets provides these, but they are a bit inconvenient to use in PyTerrier right now, because doing so involves going through irds_ref, building an appropriate dataframe, and merging in the query/document text.
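
For context, the manual steps look roughly like the sketch below. It is only an illustration, not the code in this PR: the `ScoredDoc` and `Query` namedtuples are stand-ins I made up for the records that ir_datasets yields from `scoreddocs_iter()` and `queries_iter()`, `build_results` is a hypothetical helper name, and the output columns (`qid`, `docno`, `score`, `rank`, `query`) plus the 0-based rank are assumed to match what PyTerrier expects of a results dataframe.

```python
from collections import namedtuple

import pandas as pd

# Stand-ins (assumption) for the records yielded by ir_datasets'
# scoreddocs_iter() and queries_iter(); only the field names matter here.
ScoredDoc = namedtuple("ScoredDoc", ["query_id", "doc_id", "score"])
Query = namedtuple("Query", ["query_id", "text"])


def build_results(scoreddocs, queries):
    """Hypothetical helper: turn scored docs plus queries into a results frame."""
    # Step 1: build a dataframe using PyTerrier-style column names.
    res = pd.DataFrame(
        [(s.query_id, s.doc_id, s.score) for s in scoreddocs],
        columns=["qid", "docno", "score"],
    )
    # Step 2: merge in the query text.
    topics = pd.DataFrame(
        [(q.query_id, q.text) for q in queries],
        columns=["qid", "query"],
    )
    res = res.merge(topics, on="qid", how="left")
    # Step 3: rank within each query, highest score first, starting at 0.
    res["rank"] = (
        res.groupby("qid")["score"].rank(method="first", ascending=False).astype(int) - 1
    )
    return res.sort_values(["qid", "rank"]).reset_index(drop=True)


# Toy data in place of a real dataset's re-ranking set.
scoreddocs = [
    ScoredDoc("q1", "d3", 1.5),
    ScoredDoc("q1", "d1", 2.5),
    ScoredDoc("q2", "d2", 0.7),
]
queries = [Query("q1", "what is re-ranking"), Query("q2", "terrier dog breeds")]

results = build_results(scoreddocs, queries)
print(results)
```

With a real dataset, the two lists would presumably be replaced by the iterators obtained via irds_ref; document text is deliberately left out here.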

This PR exposes the functionality more conveniently through dataset.get_reranking_run(). (I'm open to alternative names here.)

I also considered including the document text, but it's not always going to be needed and it seems more suitable to use the existing pt.text.get_text() pipeline for this.

seanmacavaney · 2022-01-11T11:08:34Z

If this looks good, I'll add documentation for it too.

cmacdonald · 2022-01-11T11:22:22Z

Thanks. A smaller name would be appreciated. Why not just get_run()?

We need to discuss timing. I wasn't planning to release 0.8 until March, i.e. after IR(H).

Can we fix #212 in 0.8 as well?

cmacdonald · 2022-01-11T11:24:29Z

Or get_results()? I don't think we use the word "run" in PyTerrier.

seanmacavaney · 2022-01-11T11:55:33Z

get_results seems a bit generic, but it'll probably work.

I don't think this urgently needs to hit PyPI. Sorry, I lost track of #212; I'll fix it as well.

re-ranking run

59ece69

cmacdonald added this to the 0.8 milestone Jan 11, 2022

seanmacavaney and others added 2 commits January 11, 2022 11:59

reranking_run -> results

c00f849

Merge branch 'master' into reranking_run

1312bbc

cmacdonald approved these changes Jan 11, 2022

cmacdonald merged commit 8a9a455 into terrier-org:master Jan 11, 2022
