provide re-ranking runs from datasets #262

Merged: 3 commits, Jan 11, 2022

Conversation

seanmacavaney
Collaborator

It's becoming increasingly common for datasets to include a standard initial result set for re-ranking experiments. ir_datasets provides these, but they are a bit inconvenient to use in PyTerrier right now, because doing so involves using irds_ref, building an appropriate dataframe, and merging in the query/document text.

This PR exposes the functionality more conveniently through dataset.get_reranking_run(). (I'm open to alternative names here.)

I also considered including the document text, but it's not always going to be needed and it seems more suitable to use the existing pt.text.get_text() pipeline for this.
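For context, the manual steps described above (building an appropriate result set and merging in the query text) can be sketched roughly as follows. This is a minimal, library-free illustration and not the code in this PR: the `Query` and `ScoredDoc` tuples are hypothetical stand-ins for the records a dataset provides, the data is made up, `build_reranking_rows` is a name invented for this example, and the output keys (`qid`, `query`, `docno`, `score`, `rank`, with ranks starting at 0) are assumed to follow PyTerrier's usual column conventions.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for dataset records; toy data for illustration only.
Query = namedtuple("Query", ["query_id", "text"])
ScoredDoc = namedtuple("ScoredDoc", ["query_id", "doc_id", "score"])

queries = [Query("q1", "what is re-ranking"), Query("q2", "neural retrieval")]
scoreddocs = [
    ScoredDoc("q1", "d3", 11.2),
    ScoredDoc("q1", "d7", 14.5),
    ScoredDoc("q2", "d1", 9.8),
]


def build_reranking_rows(queries, scoreddocs):
    """Join an initial result set with query text and assign per-query ranks.

    Hypothetical helper: returns one dict per (query, document) pair, using
    column names assumed to match PyTerrier's result conventions.
    """
    # Look-up table so each scored document can be paired with its query text.
    query_text = {q.query_id: q.text for q in queries}

    # Group the scored documents by query, preserving first-seen query order.
    by_query = defaultdict(list)
    for sd in scoreddocs:
        by_query[sd.query_id].append(sd)

    rows = []
    for qid, docs in by_query.items():
        # Highest score first; ranks start at 0 (an assumed convention).
        ranked = sorted(docs, key=lambda sd: sd.score, reverse=True)
        for rank, sd in enumerate(ranked):
            rows.append({
                "qid": qid,
                "query": query_text[qid],
                "docno": sd.doc_id,
                "score": sd.score,
                "rank": rank,
            })
    return rows


rows = build_reranking_rows(queries, scoreddocs)
```

The resulting list of dicts could be passed to `pandas.DataFrame(rows)` to obtain a dataframe of the shape a re-ranking pipeline would consume; document text is deliberately left out, in line with the comment above about fetching it separately.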

@seanmacavaney
Collaborator Author

If this looks good, I'll add documentation for it too.

@cmacdonald cmacdonald added this to the 0.8 milestone Jan 11, 2022
@cmacdonald
Contributor

Thanks. A shorter name would be appreciated. Why not just get_run()?

We need to discuss timing. I wasn't planning to release 0.8 until March, i.e. after IR(H).

Can we fix #212 in 0.8 as well?

@cmacdonald
Contributor

Or get_results()? I don't think we use the word "run" in PyTerrier.

@seanmacavaney
Collaborator Author

get_results seems a bit generic, but it'll probably work.

I don't think this urgently needs to hit PyPI. Sorry I lost track of #212; I'll fix it as well.

@cmacdonald cmacdonald merged commit 8a9a455 into terrier-org:master Jan 11, 2022