read_results merge topics #265

Merged: 2 commits, Jan 14, 2022

Conversation

seanmacavaney
Collaborator

When reading results files for further processing (e.g., using a saved results file as the input to re-ranking), you usually need an additional column that is not saved in the file: query. Users can add it themselves by loading the dataset's topics and performing a pd.merge() on the results and topics dataframes, but this is inconvenient.
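For reference, the manual workaround described above might look like the following sketch. The column names (`qid`, `docno`, `rank`, `score`, `query`), the toy data, and the choice of a left join on `qid` are assumptions made for illustration; in practice the two dataframes would come from a saved results file and the dataset's topics.

```python
import pandas as pd

# Toy results, standing in for what a saved results file would contain.
results = pd.DataFrame({
    "qid": ["1", "1", "2"],
    "docno": ["d3", "d7", "d2"],
    "rank": [0, 1, 0],
    "score": [12.5, 9.1, 7.4],
})

# Toy topics, standing in for the dataset's topics (assumed columns: qid, query).
topics = pd.DataFrame({
    "qid": ["1", "2"],
    "query": ["chemical reactions", "solar panels"],
})

# A left join keeps every result row and attaches the matching query text.
merged = pd.merge(results, topics, on="qid", how="left")
print(merged)
```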

This PR provides two options that make it easier to load a results dataframe that includes the topics. The user can either:

  • Pass a dataset into read_results, either as a string identifier or as a Dataset object. The topics for this dataset will be loaded and merged with the results.
  • Pass topics into read_results. These topics will be merged with the results. This is useful if the user already has the topics on hand, or if they do not come directly from a dataset.

The behaviour would be ambiguous if both options were given, so the two options are mutually exclusive. An assertion error is raised if the user provides both.
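The behaviour described above can be illustrated with a small standalone sketch. This is not the code from the PR: `attach_queries` and `ToyDataset` are hypothetical names, the dataset is assumed to expose a `get_topics()` method returning a dataframe with `qid` and `query` columns, and resolving a string identifier to a dataset is left out.

```python
import pandas as pd


class ToyDataset:
    """Hypothetical stand-in for a dataset object that exposes get_topics()."""

    def __init__(self, topics):
        self._topics = topics

    def get_topics(self):
        return self._topics


def attach_queries(results, topics=None, dataset=None):
    """Merge query text into results from either topics or a dataset, never both."""
    # The two options are exclusive; supplying both would be ambiguous.
    assert topics is None or dataset is None, "pass either topics or dataset, not both"
    if dataset is not None:
        topics = dataset.get_topics()
    if topics is not None:
        results = pd.merge(results, topics, on="qid", how="left")
    return results


results = pd.DataFrame({"qid": ["1", "2"], "docno": ["d3", "d2"], "score": [12.5, 7.4]})
topics = pd.DataFrame({"qid": ["1", "2"], "query": ["chemical reactions", "solar panels"]})

from_topics = attach_queries(results, topics=topics)
from_dataset = attach_queries(results, dataset=ToyDataset(topics))
```

With neither argument given, the sketch returns the results unchanged, which mirrors loading a results file without any topics.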

@cmacdonald
Contributor

I think we need to add some examples of the various uses of read_results(), with at least one using the dataset and topics options?
We might also want to add another example using pt.Experiment()?

@cmacdonald cmacdonald added this to the 0.8 milestone Jan 12, 2022
@cmacdonald cmacdonald merged commit 0c7a875 into terrier-org:master Jan 14, 2022