
fix LTR groupby for xgboost & lightgbm #284

Merged
merged 5 commits into from
Nov 10, 2022

Conversation

cmacdonald
Contributor

No description provided.

@cmacdonald cmacdonald added this to the 0.8 milestone Mar 29, 2022
@cmacdonald cmacdonald mentioned this pull request Mar 29, 2022
@cmacdonald
Contributor Author

test failure now addressed

@cmacdonald
Copy link
Contributor Author

cmacdonald commented Mar 30, 2022

What Sean had identified was indeed a bug affecting the xgboost and lightgbm integration. In training or validation sets with differing numbers of results per query, the per-query counts were not correctly ordered, so documents could be associated with the wrong query by the learner. This PR fixes the issue by sorting the result set, and introduces a new unit test to cover it.

In most LTR scenarios there will be 1000 documents per query, but I'll examine what difference this makes in a realistic LTR setting.
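To illustrate the failure mode described above, here is a minimal sketch (not PyTerrier's actual code; the DataFrame and values are hypothetical). Rankers such as xgboost's `XGBRanker` and lightgbm take a list of group sizes that must follow the row order of the training matrix, so if the result set is not sorted by `qid` before counting, documents end up attributed to the wrong query:

```python
import pandas as pd

# Hypothetical result set where rows for a query are NOT contiguous.
res = pd.DataFrame({
    "qid":   ["q2", "q1", "q2", "q1", "q1"],
    "docno": ["d1", "d2", "d3", "d4", "d5"],
})

# The fix sketched here: sort so rows for each query are contiguous,
# then count per query in that same order. The resulting group-size
# array aligns row-for-row with the sorted training matrix.
res_sorted = res.sort_values("qid", kind="stable")
groups = res_sorted.groupby("qid", sort=False).size().to_numpy()
# groups -> [3, 2]: q1 has 3 documents, q2 has 2, matching row order.
```

Counting groups on the unsorted frame (or counting in an order different from the rows) would hand the learner group boundaries that slice through the wrong documents, which is exactly the mis-association the unit test checks for.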

@cmacdonald cmacdonald changed the title change LTR groupby, addresses #283 fix LTR groupby for xgboost & lightgbm Mar 30, 2022
@cmacdonald cmacdonald modified the milestones: 0.8, 0.9 Apr 11, 2022
@cmacdonald cmacdonald merged commit 950e172 into master Nov 10, 2022
@cmacdonald cmacdonald deleted the ltr_groupby branch November 10, 2022 18:07