
Consider model ranking #38

Open
jphall663 opened this issue May 31, 2023 · 1 comment

Comments

@jphall663

It would be nice to have explicit model ranking for selection, i.e., something that answers the question of which model is the "best" among a group of trained models without human eyeballing (of course, human eyeballing is also great!). This would be in addition to Pareto-based selection, not a replacement for it.

Consider Caruana et al. 2004 "b" - https://dl.acm.org/doi/10.1145/1046456.1046470.

I have a prototype here: https://jphall663.github.io/GWU_rml/, code: https://nbviewer.org/github/jphall663/GWU_rml/blob/master/assignments/eval.ipynb.

In addition to what the prototype does, it would be really cool for users to be able to:

  • select the number and type of assessments, e.g., 3 assessments: AUC, max. ACC, and AIR (gets at balancing real-world selection criteria)
  • choose between random folds and user-selected segments (gets at weak spots and robustness)
  • perturb folds or data segments (gets at robustness)

(The current prototype is fixed at five folds and five quality assessment statistics (no AIR, etc.), and does not perturb folds.)
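To make the request concrete, here is a minimal sketch of one way such a ranking could work: rank the models within each (fold or segment, metric) cell, then average the ranks. This is plain mean-rank aggregation, and it is not necessarily what Caruana et al. or the linked prototype do. The function name `rank_models`, the input layout, the model names, and all score values are hypothetical and made up for illustration. Treating a higher AIR as better is also a simplifying assumption.

```python
from collections import defaultdict

def rank_models(scores, higher_is_better=None):
    """Average each model's rank over every (fold, metric) cell.

    scores: {model: {fold_or_segment: {metric: value}}}
    higher_is_better: {metric: bool}; metrics not listed default to True.
    Returns [(model, mean_rank), ...] sorted best first (rank 1 is best).
    """
    higher_is_better = higher_is_better or {}
    models = list(scores)
    folds = list(scores[models[0]])
    rank_sums = defaultdict(float)
    n_cells = 0
    for fold in folds:
        for metric in scores[models[0]][fold]:
            best_first = higher_is_better.get(metric, True)
            ordered = sorted(models,
                             key=lambda m: scores[m][fold][metric],
                             reverse=best_first)
            # Walk through groups of tied models and give each the average rank.
            i = 0
            while i < len(ordered):
                j = i
                value = scores[ordered[i]][fold][metric]
                while (j + 1 < len(ordered)
                       and scores[ordered[j + 1]][fold][metric] == value):
                    j += 1
                avg_rank = (i + j) / 2 + 1
                for m in ordered[i:j + 1]:
                    rank_sums[m] += avg_rank
                i = j + 1
            n_cells += 1
    return sorted(((m, rank_sums[m] / n_cells) for m in models),
                  key=lambda pair: pair[1])

# Made-up scores for three hypothetical models on two folds (or segments).
scores = {
    "glm": {"fold_0": {"auc": 0.78, "max_acc": 0.80, "air": 0.85},
            "fold_1": {"auc": 0.77, "max_acc": 0.79, "air": 0.88}},
    "gbm": {"fold_0": {"auc": 0.82, "max_acc": 0.83, "air": 0.79},
            "fold_1": {"auc": 0.81, "max_acc": 0.82, "air": 0.80}},
    "ebm": {"fold_0": {"auc": 0.80, "max_acc": 0.83, "air": 0.83},
            "fold_1": {"auc": 0.80, "max_acc": 0.81, "air": 0.84}},
}

ranking = rank_models(scores)
for model, mean_rank in ranking:
    print(f"{model}: mean rank {mean_rank:.2f}")
```

Because the keys of the inner dictionaries are arbitrary labels, the same function covers random folds, user-selected segments, or perturbed copies of either, and the set of metrics is whatever the user supplies.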

Let me know if you'd like to discuss.

@ZebinYang
Collaborator

That makes sense. We may provide an enhanced leaderboard panel that adopts the model ranking and segmented metrics in future releases.
