Consider model ranking #38

jphall663 · 2023-05-31T13:39:07Z

It would be nice to have explicit model ranking for selection, i.e., something that answers the question of which is the "best" model among a group of trained models without human eyeballing (of course, human eyeballing is also great!). This would be in addition to Pareto-based selection, not a replacement for it.

Consider Caruana et al. 2004 "b" - https://dl.acm.org/doi/10.1145/1046456.1046470.

I have a prototype here: https://jphall663.github.io/GWU_rml/, code: https://nbviewer.org/github/jphall663/GWU_rml/blob/master/assignments/eval.ipynb.

In addition to what the prototype does, it would be really cool for users to be able to:

- select the number and type of assessments, e.g., 3 assessments: AUC, max. ACC, and AIR (gets at balancing real-world selection criteria)
- choose between random folds and user-selected segments (gets at weak spots and robustness)
- perturb folds or data segments (gets at robustness)

(The current prototype is fixed at 5 folds and five quality assessment statistics (no AIR, etc.), and does not perturb folds.)
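To make the request concrete, here is a minimal sketch of one way rank-based selection with user-chosen metrics could work: rank the models within every (metric, fold) cell, then average the ranks. This is an illustration under stated assumptions, not the prototype's code or any existing library API. The function names `rank_desc` and `rank_models`, the model names, and all scores are made up, and every metric is assumed to be "higher is better" (for AIR that only holds while values stay below 1).

```python
from statistics import mean


def rank_desc(values):
    """Return 1-based ranks (1 = largest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def rank_models(scores, metrics):
    """Average each model's rank over every (metric, fold) cell.

    scores:  {model_name: {metric_name: [value_per_fold, ...]}}
    metrics: metric names to include; all assumed "higher is better".
    Returns a list of (model_name, mean_rank) sorted best-first.
    """
    models = list(scores)
    n_folds = len(scores[models[0]][metrics[0]])
    collected = {m: [] for m in models}
    for metric in metrics:
        for fold in range(n_folds):
            values = [scores[m][metric][fold] for m in models]
            for m, r in zip(models, rank_desc(values)):
                collected[m].append(r)
    return sorted(((m, mean(r)) for m, r in collected.items()), key=lambda t: t[1])


# Made-up scores for three hypothetical models over three folds.
# "Folds" could equally be user-selected segments or perturbed copies of the data.
scores = {
    "glm": {"auc": [0.70, 0.72, 0.71], "acc": [0.80, 0.81, 0.79], "air": [0.95, 0.90, 0.92]},
    "gbm": {"auc": [0.78, 0.77, 0.79], "acc": [0.83, 0.84, 0.82], "air": [0.80, 0.82, 0.78]},
    "ebm": {"auc": [0.76, 0.78, 0.77], "acc": [0.82, 0.83, 0.83], "air": [0.88, 0.91, 0.90]},
}
leaderboard = rank_models(scores, metrics=["auc", "acc", "air"])
for name, mean_rank in leaderboard:
    print(f"{name}: mean rank {mean_rank:.2f}")
```

With these made-up numbers, "ebm" comes out on top: it is never the worst model on any metric, whereas "gbm" wins on AUC and ACC but is last on AIR in every fold. Dropping "air" from `metrics` would change the ordering, which is the kind of trade-off the selectable assessments are meant to expose.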

Let me know if you'd like to discuss.

ZebinYang · 2023-06-02T01:43:03Z

That makes sense. We may provide an enhanced leaderboard panel with model ranking and segmented metrics in a future release.
