-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eval metrics #587
Eval metrics #587
Conversation
lilacheden
commented
Feb 20, 2024
- Standard api for disabling/enabling confidence interval computation of metric/metricpipeline
- make kendalltau and roc_auc metrics and not metric pipelines (to temporarily overcome the multiple metric pipeline issue)
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #587 +/- ##
==========================================
+ Coverage 87.78% 87.97% +0.18%
==========================================
Files 85 85
Lines 7170 7208 +38
==========================================
+ Hits 6294 6341 +47
+ Misses 876 867 -9 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! left some small comments ..
src/unitxt/metrics.py
Outdated
@@ -80,6 +80,14 @@ def new_random_generator(): | |||
def disable_confidence_interval_calculation(self): | |||
self.n_resamples = None | |||
|
|||
def disable_confidence_interval_calculation_return_n_resamples(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just return self.n_resamples
from the disable_confidence_interval_calculation
method?
that shouldn't affect the existing code at all, no?
src/unitxt/metrics.py
Outdated
kendall_results = self.kendalltau(references, predictions, variant=self.variant) | ||
corr = kendall_results.correlation | ||
return { | ||
self.main_score: corr, | ||
"p_val": kendall_results.pvalue, | ||
"kendalltau_p_val": kendall_results.pvalue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"kendalltau_p_val": kendall_results.pvalue, | |
f"{self.main_score}_p_val": kendall_results.pvalue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current main score is with a "_b" suffix, so perhaps they should be consistent?
main_score = "kendalltau_b"
metric.metric, MetricWithConfidenceInterval | ||
): | ||
metric.metric.disable_confidence_interval_calculation() | ||
metric.disable_confidence_interval_calculation() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also need an abstract disable_confidence_interval_calculation()
in Metric
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently IIUC only the "overrides" were defined ..
c0c1cf2
to
afc8030
Compare
Signed-off-by: lilacheden <lilach.edel@gmail.com>
…es issue) Signed-off-by: lilacheden <lilach.edel@gmail.com>
Signed-off-by: lilacheden <lilach.edel@gmail.com>
Signed-off-by: lilacheden <lilach.edel@gmail.com>
…fix set_n_resamples Signed-off-by: lilacheden <lilach.edel@gmail.com>
Signed-off-by: lilacheden <lilach.edel@gmail.com>
Signed-off-by: lilacheden <lilach.edel@gmail.com>
d57af26
to
6fba78a
Compare