Performance estimator CBPE calculates realized ROC AUC using calibrated probabilities (while it should use raw) #98

Closed
jakubnml opened this issue Aug 18, 2022 · 1 comment
Labels: bug (Something isn't working), triage (Needs to be assessed)

Comments

@jakubnml
Contributor

Describe the bug
The realized ROC AUC reported by the performance estimator (CBPE) differs from the ROC AUC computed by the performance calculator (PerformanceCalculator) on the same data.

To Reproduce
Steps to reproduce the behavior:

import pandas as pd
import nannyml as nml

# Load the synthetic dataset once: element 0 is the reference set, element 1 the analysis set.
datasets = nml.load_synthetic_binary_classification_dataset()
reference_df = datasets[0]
analysis_df = datasets[1]

estimator = nml.CBPE(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    timestamp_column_name='timestamp',
    metrics=['roc_auc'],
    chunk_size=5000)

estimator.fit(reference_df)

results_estimation = estimator.estimate(reference_df).data

calc = nml.PerformanceCalculator(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    timestamp_column_name='timestamp',
    metrics=["roc_auc"],
    chunk_size=5000)

calc.fit(reference_df)

results_monitoring = calc.calculate(reference_df).data

results_monitoring['roc_auc'] - results_estimation['realized_roc_auc']

returns:

0   -0.000224
1    0.000146
2   -0.000257
3   -0.000456
4   -0.000133
5   -0.000272
6   -0.000323
7   -0.000271
8    0.000341
9    0.000237
dtype: float64

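Both objects compute the metric on the same data with the same chunking, so the values should be identical and the difference should be zero for every chunk. The cause is that CBPE uses calibrated probabilities when calculating realized performance, which is a bug. ROC AUC depends only on the ranking of the scores, so a calibration step that maps distinct raw scores to the same calibrated value introduces ties and changes the result.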
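
To illustrate why scoring on calibrated probabilities can shift ROC AUC, here is a minimal sketch in plain Python. It is not NannyML code: `roc_auc` is a hand-written pairwise AUC, `step_calibrate` is a hypothetical stand-in for a calibrator that is monotonic but not strictly so, and the labels and scores are made-up toy values.

```python
def roc_auc(y_true, scores):
    """ROC AUC as P(pos score > neg score) + 0.5 * P(tie), over all pos/neg pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


def step_calibrate(score):
    """Hypothetical calibrator: collapses each half of [0, 1] onto one plateau."""
    return 0.2 if score < 0.5 else 0.8


# Toy data, assumed for illustration only.
y_true = [0, 0, 1, 0, 1, 1]
raw = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
calibrated = [step_calibrate(s) for s in raw]

auc_raw = roc_auc(y_true, raw)
auc_cal = roc_auc(y_true, calibrated)
print(auc_raw, auc_cal)  # the two values differ because calibration created ties
```

The plateau mapping here is far coarser than a real calibrator, so the gap is much larger than the differences reported above, but the mechanism is the same: once distinct raw scores share a calibrated value, the ranking information that ROC AUC relies on is partly lost.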

Expected behavior
CBPE should use raw probabilities to calculate realized performance.
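
A check along these lines could guard the expected behavior. This is only a sketch in plain Python: `mismatched_chunks` is a hypothetical helper, the tolerance is an assumption, and the input is the list of per-chunk differences reported in this issue.

```python
import math

# Per-chunk differences reported in this issue
# (PerformanceCalculator roc_auc minus CBPE realized_roc_auc).
reported_diffs = [-0.000224, 0.000146, -0.000257, -0.000456, -0.000133,
                  -0.000272, -0.000323, -0.000271, 0.000341, 0.000237]


def mismatched_chunks(diffs, abs_tol=1e-9):
    """Return indices of chunks whose difference is not numerically zero."""
    return [i for i, d in enumerate(diffs)
            if not math.isclose(d, 0.0, abs_tol=abs_tol)]


bad = mismatched_chunks(reported_diffs)
print(bad)  # every chunk is flagged for the differences reported above
```

Once realized performance is computed from raw probabilities, the same check applied to the new differences should return an empty list.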

@jakubnml added the bug and triage labels Aug 18, 2022
@nnansters
Contributor

Fixed the issue for both the binary and the multiclass classification cases.
I've pushed the fix to the main branch; an official release will follow soon.
