example in the documentation is bad practice #30

Open
ghost opened this issue Nov 22, 2020 · 2 comments
ghost commented Nov 22, 2020

The example in the documentation is bad practice, as the output of the linear model is practically constant (it underfits):
https://justcause.readthedocs.io/en/latest/

>>> from justcause.data.sets import load_ihdp
>>> from justcause.learners import SLearner
>>> from justcause.learners.propensity import estimate_propensities
>>> from justcause.metrics import pehe_score, mean_absolute
>>> from justcause.evaluation import calc_scores

>>> from sklearn.model_selection import train_test_split
>>> from sklearn.linear_model import LinearRegression

>>> import pandas as pd

>>> replications = load_ihdp(select_rep=[0, 1, 2])
>>> slearner = SLearner(LinearRegression())
>>> metrics = [pehe_score, mean_absolute]
>>> scores = []

>>> for rep in replications:
...     train, test = train_test_split(rep, train_size=0.8)
...     p = estimate_propensities(train.np.X, train.np.t)
...     slearner.fit(train.np.X, train.np.t, train.np.y, weights=1/p)
...     pred_ite = slearner.predict_ite(test.np.X, test.np.t, test.np.y)
...     scores.append(calc_scores(test.np.ite, pred_ite, metrics))

>>> pd.DataFrame(scores)
   pehe_score  mean_absolute
0    0.998388       0.149710
1    0.790441       0.119423
2    0.894113       0.151275

When one looks at pred_ite, its standard deviation is almost zero, so the predictive power of the model is practically nil. Thus, the example should include a relative evaluation against a dummy model (e.g. a constant predictor).

pred_ite.std()

1.130466570252318e-15
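To make the point concrete, here is a minimal, self-contained sketch (with made-up values, not the actual IHDP data) showing that a constant prediction already achieves a PEHE of the same magnitude as the scores in the table above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the true ITE and the near-constant
# predictions produced by the linear S-learner in the example above.
true_ite = rng.normal(loc=4.0, scale=1.0, size=500)
pred_ite = np.full(500, true_ite.mean())  # a constant "dummy" prediction

# PEHE is the root mean squared error between true and predicted ITEs.
pehe = np.sqrt(np.mean((true_ite - pred_ite) ** 2))

# For a constant predictor, PEHE equals the standard deviation of the
# true ITE -- so scores around 1 (as in the table above) are achievable
# with zero predictive power.
assert np.isclose(pehe, true_ite.std())
print(pred_ite.std(), pehe)
```

Any proposed docs example should therefore report how far below this dummy baseline the learner actually gets.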


ghost commented Nov 30, 2020

There is also a bug: the inverse-propensity weight should be t/p + (1-t)/(1-p) rather than just 1/p, so that treated units are weighted by 1/p and control units by 1/(1-p).
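The difference between the two weightings can be sketched as follows (a hypothetical illustration with made-up propensities, not the example's actual data):

```python
import numpy as np

# t is the binary treatment indicator, p the estimated propensity P(t=1 | x).
t = np.array([1, 0, 1, 0])
p = np.array([0.8, 0.8, 0.4, 0.4])

wrong = 1 / p                      # weights every unit by 1/p, even controls
right = t / p + (1 - t) / (1 - p)  # treated get 1/p, controls get 1/(1-p)

print(wrong)  # controls incorrectly share the treated units' weights
print(right)  # controls reweighted by 1/(1-p) instead
```

With the corrected weights, the weighted sample mimics a randomized one on both treatment arms, which is the point of inverse-propensity weighting.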

@MaximilianFranz
Contributor

Hey, thanks for reporting the issue!

Unfortunately, @FlorianWilhelm and I are currently no longer working on the package. However, if you have a solution for the above bug (by adjusting the weights), feel free to open a pull request with the changes to the docs.

Do you have a source for the new configuration of weights?

Best,
Max
