
Add delta-delta-p (ddp) tree inference approach #327

Merged
merged 3 commits on Jun 8, 2021

Conversation

jroessler
Contributor

Proposed changes

As suggested in #300, I implemented the delta-delta-p (in short: DDP) tree-based inference method originally proposed in Hansotia and Rukstales, "Incremental value modeling" (2002). I simply added another evaluationFunction called 'DDP' in causalml/inference/tree/models.py.
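For readers unfamiliar with the criterion: DDP scores a node by the difference in response probability between the treatment group and the control group ("delta-p"), and a split is chosen by the difference of this score between the two children ("delta-delta-p"). A minimal sketch of the idea, assuming a node summary maps each group name to a (response probability, sample count) pair; the actual data structures and signatures in causalml/inference/tree/models.py may differ:

```python
def evaluate_ddp(node_summary, control_name="control"):
    """Delta-p of a node: P(Y=1 | treatment) - P(Y=1 | control).

    node_summary: dict mapping group name -> (response_prob, n_samples).
    Assumes exactly one treatment group besides the control group.
    """
    p_control, _ = node_summary[control_name]
    treatments = [g for g in node_summary if g != control_name]
    if len(treatments) != 1:
        raise ValueError("DDP supports exactly one treatment group plus control.")
    p_treatment, _ = node_summary[treatments[0]]
    return p_treatment - p_control


# A split is then scored by the absolute delta-delta-p between the children:
left = {"control": (0.10, 500), "treatment": (0.18, 500)}   # delta-p = 0.08
right = {"control": (0.12, 500), "treatment": (0.13, 500)}  # delta-p = 0.01
ddp_gain = abs(evaluate_ddp(left) - evaluate_ddp(right))    # 0.07
```

The names `evaluate_ddp` and the summary layout above are illustrative, not the PR's exact code.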

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

There was no need to add a separate test to prove that the new tree-based inference method works: you can simply run the test_uplift_trees.py test and pass evaluationFunction='DDP' when creating the UpliftRandomForestClassifier and UpliftTreeClassifier instances. I noticed that my test does not pass with N_SAMPLE = 1000, because random targeting performed better than the DDP approach. However, increasing N_SAMPLE to, for example, 10,000 worked.

If you need additional information, don't hesitate to ask me!

@CLAassistant

CLAassistant commented May 3, 2021

CLA assistant check
All committers have signed the CLA.

@paullo0106 paullo0106 requested a review from t-tte May 4, 2021 22:54
leftScore1 = evaluationFunction(leftNodeSummary, control_name=self.control_name)
rightScore2 = evaluationFunction(rightNodeSummary, control_name=self.control_name)
gain = np.abs(leftScore1 - rightScore2)
gain_for_imp = (len(X_l) * leftScore1 - len(X_r) * rightScore2)
Collaborator


@jroessler I'm not familiar with this method, but I just want to confirm: is it intentional that gain takes the absolute difference while gain_for_imp doesn't?

Contributor Author


@paullo0106 Thanks for the pointer! Actually, it's a mistake: it should be the absolute value for gain_for_imp as well. I'll fix it and resubmit.
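With the fix the author describes, both quantities take the absolute value. A sketch with illustrative numbers; leftScore1/rightScore2 stand in for the evaluationFunction outputs of the two children, and the lengths for the child sample sizes:

```python
import numpy as np

# Illustrative delta-p scores for the left/right children and their sizes.
leftScore1, rightScore2 = 0.08, 0.12
len_X_l, len_X_r = 400, 600

gain = np.abs(leftScore1 - rightScore2)
# Corrected per the review: the importance gain also takes the absolute value.
gain_for_imp = np.abs(len_X_l * leftScore1 - len_X_r * rightScore2)
```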

…P is used with more than two treatment options
@jroessler
Contributor Author

I added an exception that is raised when the DDP approach is used with more than two treatment options, as mentioned here:
#300 (comment)

Further, regarding the tests, I added a method called generate_classification_data_two_treatments, which creates a synthetic data set with only two treatment options. You can use this method within test_UpliftTreeClassifier and test_UpliftRandomForestClassifier to verify that the DDP approach works as expected. As a side note: I noticed that the results for all evaluation functions (not only for the DDP approach but also for KL, ED, and CHI) changed from one test run (RandomForestTest) to the next. Thus, it seems that something with random_state is not working properly.
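The guard described above can be as simple as counting the distinct treatment labels before training. A sketch of such a check, assuming treatment labels arrive as an array of group names; the function name and where the check lives in the PR are assumptions, not the actual code:

```python
import numpy as np

def check_ddp_treatments(treatment):
    """Raise if DDP is requested with more than two groups (one treatment + control)."""
    groups = np.unique(treatment)
    if len(groups) > 2:
        raise ValueError(
            "The DDP approach supports only a single treatment group plus control, "
            f"but got {len(groups)} groups: {list(groups)}"
        )

check_ddp_treatments(np.array(["control", "treatment1"] * 50))  # passes silently
try:
    check_ddp_treatments(np.array(["control", "treatment1", "treatment2"]))
except ValueError as e:
    print("rejected:", e)
```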

@jeongyoonlee
Collaborator

Thanks, @jroessler for the contribution. It looks good to me. @t-tte, could you check this PR?

Collaborator

@t-tte t-tte left a comment


Thanks for the contribution.

LGTM also.

@jeongyoonlee jeongyoonlee merged commit df3830d into uber:master Jun 8, 2021
5 participants