
Added Interaction Tree (IT), Causal Inference Tree (CIT), and Invariant DDP (IDDP) #562

Merged: 10 commits merged into uber:master on Jul 8, 2023

Conversation

jroessler
Contributor

Proposed changes

As discussed in #530, I added the Interaction Tree (IT) and the Causal Inference Tree (CIT). To be more specific, I added their splitting criteria in causalml's uplift tree implementation.

Moreover, I also added the Invariant DDP (IDDP) method, which will be published soon at the International Conference on Information Systems (December 2022). I was able to leverage causalml's infrastructure to build a new tree-based algorithm that combines recent findings from the uplift modeling and heterogeneous treatment effect literature (if you want to know more about the method, let me know; I can also share the manuscript). One piece of functionality I had to add was the honesty approach by Athey and Imbens (2016): before growing a tree, the training sample is split into an estimation sample S_est, used only for CATE score estimation in the leaves, and a training sample S_tr, used only for selecting tree splits. All tree-based algorithms can now be combined with the honesty approach by setting honesty=True (default: False). Further, you can set the size of the estimation sample S_est with estimation_sample_size (default: 0.5).
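The honesty split described above can be sketched roughly as follows (a standalone illustration with made-up data, not the actual causalml implementation):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))      # covariates
w = rng.integers(0, 2, size=1000)   # treatment indicator
y = rng.integers(0, 2, size=1000)   # binary outcome

# Honesty (Athey & Imbens, 2016): S_tr is used only for selecting tree
# splits, S_est only for estimating the CATE scores in the leaves.
estimation_sample_size = 0.5        # mirrors the new parameter's default
X_tr, X_est, w_tr, w_est, y_tr, y_est = train_test_split(
    X, w, y, test_size=estimation_sample_size, random_state=42
)
```

A tree grown on (X_tr, w_tr, y_tr) would then have its leaf-level CATE estimates recomputed on (X_est, w_est, y_est).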

For all three approaches, I added the corresponding documentation and docstrings. However, note that I could not update the Sphinx documentation for the causalml.html page; please let me know how to generate it!

I also parameterized the UpliftTreeClassifierTests such that we test all evaluation functions.

Types of changes

What types of changes does your code introduce to CausalML?

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

Literature:
IT: Su, Xiaogang, et al. "Subgroup analysis via recursive partitioning." Journal of Machine Learning Research 10.2 (2009).
CIT: Su, Xiaogang, et al. "Facilitating score and causal inference trees for large observational studies." Journal of Machine Learning Research 13 (2012): 2955.
IDDP: Rößler et al. "The Best of Two Worlds: Using Recent Advances from Uplift Modeling and Heterogeneous Treatment Effects to Optimize Targeting Policies." International Conference on Information Systems (ICIS) 2022.
Honesty: Athey, Susan, and Guido Imbens. "Recursive partitioning for heterogeneous causal effects." Proceedings of the National Academy of Sciences 113.27 (2016): 7353-7360.

@volico volico mentioned this pull request Nov 10, 2022
@jroessler
Contributor Author

@jeongyoonlee Is there anything I can do to speed up the PR?

@jeongyoonlee
Collaborator

Sorry for not getting back to you sooner, @jroessler. I added @t-tte and @zhenyuz0500 as reviewers and pinged them separately.

@jeongyoonlee
Collaborator

@t-tte, @zhenyuz0500 any updates on this?

@t-tte
Collaborator

Hi all, I've reviewed all the commits apart from "Added IDDP Implementation", and they look good to me. The unreviewed commit is somewhat more complex and will take more time; alternatively, if @zhenyuz0500 or @jeongyoonlee has the bandwidth to look into it, I'm happy for the PR to be merged.

@jroessler
Contributor Author

You can find the paper about IDDP here:
https://aisel.aisnet.org/icis2022/data_analytics/data_analytics/9/

Maybe it helps! Let me know if you have any questions.

self.max_depth = max_depth
self.min_samples_leaf = min_samples_leaf
self.min_samples_treatment = min_samples_treatment
self.n_reg = n_reg
self.max_features = max_features

assert evaluationFunction is not None and evaluationFunction in ['KL', 'ED', 'Chi', 'CTS', 'DDP', 'IT', 'CIT', 'IDDP'], \
Collaborator
The first condition, evaluationFunction is not None, is not necessary because if evaluationFunction is None, it won't meet the second condition anyway.
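The simplified check could then look like this (a sketch; the constant name is made up and not the variable used in uplift.pyx):

```python
VALID_EVALUATION_FUNCTIONS = ['KL', 'ED', 'Chi', 'CTS', 'DDP', 'IT', 'CIT', 'IDDP']

def check_evaluation_function(evaluationFunction):
    # `None in VALID_EVALUATION_FUNCTIONS` is already False, so the
    # explicit `is not None` check adds nothing.
    assert evaluationFunction in VALID_EVALUATION_FUNCTIONS, (
        "evaluationFunction must be one of %s, got %r"
        % (VALID_EVALUATION_FUNCTIONS, evaluationFunction)
    )
```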

if self.n_class > 2 and (self.evaluationFunction == self.evaluate_DDP or self.evaluationFunction == self.evaluate_IDDP or
                         self.evaluationFunction == self.evaluate_IT or self.evaluationFunction == self.evaluate_CIT):
    raise ValueError("The DDP approach can only cope with two class problems, that is two different treatment "
                     "options (e.g., control vs treatment). Please select another approach or only use a "
                     "dataset which employs two treatment options.")

Collaborator

Let's use:

if (self.n_class > 2) and (self.evaluationFunction in [self.evaluate_DDP, self.evaluate_IDDP, self.evaluate_IT, self.evaluate_CIT]):

if self.honesty:
try:
X, X_est, treatment_idx, treatment_idx_est, y, y_est = sklearn.model_selection.train_test_split(X,
Collaborator
Let's do:

from sklearn.model_selection import train_test_split
...
X, X_est, treatment_idx, treatment_idx_est, y, y_est = train_test_split(X,
...

if self.honesty:
self.honestApproach(X_est, treatment_idx_est, y_est)

with np.errstate(divide='ignore',invalid='ignore'):
Collaborator

Similarly, if there's an error, e.g., all feature importances being zero in this case, we'd like the code to fail with a proper error message.
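One way to fail with an explicit message instead of suppressing the floating-point warning (a hedged sketch; the function name and message are illustrative, not the merged code):

```python
import numpy as np

def normalize_importances(importances):
    """Normalize feature importances, failing loudly when they sum to zero."""
    importances = np.asarray(importances, dtype=float)
    total = importances.sum()
    if total == 0:
        # Instead of np.errstate(divide='ignore') silently producing NaNs,
        # raise a descriptive error the user can act on.
        raise ValueError(
            "All feature importances are zero; cannot normalize. "
            "Check that the tree found at least one valid split."
        )
    return importances / total
```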

@jeongyoonlee
Collaborator

I made a couple of comments. My main feedback is that, to check if a new implementation works properly, we need it to perform better than random. We can update the current test dataset by making the treatment effect bigger to make the test easier to pass.
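A possible shape for such an easier test dataset (an illustrative generator with made-up names, not the repository's actual test fixture):

```python
import numpy as np

def make_uplift_test_data(n=2000, effect=0.3, seed=0):
    """Randomized data with a large, easy-to-detect treatment effect."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 4))
    w = rng.integers(0, 2, size=n)   # randomized treatment assignment
    p = 0.3 + effect * w             # treated units convert `effect` more often
    y = rng.binomial(1, p)
    return X, w, y
```

With a 0.3 uplift, any learner performing better than random targeting should clear the test threshold comfortably.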

@jroessler
Contributor Author

@jeongyoonlee Any idea what went wrong during the test of the latest commit? Is that something I can fix on my side?

@jeongyoonlee
Collaborator

LGTM. Thanks for the contribution.

causalml/inference/tree/uplift.pyx (resolved)
@jeongyoonlee jeongyoonlee merged commit 60cc631 into uber:master Jul 8, 2023
jeongyoonlee pushed a commit that referenced this pull request Jul 8, 2023
…nt DDP (IDDP) (#562)

* Added Interaction Tree Implementation
* Added Conditional Interaction Tree Implementation
* Added IDDP Implementation
* Added documentation for IT, CIT, and IDDP
* Fixed alignment issue in methodology
* added performance checks and resolved remaining minor issues