update docs
olivierlabayle committed Aug 10, 2023
1 parent f666975 commit 19e27ee
Showing 8 changed files with 48 additions and 27 deletions.
3 changes: 1 addition & 2 deletions docs/src/index.md
@@ -59,7 +59,7 @@ nothing # hide
Estimating the Average Treatment Effect of ``T`` on ``Y`` can be as simple as:

```@example quick-start
Ψ = ATE(:Y, (T=(case=true, control = false),), :W)
Ψ = ATE(outcome=:Y, treatment=(T=(case=true, control = false),), confounders=:W)
result, _ = tmle!(Ψ, dataset)
result
```
@@ -80,7 +80,6 @@ and second, define the Average Treatment Effect of the treatment ``T`` on the ou
outcome = :Y,
treatment = (T=(case=true, control = false),),
)
nothing # hide
```

Note that in this example the ATE can be computed exactly and is given by:
4 changes: 1 addition & 3 deletions docs/src/user_guide/adjustment.md
@@ -9,8 +9,6 @@ In a `SCM`, each variable is determined by a set of parents and a statistical mo

At the moment we provide a single adjustment method, namely the Backdoor adjustment method. The adjustment set consists of all the treatment variable's parents. Additional covariates used to fit the outcome model can be provided via `outcome_extra`.

```@example
using TMLE # hide
```julia
BackdoorAdjustment(;outcome_extra=[:C])
nothing
```
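
As a rough illustration of the rule stated above (not TMLE.jl's actual implementation), the backdoor adjustment set can be read off a map from each variable to its parents. The `parents` dictionary, the variable names and both helper functions below are hypothetical; only the `outcome_extra` keyword name comes from the documentation.

```julia
# Hypothetical parent map: each variable points to the variables that determine it.
parents = Dict(
    :Y => [:T, :W₁, :W₂, :C],
    :T => [:W₁, :W₂],
)

# Backdoor adjustment set as described above: all parents of the treatment variables.
function backdoor_adjustment_set(parents, treatments)
    adjustment = Symbol[]
    for treatment in treatments
        append!(adjustment, parents[treatment])
    end
    return unique!(adjustment)
end

# Assumed reading of `outcome_extra`: extra covariates are added to the outcome
# model inputs only; they are not part of the adjustment set itself.
function outcome_model_inputs(parents, treatments; outcome_extra=Symbol[])
    adjustment = backdoor_adjustment_set(parents, treatments)
    return unique!(vcat(treatments, adjustment, outcome_extra))
end

adjustment = backdoor_adjustment_set(parents, [:T])
inputs = outcome_model_inputs(parents, [:T]; outcome_extra=[:C])
```

Here `adjustment` holds the two confounders, while `inputs` additionally contains the treatment and the extra covariate `:C`.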
5 changes: 5 additions & 0 deletions docs/src/user_guide/estimation.md
@@ -14,6 +14,7 @@ using StableRNGs
using CategoricalArrays
using TMLE
using LogExpFunctions
using MLJLinearModels
function make_dataset(;n=1000)
rng = StableRNG(123)
@@ -59,6 +60,7 @@ result₁, fluctuation_mach = tmle!(Ψ₁, dataset;
threshold=1e-8,
weighted_fluctuation=false
)
nothing # hide
```

We see that both models corresponding to variables `Y` and `T₁` were fitted in the process but that the model for `T₂` was not because it was not necessary to estimate this estimand.
@@ -92,6 +94,7 @@ result₂, fluctuation_mach = tmle!(Ψ₂, dataset;
threshold=1e-8,
weighted_fluctuation=false
)
nothing # hide
```

The model for `T₂` was fitted in the process but so was the model for `Y` 🤔. This is because the `BackdoorAdjustment` method determined that the set of inputs for `Y` were different in both cases.
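
The refit-or-reuse behaviour described above can be pictured with a small bookkeeping sketch. It is only an assumption about the general idea, not the library's code: the `last_inputs` dictionary, the `fit_or_reuse!` helper and the input names are all hypothetical.

```julia
# Hypothetical bookkeeping: for each target variable, remember the set of
# inputs its model was last fitted on.
last_inputs = Dict{Symbol, Set{Symbol}}()

# Refit when the requested inputs differ from the remembered ones, otherwise reuse.
function fit_or_reuse!(last_inputs, target::Symbol, inputs)
    wanted = Set(inputs)
    if get(last_inputs, target, nothing) == wanted
        return :reused
    end
    last_inputs[target] = wanted   # stand-in for actually fitting a model
    return :fitted
end

first_fit = fit_or_reuse!(last_inputs, :Y, [:T₁, :W])    # nothing fitted yet
second_fit = fit_or_reuse!(last_inputs, :Y, [:T₂, :W])   # input set changed
third_fit = fit_or_reuse!(last_inputs, :Y, [:W, :T₂])    # same set, different order
```

Comparing sets rather than vectors means that the order in which inputs are listed does not trigger a refit.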
@@ -109,6 +112,7 @@ result₃, fluctuation_mach = tmle!(Ψ₃, dataset;
threshold=1e-8,
weighted_fluctuation=false
)
nothing # hide
```

This time only the statistical model for `Y` is fitted again while reusing the models for `T₁` and `T₂`. Finally, let's see what happens if we estimate the `IATE` between `T₁` and `T₂`.
@@ -122,6 +126,7 @@ result₄, fluctuation_mach = tmle!(Ψ₄, dataset;
threshold=1e-8,
weighted_fluctuation=false
)
nothing # hide
```

All statistical models have been reused 😊!
3 changes: 1 addition & 2 deletions docs/src/user_guide/misc.md
@@ -9,8 +9,7 @@ To account for the fact that treatment variables are categorical variables we pr

Such a transformer can be created with:

```@example
using TMLE # hide
```julia
TreatmentTransformer(;encoder=encoder())
```

20 changes: 9 additions & 11 deletions docs/src/user_guide/scm.md
@@ -10,20 +10,21 @@ In TMLE.jl, everything starts from the definition of a Structural Causal Model (

All models are wrong? Well maybe not the following:

```@example scm-incremental
```@example scm
using TMLE # hide
scm = SCM()
```

This model does not say anything about the random variables and is thus not really useful. Let's assume that we are interested in an outcome ``Y`` and that this outcome is determined by 8 other random variables. We can add this assumption to the model:

```@example scm-incremental
```@example scm
push!(scm, SE(:Y, [:T₁, :T₂, :W₁₁, :W₁₂, :W₂₁, :W₂₂, :W, :C]))
```

At this point, we haven't made any assumption regarding the functional form of the relationship between ``Y`` and its parents. We can add a further assumption by setting a statistical model for ``Y``. Suppose we know it is generated from a logistic model; we can make that explicit:

```@example scm-incremental
```@example scm
using MLJLinearModels
setmodel!(scm.Y, LogisticClassifier())
```

@@ -32,14 +33,14 @@ setmodel!(scm.Y, LogisticClassifier())

- TMLE.jl is based on the main machine-learning framework in Julia: [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/). As such, any model respecting the MLJ interface is a valid model in TMLE.jl.
- In real-world scenarios, we usually don't know the true statistical model for each variable and want to keep it as large as possible. For this reason it is recommended to use Super-Learning, which is implemented in MLJ by the [Stack](https://alan-turing-institute.github.io/MLJ.jl/dev/model_stacking/#Model-Stacking) and comes with theoretical properties.
- In the dataset, treatment variables are represented with [categorical data](https://alan-turing-institute.github.io/MLJ.jl/dev/working_with_categorical_data/). This means the models that depend on such variables will need to properly deal with them. For this purpose we provide a [TreatmentTransformer](@ref) which can easily be combined with any `model` in a [Pipelining](https://alan-turing-institute.github.io/MLJ.jl/dev/linear_pipelines/) flavour with `with_encoder(model)`.
- In the dataset, treatment variables are represented with [categorical data](https://alan-turing-institute.github.io/MLJ.jl/dev/working_with_categorical_data/). This means the models that depend on such variables will need to properly deal with them. For this purpose we provide a `TreatmentTransformer` which can easily be combined with any `model` in a [Pipelining](https://alan-turing-institute.github.io/MLJ.jl/dev/linear_pipelines/) flavour with `with_encoder(model)`.
- The `SCM` has no knowledge of the data and thus cannot verify that the assumed statistical model is compatible with the data. This is done at a later stage.

---

Let's now assume that we have a more complete knowledge of the problem and we also know how `T₁` and `T₂` depend on the rest of the variables in the system.

```@example scm-incremental
```@example scm
push!(scm, SE(:T₁, [:W₁₁, :W₁₂, :W], model=LogisticClassifier()))
push!(scm, SE(:T₂, [:W₂₁, :W₂₂, :W]))
```
@@ -48,8 +49,7 @@ push!(scm, SE(:T₂, [:W₂₁, :W₂₂, :W]))

Instead of constructing the `SCM` incrementally, one can provide all the specified equations at once:

```@example scm-one-step
using TMLE # hide
```@example scm
scm = SCM(
SE(:Y, [:T₁, :T₂, :W₁₁, :W₁₂, :W₂₁, :W₂₂, :W, :C], with_encoder(LinearRegressor())),
SE(:T₁, [:W₁₁, :W₁₂, :W], model=LogisticClassifier()),
@@ -63,8 +63,7 @@ Noting that we have used the `with_encoder` function to reflect the fact that we

There are many cases where we are interested in estimating the causal effect of a single treatment variable on a single outcome. Because it is typically only necessary to adjust for backdoor variables in order to identify this causal effect, we provide the `StaticConfoundedModel` interface to build such `SCM`:

```@example static-scm-1
using TMLE # hide
```@example scm
scm = StaticConfoundedModel(
:Y, :T, [:W₁, :W₂];
covariates=[:C],
@@ -77,8 +76,7 @@ The optional `covariates` are variables that influence the outcome but are not c

This model can be extended to a plate-model with multiple treatments and multiple outcomes. In this case the set of confounders is assumed to confound all treatments which are in turn assumed to impact all outcomes. This can be defined as:

```@example static-scm-2
using TMLE # hide
```@example scm
scm = StaticConfoundedModel(
[:Y₁, :Y₂], [:T₁, :T₂], [:W₁, :W₂];
covariates=[:C],
29 changes: 22 additions & 7 deletions docs/src/walk_through.md
@@ -23,6 +23,7 @@ using StableRNGs
using CategoricalArrays
using TMLE
using LogExpFunctions
using MLJLinearModels
function make_dataset(;n=1000)
rng = StableRNG(123)
@@ -90,7 +91,7 @@ AVAILABLE_ESTIMANDS

At the moment there are 3 main estimand types we can estimate in TMLE.jl. We provide a few examples below.

- The Interventional Conditional Mean (see: TODO):
- The Interventional Conditional Mean:

```@example walk-through
cm = CM(
@@ -100,10 +101,10 @@ cm = CM(
)
```

- The Average Treatment Effect (see: TODO):
- The Average Treatment Effect:

```@example walk-through
ate = ATE(
total_ate = ATE(
scm,
outcome=:Y,
treatment=(T₁=(case=1, control=0), T₂=(case=1, control=0))
@@ -115,7 +116,7 @@ marginal_ate_t1 = ATE(
)
```

- The Interaction Average Treatment Effect (see: TODO):
- The Interaction Average Treatment Effect:

```@example walk-through
iate = IATE(
@@ -127,7 +128,7 @@ iate = IATE(

## Targeted Estimation

Then each parameter can be estimated by calling the `tmle` function. For example:
Then each parameter can be estimated by calling the `tmle!` function. For example:

```@example walk-through
result, _ = tmle!(cm, dataset)
@@ -136,10 +137,24 @@ result

The `result` contains 3 main elements:

- The `TMLEEstimate` that can be accessed via: `tmle(result)`.
- The `OSEstimate` that can be accessed via: `ose(result)`.
- The `TMLEEstimate` that can be accessed via:

```@example walk-through
tmle(result)
```

- The `OSEstimate` that can be accessed via:

```@example walk-through
ose(result)
```

- The naive initial estimate.

```@example walk-through
naive(result)
```

The adjustment set is determined by the provided `adjustment_method` keyword. At the moment, only `BackdoorAdjustment` is available. However, one can specify extra covariates to be used when fitting the outcome model.

```@example walk-through
2 changes: 1 addition & 1 deletion src/TMLE.jl
@@ -31,7 +31,7 @@ export AverageTreatmentEffect, ATE
export InteractionAverageTreatmentEffect, IATE
export AVAILABLE_ESTIMANDS
export fit!, optimize_ordering, optimize_ordering!
export tmle!, tmle, ose
export tmle!, tmle, ose, naive
export var, estimate, initial_estimate, OneSampleTTest, OneSampleZTest, pvalue, confint
export compose
export TreatmentTransformer, with_encoder
9 changes: 8 additions & 1 deletion src/estimate.jl
@@ -23,7 +23,13 @@ struct TMLEResult{P <: Estimand, T<:AbstractFloat}
initial::T
end

function Base.show(io::IO, r::TMLEResult)
function Base.show(io::IO, ::MIME"text/plain", est::AsymptoticallyLinearEstimate)
testresult = OneSampleTTest(est)
data = [estimate(est) confint(testresult) pvalue(testresult);]
pretty_table(io, data;header=["Estimate", "95% Confidence Interval", "P-value"])
end

function Base.show(io::IO, ::MIME"text/plain", r::TMLEResult)
tmletest = OneSampleTTest(r.tmle)
onesteptest = OneSampleTTest(r.onestep)
data = [
@@ -36,6 +42,7 @@ end

tmle(result::TMLEResult) = result.tmle
ose(result::TMLEResult) = result.onestep
naive(result::TMLEResult) = result.initial

"""
Distributions.estimate(r::AsymptoticallyLinearEstimate)
Expand Down
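
The change in `src/estimate.jl` moves the table rendering from the two-argument `Base.show` to the three-argument `MIME"text/plain"` method. A minimal, library-independent sketch of how the two methods differ is given below; the `Estimate` type and its fields are hypothetical and stand in for the package's own estimate types.

```julia
# Hypothetical type standing in for an estimate object.
struct Estimate
    value::Float64
    se::Float64
end

# Two-argument method: compact form, used by `print` and inside containers.
Base.show(io::IO, e::Estimate) = print(io, "Estimate(", e.value, ")")

# Three-argument method: multi-line form, used when the object is displayed
# on its own, for example at the REPL.
function Base.show(io::IO, ::MIME"text/plain", e::Estimate)
    println(io, "Estimate: ", e.value)
    print(io, "Std. error: ", e.se)
end

e = Estimate(0.5, 0.1)
compact = sprint(show, e)                    # calls the two-argument method
rich = sprint(show, MIME("text/plain"), e)   # calls the three-argument method
```

Defining only the three-argument method, as the diff does, leaves the compact representation at Julia's default while making standalone display render the table.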
