How to use sum contrasts for predictions with glmnetUtils #34

Open
FelixNoessler opened this issue Jan 15, 2024 · 0 comments

Hello,

thanks for creating the R package!

I am having trouble fitting a model with sum contrasts. The coefficients look fine (similar to those of a non-regularized model), but the predictions are far off. Here is a reproducible example (sorry, it is not super short):

library(dplyr)
library(ggplot2)
library(tidyr)
library(glmnetUtils)

df <- palmerpenguins::penguins

df <- df %>%
  select(species, body_mass_g) %>%
  drop_na()

sum_contr_species <- contr.sum(3)
colnames(sum_contr_species) <- levels(df$species)[1:2] 

treatment_contr_species <- contr.treatment(3)
colnames(treatment_contr_species) <- levels(df$species)[2:3] 

## fit linear models for comparison with sum and treatment contrasts
lm_sum_fit <- lm(body_mass_g ~ species, data = df,
                 contrasts = list(species = sum_contr_species))
lm_sum_fit

lm_treatment_fit <- lm(body_mass_g ~ species, data = df,
                       contrasts = list(species = treatment_contr_species))
lm_treatment_fit

## fit elastic net models: default, treatment, and sum contrasts
alpha_vals <- c(0, 0.1, 0.5, 0.7, 0.8, 0.9, 0.95, 0.99, 1)
elnet_default <- cva.glmnet(
  body_mass_g ~ species, 
  alpha = alpha_vals,
  data = df, 
  nfolds = 15,
  use.model.frame = FALSE)

selected_alpha <- 0.9
coef(elnet_default, s = "lambda.1se", alpha = selected_alpha)

elnet_treatment <- cva.glmnet(
  body_mass_g ~ species, 
  alpha = alpha_vals,
  data = df, 
  nfolds = 15,
  use.model.frame = TRUE)
coef(elnet_treatment, s = "lambda.1se", alpha = selected_alpha)

contrasts(df$species) <- sum_contr_species 
elnet_sum <- cva.glmnet(
  body_mass_g ~ species, 
  alpha = alpha_vals,
  data = df, 
  nfolds = 15,
  use.model.frame = TRUE)
coef(elnet_sum, s = "lambda.1se", alpha = selected_alpha)

## make predictions for all models
df_pred <- tibble(species = unique(df$species)) %>%
  mutate(
    lin_sum = predict(lm_sum_fit, newdata = .),
    lin_treatment = predict(lm_treatment_fit, newdata = .),
    elnet_default = predict(elnet_default, newdata = ., 
                            s = "lambda.1se", alpha = selected_alpha)[,1],
    elnet_sum = predict(elnet_sum, newdata = ., 
                        s = "lambda.1se", alpha = selected_alpha)[,1],
    elnet_treatment = predict(elnet_treatment, newdata = .,
                              s = "lambda.1se", alpha = selected_alpha)[,1],
  )

ggplot() +
  geom_point(aes(species, value, color = name),
             data = pivot_longer(df_pred, -species),
             size = 5) +
  geom_dotplot(aes(species, body_mass_g), data = df,
               binaxis = "y", dotsize = 0.5, 
               stackdir = "center", alpha = 0.7,
               fill = NA)
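
One thing that may be worth checking, sketched here in base R with made-up toy data rather than the penguins data: `unique()` rebuilds a factor, so a contrasts attribute set with `contrasts<-` does not carry over to the new-data factor. If the design matrix for prediction is then built with the default treatment coding while the coefficients come from a sum-coded fit, the predictions no longer match the group means. Whether glmnetUtils builds its prediction matrix this way is an assumption and is not confirmed here; the names `toy`, `new_species`, and `sum_species` are hypothetical.

```r
# Toy data standing in for the penguins example (values are made up)
toy <- data.frame(
  species = factor(rep(c("A", "B", "C"), each = 4)),
  y = c(10, 11, 9, 10, 20, 21, 19, 20, 30, 31, 29, 30)
)
contrasts(toy$species) <- contr.sum(3)

# Plain linear model fitted with sum contrasts
fit <- lm(y ~ species, data = toy)

# unique() returns a fresh factor without the contrasts attribute
new_species <- unique(toy$species)
has_attr_before <- !is.null(attr(toy$species, "contrasts"))
has_attr_after  <- !is.null(attr(new_species, "contrasts"))

# Design matrix for the new data under the default (treatment) coding
X_default <- model.matrix(~ species, data.frame(species = new_species))

# Design matrix after restoring the sum coding on the new-data factor
sum_species <- new_species
contrasts(sum_species) <- contr.sum(3)
X_sum <- model.matrix(~ species, data.frame(species = sum_species))

# Apply the sum-coded coefficients to each design matrix
pred_mismatch <- drop(X_default %*% coef(fit))
pred_matched  <- drop(X_sum %*% coef(fit))
group_means   <- tapply(toy$y, toy$species, mean)

print(c(before = has_attr_before, after = has_attr_after))
print(rbind(group_means, pred_matched, pred_mismatch))
```

In this toy case the matched design matrix reproduces the group means (10, 20, 30), while the mismatched one gives 20, 10, 20. Setting the same contrasts on the factor in `newdata` before calling `predict()` is one way to test whether this explains the discrepancy.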

[Image: plot produced by the code above]

FelixNoessler changed the title from "Hot to use sum contrasts for predictions with glmnetUtils" to "How to use sum contrasts for predictions with glmnetUtils" on Jan 15, 2024