Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename time column internally to protect against extra_time + as.factor conflict #320

Closed
Lewis-Barnett-NOAA opened this issue Mar 13, 2024 · 2 comments

Comments

@Lewis-Barnett-NOAA
Copy link
Collaborator

I've noticed that models with extra_time specified are causing estimation problems in Windows.

For example:
mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 20)
fit <- sdmTMB(
density ~ 0 + as.factor(year),
data = pcod,
mesh = mesh,
time = "year",
family = tweedie(link = "log"),
spatiotemporal = "ar1",
extra_time = c(2006, 2008, 2010, 2012, 2014, 2016)
)

produces the error:
Error in solve.default(h, g) :
Lapack routine dgesv: system is exactly singular: U[4,4] = 0

sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] sdmTMB_0.4.3

loaded via a namespace (and not attached):
[1] sandwich_3.0-2 utf8_1.2.4 generics_0.1.3 class_7.3-22 fmesher_0.1.5 KernSmooth_2.23-21
[7] lattice_0.21-8 lme4_1.1-35.1 magrittr_2.0.3 grid_4.3.0 estimability_1.4.1 mvtnorm_1.2-4
[13] Matrix_1.6-5 e1071_1.7-14 DBI_1.2.2 survival_3.5-5 multcomp_1.4-23 mgcv_1.8-42
[19] fansi_1.0.6 TH.data_1.1-2 codetools_0.2-19 cli_3.6.1 rlang_1.1.3 units_0.8-5
[25] splines_4.3.0 reprex_2.1.0 withr_3.0.0 tools_4.3.0 nloptr_2.0.3 coda_0.19-4
[31] minqa_1.2.6 dplyr_1.1.4 boot_1.3-28.1 assertthat_0.2.1 vctrs_0.6.5 R6_2.5.1
[37] zoo_1.8-12 proxy_0.4-27 lifecycle_1.0.4 emmeans_1.8.5 classInt_0.4-10 fs_1.6.3
[43] MASS_7.3-60 pkgconfig_2.0.3 pillar_1.9.0 glue_1.7.0 Rcpp_1.0.12 sf_1.0-15
[49] tibble_3.2.1 tidyselect_1.2.1 rstudioapi_0.15.0 xtable_1.8-4 nlme_3.1-162 TMB_1.9.10
[55] compiler_4.3.0 sp_2.1-3

@seananderson
Copy link
Member

Nothing specific to do with Windows. This should fix it:

library(sdmTMB)
mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 20)
pcod$year_f <- as.factor(pcod$year)
fit <- sdmTMB(
  density ~ 0 + year_f,
  data = pcod,
  mesh = mesh,
  time = "year",
  family = tweedie(link = "log"),
  spatiotemporal = "ar1",
  extra_time = c(2006, 2008, 2010, 2012, 2014, 2016)
)
fit
#> Spatiotemporal model fit by ML ['sdmTMB']
#> Formula: density ~ 0 + year_f
#> Mesh: mesh (isotropic covariance)
#> Time column: year
#> Data: pcod
#> Family: tweedie(link = 'log')
#>  
#>            coef.est coef.se
#> year_f2003     2.74    0.37
#> year_f2004     3.01    0.36
#> year_f2005     3.04    0.36
#> year_f2007     1.95    0.37
#> year_f2009     2.30    0.37
#> year_f2011     2.87    0.35
#> year_f2013     2.99    0.35
#> year_f2015     3.01    0.36
#> year_f2017     2.35    0.37
#> 
#> Dispersion parameter: 13.67
#> Tweedie p: 1.56
#> Spatiotemporal AR1 correlation (rho): -0.63
#> Matérn range: 3.78
#> Spatial SD: 11.17
#> Spatiotemporal marginal AR1 SD: 7.94
#> ML criterion at convergence: 6519.955
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.

Created on 2024-03-12 with reprex v2.1.0

The issue is that the as.factor(year) uses the same column as the year argument. The extra_time argument creates the extra time slices and then those get added as factor levels for which there are no data to inform them. The solution is to not use the same column for something with as.factor() as the time column when extra_time is used. I'm not sure what the best way is to defend against this. One option would be to try to detect it and issue a warning or error. Another option would be to rename the time column internally so that the two will never conflict... maybe that option is best.

@seananderson seananderson changed the title extra_time bug? Rename time column internally to protect against extra_time + as.factor conflict Mar 13, 2024
@Lewis-Barnett-NOAA
Copy link
Collaborator Author

Ah, that is the workaround, thanks! I concur that renaming the time column internally to prevent this sounds like a good idea.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants