split fails on sf-data.table-data.frame objects #5365

Open
barryrowlingson opened this issue Apr 8, 2022 · 1 comment · Fixed by #5368

barryrowlingson commented Apr 8, 2022

Creating an sf spatial data frame from a data.table and then calling split() on it, with one of its columns as the grouping factor, results in either an error or the object being split by columns instead of rows.

Example

library(data.table)
library(sf)
d = data.table(x=1:10, y=1:10, s=sample(1:2, 10, TRUE))
dsf = st_as_sf(d, coords=1:2)
split(d, d$s)
$`1`
   x y s
1: 5 5 1
2: 8 8 1
3: 9 9 1

$`2`
    x  y s
1:  1  1 2
2:  2  2 2
3:  3  3 2
4:  4  4 2
5:  6  6 2
6:  7  7 2
7: 10 10 2
split(dsf, dsf$s)
# Error in `[.data.frame`(x, i) : undefined columns selected

Works fine if d is a data.frame:

d = data.frame(x=1:10, y=1:10, s=sample(1:2, 10, TRUE))
dsf = st_as_sf(d, coords=1:2)
split(dsf, dsf$s)
$`1`
Simple feature collection with 2 features and 1 field
Geometry type: POINT
[etc]
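
Since the data.frame route works, one possible workaround is to drop the data.table class before building the sf object. This is only a sketch under that assumption: the grouping column `s` is fixed here (rather than sampled) so the result is reproducible, and the name `parts` is just illustrative.

```r
library(data.table)
library(sf)

# Illustrative data: same shape as the example above, but with a fixed
# grouping column instead of sample() so the split is reproducible.
d <- data.table(x = 1:10, y = 1:10, s = rep(1:2, 5))

# Convert to a plain data.frame first, so the resulting sf object
# does not inherit from data.table.
dsf <- st_as_sf(as.data.frame(d), coords = 1:2)

# split() now goes through the data.frame method and splits by rows.
parts <- split(dsf, dsf$s)
```

Each element of `parts` should then be an sf object holding the rows for one value of `s`.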

Related twitter thread here: https://twitter.com/michael_chirico/status/1512116488492048393?s=20&t=WSUB5c9_zI7suQNPjIFS6Q

# Output of sessionInfo()

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /opt/R-4.1.1/lib/R/lib/libRblas.so
LAPACK: /opt/R-4.1.1/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sf_1.0-5          data.table_1.14.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7         magrittr_2.0.1     units_0.7-2        tidyselect_1.1.1  
 [5] R6_2.5.1           rlang_0.4.12       fansi_0.5.0        dplyr_1.0.7       
 [9] tools_4.1.1        grid_4.1.1         KernSmooth_2.23-20 utf8_1.2.2        
[13] e1071_1.7-9        DBI_1.1.1          ellipsis_0.3.2     class_7.3-19      
[17] assertthat_0.2.1   tibble_3.1.4       lifecycle_1.0.1    crayon_1.4.2      
[21] purrr_0.3.4        vctrs_0.3.8        glue_1.5.0         proxy_0.4-26      
[25] compiler_4.1.1     pillar_1.6.3       generics_0.1.0     classInt_0.4-3    
[29] pkgconfig_2.0.3   
> 
@MichaelChirico
Member

Thanks. Not totally clear that it's our bug -- by inheriting from data.table, presumably they should abide by the same [ semantics.

But I think we can nevertheless accommodate this easily -- the issue (as pointed out on Twitter) is just that not providing j is not allowed in their method.
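
To illustrate why a missing j matters, here is a small sketch using only base R and made-up data. It does not reproduce the sf or data.table internals; it only shows how a plain data.frame interprets `x[i]` versus `x[i, ]`, which appears to be where the "undefined columns selected" message comes from.

```r
# Illustrative data, not taken from the issue.
df <- data.frame(x = 1:4, y = 5:8, s = c(1, 2, 1, 2))
idx <- which(df$s == 1)   # row indices 1 and 3

# With an explicit (empty) j, idx selects rows.
rows <- df[idx, ]

# Without j, a data.frame treats idx as column indices,
# so this returns columns 1 and 3 (x and s) with all four rows.
cols <- df[idx]

# An index larger than the number of columns raises the error seen above.
err <- tryCatch(df[4], error = function(e) conditionMessage(e))
```

data.table's own `[` treats a single index as a row selector, so code written for those semantics can misbehave when the call ends up in data.frame-style subsetting instead.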
