Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dplyr_col_select() reconstructs bare data.table objects #6171

Merged
merged 4 commits into from
Feb 2, 2022

Conversation

romainfrancois
Copy link
Member

# pak::pak("Rdatatable/data.table")
library(data.table)
library(dplyr, warn.conflicts = FALSE)

d1 <- data.table(x = 1)
attr(d1, "foo") <- "bar"

d2 <- dplyr:::dplyr_col_select(d1, "x")
attr(d2, "foo")
#> [1] "bar"

Created on 2022-02-02 by the reprex package (v2.0.1)

as opposed to losing the attribute:

# pak::pak("Rdatatable/data.table")
library(data.table)
library(dplyr, warn.conflicts = FALSE)

d1 <- data.table(x = 1)
attr(d1, "foo") <- "bar"

d2 <- dplyr:::dplyr_col_select(d1, "x")
attr(d2, "foo")
#> NULL

Created on 2022-02-02 by the reprex package (v2.0.1)

however, this only works with the dev version of data.table thanks to: Rdatatable/data.table#4909

This is related to: fndemarqui/covid19br#1

```
Warning message:
Topic 'glimpse': no parameters to inherit with @inheritParams
```
@DavisVaughan
Copy link
Member

I am quite certain we don't want this to be data.table aware. The [i] method of data table selects rows rather than columns.

library(data.table)

# yay 32 rows, 0 cols
mtcars[0]
#> data frame with 0 columns and 32 rows

# 0 rows, 11 cols
as.data.table(mtcars)[0]
#> Empty data.table (0 rows and 11 cols): mpg,cyl,disp,hp,drat,wt...

Try this with your two examples:

library(data.table)
library(dplyr, warn.conflicts = FALSE)

d1 <- data.table(x = 1:2, y = 1:2)
attr(d1, "foo") <- "bar"

d2 <- dplyr:::dplyr_col_select(d1, "x")
d2

@romainfrancois
Copy link
Member Author

oh right. So maybe this is rather something to do in:

  if (identical(class(.data), "data.frame")) {
    out <- dplyr_reconstruct(out, .data)
  }

@romainfrancois
Copy link
Member Author

Because this ends up calling [.data.frame directly then: https://github.com/Rdatatable/data.table/blob/8b257b8c2f638567c2f7166f6bc3e086cf4195c2/R/data.table.R#L146

  if (!cedta()) {
    # Fix for #500 (to do)
    Nargs = nargs() - (!missing(drop))
    ans = if (Nargs<3L) { `[.data.frame`(x,i) }  # drop ignored anyway by DF[i]
        else if (missing(drop)) `[.data.frame`(x,i,j)
        else `[.data.frame`(x,i,j,drop)
    # added is.data.table(ans) check to fix bug #81
    if (!missing(i) && is.data.table(ans)) setkey(ans, NULL)  # drops index too; tested by plyr::arrange test in other.Rraw
    return(ans)
  }

@romainfrancois romainfrancois removed the request for review from hadley February 2, 2022 15:11
@DavisVaughan
Copy link
Member

Yea as you've seen I think the issue is basically that in non-data.table aware paths they just hand off to [.data.frame which doesn't retain attributes

@@ -240,7 +240,7 @@ dplyr_col_select <- function(.data, loc, names = NULL, error_call = caller_env()

# Patch base data frames to restore extra attributes that `[.data.frame` drops.
# We require `[` methods to keep extra attributes for all data frame subclasses.
if (identical(class(.data), "data.frame")) {
if (identical(class(.data), "data.frame") || identical(class(.data), c("data.table", "data.frame"))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to also mention the data.table handling in the comment and link to this PR

@romainfrancois romainfrancois changed the title mark dplyr_col_select() data table aware dplyr_col_select() reconstructs bare data.table objects Feb 2, 2022
@romainfrancois romainfrancois mentioned this pull request Feb 2, 2022
17 tasks
@romainfrancois
Copy link
Member Author

mention this pr in comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants