Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Printing of an expression type column does not work when the expression wraps to a new line #3011

Closed
RichardRedding opened this issue Aug 24, 2018 · 6 comments
Labels
non-atomic column e.g. list columns, S4 vector columns

Comments

@RichardRedding
Copy link

RichardRedding commented Aug 24, 2018

Printing of an expression type column does not work when the expression wraps to a new line

Works:

data.table(e = expression(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12))

Doesn't:

data.table(e = expression(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13))

At least on my computer, outside of a data.table, the first expression would print all on one line and the second would be split over two.

***I have just noticed when doing rbindlist that it says expression type columns are not supported
***Therefore, maybe the issue is that I was able to create a data.table with an expression type column at all

P.S. I was unable to download the development version of data.table as per the wiki:
data.table::update.dev.pkg(repo="https://gitlab.com/Rdatatable/data.table") Error: Line starting '<!DOCTYPE html> ...' is malformed!

Instead I did:
devtools::install_github("Rdatatable/data.table")

@jangorecki
Copy link
Member

jangorecki commented Aug 24, 2018

as for the upgrade call, the url was incorrect, I already corrected it in wiki. data.table::update.dev.pkg(repo="https://Rdatatable.gitlab.io/data.table")
if you can compile from source (you have Rtools, etc) you can just call data.table::update.dev.pkg().

@MichaelChirico
Copy link
Member

I have just noticed when doing rbindlist that it says expression type columns are not supported

data.table has no issue with storing expressions in a column (should be a list column I guess?), the error is coming from rbindlist which doesn't support stitching together such columns... you'll have to make a workaround for this for now.

@MichaelChirico
Copy link
Member

As to your main issue, on my machine:

data.table(e = expression(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13)

is an error:

Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent

with traceback():

3: `rownames<-`(`*tmp*`, value = paste0(format(rn, right = TRUE, 
       scientific = FALSE), ":")) at print.data.table.R#73
2: print.data.table(x)
1: function (x, ...) 
   UseMethod("print")(x)

So there's something wrong in print.data.table...

Ideally you'll be able to use #3414 to customize a print method for expression columns, but it probably shouldn't be erroring in the meantime.

@MichaelChirico
Copy link
Member

I see... it's breaking because the output wraps to a new line but print.data.table expects only one line since the table is 1-row:

expression(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13)
# expression(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 
#     13)

I'll fix this in #3414

@MichaelChirico
Copy link
Member

For the record, I dove into this a bit; print is using print.default, internally PrintExpression I think is using R_print.cutoff to truncate but I don't think we have access to that, so there's no way to prevent the wrapping on the expression directly AFAICT

@jangorecki
Copy link
Member

jangorecki commented May 24, 2020

As of now the most recent conclusion about expression type as a data.table column, is to keep it inside a list. Expression in general is a vector of language objects. The problem is that whenever any of its input is just atomic scalar, then its type is not language anymore.

sapply(expression(-1,1), is.language)
# [1]  TRUE FALSE

so because it can mix types, it behaves more like a list than a vector.
Because it behave like a list, it then make sense to not duplicate the work we have already for handling lists.
Unless of course someone can show a case where wrapping expression into list impose an overhead, then it make sense to consider supporting that directly.
Closing that for now, yet your feedback is very welcome, ideally in #4040.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
non-atomic column e.g. list columns, S4 vector columns
Projects
None yet
Development

No branches or pull requests

3 participants