-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert readxl from Rcpp to cpp11 #659
Conversation
I had to edit a few other files to define the macros
It was easier then try to figure out how to reorganize the header files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were a few cases where as far as I could tell you were converting an R SEXP to another R SEXP using as_cpp()
, which you should never need to do.
The mental model is as_cpp()
is used to convert a SEXP to a native C++ type (like std::string
, int
, etc.) and as_sexp()
is used to go the opposite way, e.g. convert a C++ type to an R SEXP.
But in general you should have to call these explicitly somewhat rarely, most of the time they can be done implicitly in the generated registration code.
src/XlsxWorkSheet.h
Outdated
@@ -198,7 +204,7 @@ class XlsxWorkSheet { | |||
|
|||
xcell->inferType(na, trimWs, wb_.stringTable(), dateFormats_); | |||
CellType type = xcell->type(); | |||
Rcpp::RObject column = cols[col]; | |||
cpp11::sexp column = cpp11::as_sexp(cols[col]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this call to as_sexp()
? I would guess no? maybe just this?
cpp11::sexp column = cpp11::as_sexp(cols[col]); | |
cpp11::sexp column(cols[col]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up having to do something similar to what we did with
std::vector<ColType> colTypes = colTypeStrings(static_cast<SEXP>(col_types));
Because I was getting a constructor error but let me know if there is another solution!
src/XlsxWorkSheet.cpp
Outdated
|
||
// catches empty sheets and sheets where requested rectangle contains no data | ||
if (ws.nrow() == 0 && ws.ncol() == 0) { | ||
return Rcpp::List(0); | ||
cpp11::list x; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would need to be a writable list I think? Otherwise it is going to return NULL
instead of list()
cpp11::list x; | |
cpp11::writable::list(); |
src/XlsxWorkSheet.cpp
Outdated
bool has_col_names = false; | ||
switch(TYPEOF(col_names)) { | ||
case STRSXP: | ||
colNames = as<CharacterVector>(col_names); | ||
colNames = cpp11::as_cpp<cpp11::strings>(col_names); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need any cast? just colNames = col_names
@@ -47,7 +48,7 @@ List read_xlsx_(std::string path, int sheet_i, | |||
if (TYPEOF(col_types) != STRSXP) { | |||
Rcpp::stop("`col_types` must be a character vector"); | |||
} | |||
std::vector<ColType> colTypes = colTypeStrings(as<CharacterVector>(col_types)); | |||
std::vector<ColType> colTypes = colTypeStrings(cpp11::as_cpp<cpp11::strings>(col_types)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure you need a cast here either
This cleans up the call a bit, also not using brace initialization avoids ambigious call during compilation using gcc
#include <set> | ||
#include <vector> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make more sense to keep the system includes, e.g. <whatever>
together and above the ones defined here, e.g. "utils.h"
?
If you agree, that would be a good thing to check globally. I think that is (mostly?) how things started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Thank you for reminding me about this.
I rearranged some of the header files to avoid conflicts with Rcpp and cpp11 libraries so I need to reorganize them (or organize them back to how you had them). I ended up using the macros #define R_NO_REMAP
and #define STRICT_R_HEADERS
instead to avoid the conflicts, which I'll remove since they won't be needed anymore once this PR is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current state seems to be much more regular, but often puts the local headers before the system headers. It feels like a more conventional order would be: system, vendored code (libxls), written-by-us, which also reflects intended precedence. (Let me also acknowledge that this probably doesn't matter and is more a matter of style at this point.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes sorry! I should have reread your comment before executing based on my memory. I'll rearrange in that order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to remove the rest of the Rcpp references as well. You can find them with find my files or with git grep Rcpp
- Remove Rcpp references from Imports and LinkingTo
- Remove the importFrom Rcpp roxygen annotation in R/readxl.R (and run devtools::document() afterwards)
- Remove any stray
#include "Rcpp.h"
directives
You may find some compilation errors, as Rcpp includes a number of standard headers which cpp11 may not. If any of these (like std::vector or std::map etc.) are used in the generated cpp11.cpp
code you will need to include them in a special file, src/readxl_types.h
which cpp11 will automatically include in the generated file.
It looks like the only test still failing is on windows release due to |
I think the Rcpp::stop crash is happening because unwind protect does a long jump, and long jumping from a constructor is AFAIK undefined behavior. I am not clear about the cpp11::warning case, but possibly it could be a similar issue.
cpp11 automatically translates strings to UTF-8 when you construct a std::string. Rcpp did not do this, and the existing tests assumed they would be converted in the native encoding. so we do this explicitly using Rf_translateChar()
Woohoo! 🎉 |
This initial commit contains the first steps for converting from Rcpp to cpp11
Now I'll convert file by file and push each time I've completed a file so we can review.