improve speed of build_lines #2

Open
robitalec opened this issue Nov 29, 2018 · 2 comments

@robitalec
Member

build_lines is slower than I'd like, and that also slows down group_lines, which depends on it.

The split (split.data.table) is slower than grouping with data.table's by, but we can't return SpatialLines objects directly from j. Any alternatives?
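One possible workaround, offered only as a sketch and not tested against the package: j can return arbitrary objects if they are wrapped in list(), which stores them in a list column. The toy table and its column names (ID, X, Y, datetime) are made up for illustration, and the projection is left out to keep the example short.

```r
library(data.table)

# Hypothetical toy relocations: two individuals, three fixes each
DT <- data.table(
  ID = rep(c("a", "b"), each = 3),
  X = c(0, 1, 2, 10, 11, 12),
  Y = c(0, 1, 0, 5, 6, 5),
  datetime = rep(1:3, times = 2)
)

# Build one sp::Lines object per group inside j.
# Wrapping it in list() puts the S4 object in a list column.
res <- DT[order(datetime),
          .(line = list(sp::Lines(list(sp::Line(cbind(X, Y))),
                                  ID = as.character(.BY$ID)))),
          by = ID]

# Combine the per-group Lines into a single SpatialLines object.
# In build_lines, proj4string = sp::CRS(projection) would be passed here too.
spLines <- sp::SpatialLines(res$line)
```

Whether this is actually faster than the split approach would need benchmarking on real data.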

The rest of the time is spent in the SpatialLines function. Is there a performance incentive for switching from sp to sf?

@robitalec robitalec changed the title improve speed of build_lines improve speed of build_lines Nov 29, 2018
@robitalec robitalec added type: discussion and removed type: enhancement new features, improvements labels Nov 29, 2018
@FilipeamTeixeira

According to its website, the sf package should be faster than sp because of SpatialIndexing.
About split.data.table, do you mean split.data.frame?

@robitalec
Member Author

Interesting about the SpatialIndexing. Could you share a link describing it?

I would want to do a full switch from sp to sf instead of depending on both of them. Have you noticed any challenges installing sf on different operating systems, or other potential downsides of a switch?

Here are all the uses of sp:

./R/build_polys.R:147:    spPts <- sp::SpatialPointsDataFrame(
./R/build_polys.R:149:      proj4string = sp::CRS(projection),
./R/build_lines.R:138:        sp::SpatialLines(list(sp::Lines(sp::Line(
./R/build_lines.R:143:        proj4string = sp::CRS(projection))
./R/build_lines.R:145:      return(do.call(sp::rbind.SpatialLines, l))

I do mean split.data.table. Since the input DT is a data.table, build_lines uses the internal data.table method (data.table:::split.data.table):

./R/build_lines.R:131

lst <- split(DT[dropRows, on = splitBy][!(dropped)][order(get(sortBy))],
             by = splitBy, sorted = TRUE)
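For anyone unfamiliar with the data.table method, here is a minimal illustration of what that split step returns. The toy table and its column names are hypothetical, and the dropRows join and dropped filter from the real call are omitted.

```r
library(data.table)

# Hypothetical toy input, deliberately unordered
DT <- data.table(
  ID = c("b", "a", "b", "a"),
  X = c(3, 1, 4, 2),
  Y = c(3, 1, 4, 2),
  datetime = c(1, 1, 2, 2)
)

# Order by time, then split into a named list with one data.table per ID.
# sorted = TRUE orders the list elements by the grouping value.
lst <- split(DT[order(datetime)], by = "ID", sorted = TRUE)

names(lst)
```

Each element of lst is then passed to the sp constructors one at a time, which is where the list overhead comes from.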

A switch to sf might avoid this splitting, but I'm not sure.
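As a rough idea of what an sf version could look like without split, here is a sketch under assumptions: the toy data, the column names and the EPSG code 32736 are placeholders, and this is not how build_lines is currently written.

```r
library(data.table)

# Hypothetical toy relocations: two individuals, three fixes each
DT <- data.table(
  ID = rep(c("a", "b"), each = 3),
  X = c(0, 1, 2, 10, 11, 12),
  Y = c(0, 1, 0, 5, 6, 5),
  datetime = rep(1:3, times = 2)
)

# One LINESTRING per ID, built in j and stored in a list column
geoms <- DT[order(datetime),
            .(geometry = list(sf::st_linestring(cbind(X, Y)))),
            by = ID]

# Turn the list of sfg objects into an sfc column with a CRS,
# then into an sf data frame keyed by ID
lines_sf <- sf::st_sf(
  ID = geoms$ID,
  geometry = sf::st_sfc(geoms$geometry, crs = 32736)  # placeholder CRS
)
```

This keeps the grouping inside data.table and only hands sf a list of geometries at the end. I have not benchmarked it against the sp version.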
