Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

na.locf incorrect when first column is entirely NA #233

Closed
TomAndrews opened this issue Apr 18, 2018 · 5 comments
Closed

na.locf incorrect when first column is entirely NA #233

TomAndrews opened this issue Apr 18, 2018 · 5 comments
Labels

Comments

@TomAndrews
Copy link
Contributor

Description

New multi column na.locf doesn't work correctly when the first column is entirely NA. In this case, no filling is performed at all.

Expected behavior

Subsequent columns should be filled.

Minimal, reproducible example

library(xts)
test <- .xts(matrix(c(NA,NA, 1, NA), 2, 2), 1:2)
na.locf(test)
                     [,1] [,2]
 1970-01-01 01:00:01   NA    1
 1970-01-01 01:00:02   NA   NA
na.locf(as.zoo(test))

 1970-01-01 01:00:01 NA 1
 1970-01-01 01:00:02 NA 1

Session Info

R version 3.4.4 (2018-03-15)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Debian GNU/Linux 9 (stretch)

 Matrix products: default
 BLAS: /usr/lib/libblas/libblas.so.3.7.0
 LAPACK: /usr/lib/lapack/liblapack.so.3.7.0

 locale:
  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] xts_0.10-2.1 zoo_1.8-1

 loaded via a namespace (and not attached):
 [1] compiler_3.4.4  tools_3.4.4     grid_3.4.4      lattice_0.20-35

I think the problem is here:
https://github.com/joshuaulrich/xts/blob/master/src/na.c#L161

This checks whether the first non na is the first value in the second column and if so, returns.

PS Looking forward to this making into the stable release - seems to speed up some of my use cases massively. Thanks!

@joshuaulrich
Copy link
Owner

Thanks a ton for using and troubleshooting the development code! It is an enormously valuable and helpful contribution to the project.

Are you willing/able to add tests and attempt to fix?

@TomAndrews
Copy link
Contributor Author

You're welcome. Thanks for developing it in the first place!

Locally I think I fixed it by just deleting the check I mentioned. I can try to add a test but I need to get to grips with RUnit...

@joshuaulrich
Copy link
Owner

Right, simply deleting that check makes sense. I believe that is left-over from when the function only operated on univariate objects.

I'm not sure what you're doing to 'get to grips with RUnit'... I would hope you could copy and modify one of the existing tests. In this case, you should be able to modify the test I put in the comment to the first issue you opened.

test.nalocf_first_column_all_NA <- function() {
  nacol <- 1L
  for (m in MODES) {
    xdat <- XDAT2
    xdat[,nacol] <- xdat[,nacol] * NA
    storage.mode(xdat) <- m
    zdat <- as.zoo(xdat)

    x <- na.locf(xdat)
    z <- na.locf(zdat)
    checkEquals(x, as.xts(z), check.attributes = TRUE)
  }
}

@TomAndrews
Copy link
Contributor Author

I'm struggling to work out how to run the tests easily. Any pointers gratefully received.

I can do R CMD build then R CMD check on the tar. That seems to give a different result to evaluating bits of the file in the test directory.

@joshuaulrich
Copy link
Owner

joshuaulrich commented Apr 18, 2018

For what it's worth, I trust the results of R CMD build ... && R CMD check ... more than running tests outside of that pattern. Likely because I sometimes forget part(s) of the relevant process for building/installing a package, but also because that's what CRAN does. That said, it would be nice to have something like make test [args] that could run quickly, so you could iterate quickly.

I know RStudio / devtools / testthat have ways to run individual tests, but I have encountered too many instances of users being confused because an updated package didn't load correctly when the package was already loaded in the current session.

One thing you could do is something like this:

Rscript -e 'library(xts); library(RUnit); source("xts/inst/unitTests/runit.na.locf.R"); test.nalocf_last_column_all_NA()`

Sorry I don't have a better answer...

TomAndrews added a commit to TomAndrews/xts that referenced this issue Apr 18, 2018
The firstNonNA check only checks the first column for a 2-dimensional
SEXP.  Therefore if the first column is NA, the function returns
without examining any other columns.

Fix by removing this check.

Fixes joshuaulrich#233
@joshuaulrich joshuaulrich added this to the Release 0.11-0 milestone Jul 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants