Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge.xts does not error if NA occurs at the end of index #174

Closed
joshuaulrich opened this issue Mar 30, 2017 · 0 comments
Closed

merge.xts does not error if NA occurs at the end of index #174

joshuaulrich opened this issue Mar 30, 2017 · 0 comments
Assignees
Labels

Comments

@joshuaulrich
Copy link
Owner

merge.xts() should throw an error if there are any NA in the index, and it does most of the time. Strangely, it "works" if the NA is at the end of the index, and the xts object was created with the .xts() constructor.

R> endNA <- c(1:4, NA)
R> x <- .xts(1:5, endNA)
R> merge(x, x)
                    x x.1
1969-12-31 18:00:01 1   1
1969-12-31 18:00:02 2   2
1969-12-31 18:00:03 3   3
1969-12-31 18:00:04 4   4
<NA>                5   5
R> x <- xts(1:5, .POSIXct(endNA))
R> merge(x, x)
nan, nan
Error in merge.xts(x, x) : 'NA' not allowed in 'index'

Fixing #173 will avoid this entirely, but it should still be investigated in case another bug is lurking in the darkness...

@joshuaulrich joshuaulrich self-assigned this Mar 30, 2017
joshuaulrich added a commit that referenced this issue Mar 31, 2017
There were cases where merge.xts would not throw an error if the index
contained NA values.

One case was if the index was integer, because the section of code that
checked for integer NA was referencing the real_*index references
instead of int_*index.

After fixing that bug, the issue still remained for integer indexes.
Now the issue is that NA_INTEGER = INT_MIN, so logical operations still
work. But they don't give the results we want. Fix the issue by
checking for NA_INTEGER before any other logical comparisons.

Added similar checks for the real/double index case as well, and also
moved the check to the top of the if/else branch. Note that the test
now uses R_FINITE to check for NaN and +/- as well.

I really do not like these checks in the first position of the if/else
branch because they should never happen in user space. For performance,
the most commonly true if statements should come first. Need to look
for an upstream place to check for NA in the index.

Fixes #174.
joshuaulrich added a commit that referenced this issue May 16, 2017
There were cases where merge.xts would not throw an error if the index
contained NA values.

One case was if the index was integer, because the section of code that
checked for integer NA was referencing the real_*index references
instead of int_*index.

After fixing that bug, the issue still remained for integer indexes.
Now the issue is that NA_INTEGER = INT_MIN, so logical operations still
work. But they don't give the results we want. Fix the issue by
checking for NA_INTEGER before any other logical comparisons.

Added similar checks for the real/double index case as well, and also
moved the check to the top of the if/else branch. Note that the test
now uses R_FINITE to check for NaN and +/-Inf as well.

I really do not like these checks in the first position of the if/else
branch because they should never happen in user space. For performance,
the most commonly true if statements should come first. Need to look
for an upstream place to check for NA in the index.

Fixes #174.
@joshuaulrich joshuaulrich added this to the Release 0.10-0 milestone May 16, 2017
joshuaulrich added a commit that referenced this issue Jun 7, 2017
Checking for NA, NaN, and +/-Inf values in the first position of the
if-else statement was a bad idea for performance reasons, but was also
incorrect.

Any time two merged objects had different start or end index values,
xp or yp ar outside the bounds of their respective index vectors. That
didn't cause a segfault though, because R_FINITE() would happily
interpret the random memory as either a number or NaN. This would
manifest as an error when there should be none, or an infinite loop.

We can fix the logic and improve performance by taking advantage of the
fact that the indexes are sorted. Therefore, if present, -Inf will
always be in the first element, while NA, Inf, and NaN will always be
in the last element.

In the case of an integer index, the NA will always be last (because
that's how it sorts in R, even though it's equal to INT_MIN in C).

See #174.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant