Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when non-unique endpoints are passed to period.apply #171

Closed
joshuaulrich opened this issue Mar 17, 2017 · 2 comments
Closed

Error when non-unique endpoints are passed to period.apply #171

joshuaulrich opened this issue Mar 17, 2017 · 2 comments
Assignees
Labels

Comments

@joshuaulrich
Copy link
Owner

A subsetting error will be thrown if there are duplicate entries in the INDEX argument to period.apply(). This is possible if users create custom end points without using the endpoints() function.

x <- .xts(1:10, 1:10)
ep <- c(2, 4, 6, 8, 10)
period.apply(x, c(0, ep, nrow(x)), mean)
# Error in `[.xts`(x, (INDEX[y] + 1):INDEX[y + 1]) : 
#   subscript out of bounds

Possibly worse, you get incorrect results if INDEX is not sorted.

set.seed(21)
EP <- sample(head(ep, -1))
merge(unsorted = period.apply(x, c(0, EP, nrow(x)), mean),
      sorted = period.apply(x, c(0, sort(EP), nrow(x)), mean))
#                     unsorted sorted
# 1969-12-31 18:00:02      4.5    1.5
# 1969-12-31 18:00:04      5.5    3.5
# 1969-12-31 18:00:06      3.5    5.5
# 1969-12-31 18:00:08      5.5    7.5
# 1969-12-31 18:00:10      8.5    9.5

The simplest solution is probably to call xts::isOrdered(INDEX) before looping over it. This solves both problems because isOrdered() returns FALSE if x has duplicate entries. If isOrdered() returns FALSE, we should throw a warning and then call sort and unique on INDEX before proceeding.

We could also prevent any performance hit during internal calls to period.apply() (e.g. apply.daily) by checking for a specific argument passed via ... (e.g. _internal = TRUE). Using a non-syntactic name would prevent accidental matching with arguments intended for FUN.

@joshuaulrich joshuaulrich self-assigned this Mar 17, 2017
joshuaulrich added a commit that referenced this issue Mar 17, 2017
period.apply() expects INDEX to be sorted and unique. Check for those
characteristics. If they're not present, then sort INDEX and ensure it
is unique.

See #171.
joshuaulrich added a commit that referenced this issue Mar 17, 2017
period.apply() expects INDEX to start with zero and end with the
number of observations in 'x'. Check those values are present, and
add them if they are not.

See #171.
@joshuaulrich
Copy link
Owner Author

The two commits fix the issue, but I'm going to leave this open until I can do some performance benchmarking.

@joshuaulrich
Copy link
Owner Author

Closing without adding benchmarks. Correctness first, then performance.

@joshuaulrich joshuaulrich added this to the Release 0.10-0 milestone Jun 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant