Error when non-unique endpoints are passed to period.apply #171

joshuaulrich · 2017-03-17T14:29:23Z

A subsetting error will be thrown if there are duplicate entries in the INDEX argument to period.apply(). This is possible if users create custom end points without using the endpoints() function.

x <- .xts(1:10, 1:10)
ep <- c(2, 4, 6, 8, 10)
period.apply(x, c(0, ep, nrow(x)), mean)
# Error in `[.xts`(x, (INDEX[y] + 1):INDEX[y + 1]) : 
#   subscript out of bounds

Possibly worse, you get incorrect results if INDEX is not sorted.

set.seed(21)
EP <- sample(head(ep, -1))
merge(unsorted = period.apply(x, c(0, EP, nrow(x)), mean),
      sorted = period.apply(x, c(0, sort(EP), nrow(x)), mean))
#                     unsorted sorted
# 1969-12-31 18:00:02      4.5    1.5
# 1969-12-31 18:00:04      5.5    3.5
# 1969-12-31 18:00:06      3.5    5.5
# 1969-12-31 18:00:08      5.5    7.5
# 1969-12-31 18:00:10      8.5    9.5

The simplest solution is probably to call xts::isOrdered(INDEX) before looping over it. This solves both problems because isOrdered() returns FALSE if x has duplicate entries. If isOrdered() returns FALSE, we should throw a warning and then call sort and unique on INDEX before proceeding.

We could also prevent any performance hit during internal calls to period.apply() (e.g. apply.daily) by checking for a specific argument passed via ... (e.g. _internal = TRUE). Using a non-syntactic name would prevent accidental matching with arguments intended for FUN.

The text was updated successfully, but these errors were encountered:

period.apply() expects INDEX to be sorted and unique. Check for those characteristics. If they're not present, then sort INDEX and ensure it is unique. See #171.

period.apply() expects INDEX to start with zero and end with the number of observations in 'x'. Check those values are present, and add them if they are not. See #171.

joshuaulrich · 2017-03-17T22:51:28Z

The two commits fix the issue, but I'm going to leave this open until I can do some performance benchmarking.

joshuaulrich · 2017-06-06T20:55:52Z

Closing without adding benchmarks. Correctness first, then performance.

joshuaulrich added the bug label Mar 17, 2017

joshuaulrich self-assigned this Mar 17, 2017

joshuaulrich added a commit that referenced this issue Mar 17, 2017

Check for non-unique/sorted INDEX

4da6e3a

period.apply() expects INDEX to be sorted and unique. Check for those characteristics. If they're not present, then sort INDEX and ensure it is unique. See #171.

joshuaulrich added a commit that referenced this issue Mar 17, 2017

Ensure first/last values of INDEX are valid

927e25c

period.apply() expects INDEX to start with zero and end with the number of observations in 'x'. Check those values are present, and add them if they are not. See #171.

joshuaulrich closed this as completed Jun 6, 2017

joshuaulrich added this to the Release 0.10-0 milestone Jun 6, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error when non-unique endpoints are passed to period.apply #171

Error when non-unique endpoints are passed to period.apply #171

joshuaulrich commented Mar 17, 2017

joshuaulrich commented Mar 17, 2017

joshuaulrich commented Jun 6, 2017

Error when non-unique endpoints are passed to period.apply #171

Error when non-unique endpoints are passed to period.apply #171

Comments

joshuaulrich commented Mar 17, 2017

joshuaulrich commented Mar 17, 2017

joshuaulrich commented Jun 6, 2017