points.xts doesn't add points with duplicate index values or when the index does not match the original plot #360

Closed
vspinu opened this issue Nov 30, 2021 · 11 comments

@vspinu

vspinu commented Nov 30, 2021

Repeated index does not add points:

time0 <- as.POSIXct("2021-11-30 11:24:44.003")
tx <- xts(data.frame("c"=1:5,"c1"=2:6),  time0 + c(1,1,1,2,2))
plot.xts(tx[c(1, 4), "c"])
points(tx[, "c1"], col = "red", pch = 19)

image

Then I thought it picks only the first occurrence, so I tried jittering the index a bit. It turned out that nothing is plotted if the new index is not exactly the same as the original index:

plot.xts(tx[, "c"])
tp <- xts(data.frame("c1" = 2:6), time0 + c(1.1, 1.2, 1.3, 1.9, 1.95))
points(tp[, "c1"], col = "red", pch = 19)

image

```r
sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS: /home/vspinu/bin/R-4.0.0-bin/lib/libRblas.so
LAPACK: /home/vspinu/bin/R-4.0.0-bin/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.utf8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] graphics grDevices utils datasets stats methods base

other attached packages:
[1] xts_0.12.1 zoo_1.8-8 iroll_0.0.0.9000 lubridate_1.8.0
[5] data.table_1.14.2 purrr_0.3.4 glue_1.4.2 VSR_0.0.0.9000
[9] rlang_0.4.11 ggplot2_3.3.3 dplyr_1.0.7

loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 rstudioapi_0.13 magrittr_2.0.1 tidyselect_1.1.1
[5] munsell_0.5.0 lattice_0.20-41 colorspace_1.4-1 R6_2.5.1
[9] fansi_0.5.0 grid_4.0.0 gtable_0.3.0 utf8_1.2.2
[13] DBI_1.1.1 withr_2.4.2 ellipsis_0.3.2 assertthat_0.2.1
[17] tibble_3.1.5 lifecycle_1.0.1 crayon_1.4.1.9000 vctrs_0.3.8
[21] compiler_4.0.0 pillar_1.6.3 generics_0.1.0 scales_1.1.1
[25] pkgconfig_2.0.3
```

@joshuaulrich
Owner

joshuaulrich commented Nov 30, 2021

I think the problem is that points() can't currently add observations to the plot if they're at timestamps that aren't in the initial object.

Can you describe what you're trying to accomplish? Maybe I can help with a work-around.

@vspinu
Author

vspinu commented Nov 30, 2021

I want to plot multiple points per timestamp. More technically, I have a data frame of measurements. One column is a list of numeric vectors. I want to plot the measurements in each vector as points, so that all points from one vector fall on the same vertical line.

[Edit]: My current workaround is to call points() multiple times, first with all 1st observations from the vectors, then 2nd etc. Those vectors can be of different length.
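
A minimal sketch of that workaround, assuming a data frame with a timestamp column and a list column. The names `df`, `vals`, `kth`, and `layers` are hypothetical and do not come from the original report. Each layer holds the k-th element of every vector, with `NA` where a vector is too short, so every layer shares the index of the base plot:

```r
library(xts)

# Hypothetical data: one timestamp per row, a list column of measurements
time0 <- as.POSIXct("2021-11-30 11:24:44", tz = "UTC")
df <- data.frame(time = time0 + 1:3)
df$vals <- list(c(1, 2, 3), c(2, 4), c(3))

# k-th element of a vector, or NA when the vector is shorter than k
kth <- function(v, k) if (length(v) >= k) v[k] else NA_real_

# One xts series per position: all 1st values, all 2nd values, and so on
max_len <- max(lengths(df$vals))
layers <- lapply(seq_len(max_len), function(k) {
  xts(vapply(df$vals, kth, numeric(1), k = k), order.by = df$time)
})

# The first layer has a value at every timestamp, so it defines the plot
plot(layers[[1]], type = "p", pch = 19, ylim = range(unlist(df$vals)))

# Inside a loop the plot object is not auto-printed, so print() it explicitly
for (layer in layers[-1]) {
  print(points(layer, col = "red", pch = 19))
}
```

Because every layer reuses the timestamps of the first one, this stays within the constraint described above: no point is added at a timestamp missing from the initial object.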

@joshuaulrich
Owner

This seems to work with quantmod::chart_Series(), which is what plot.xts() is based on. Can you confirm this is the behavior you expect? If so, this is a bug.

   time0 <- as.POSIXct("2021-11-30 11:24:44.003") 
   tx <- xts(data.frame("c"=1:5,"c1"=2:6),  time0 + c(1,1,1,2,2)) 
   quantmod::chart_Series(tx[c(1, 4), "c"]) 
   quantmod::add_TA(tx[, "c1"], col = "red", pch = 19, type = "p", on = 1) 

image

@joshuaulrich
Owner

Related to #216.

@vspinu
Author

vspinu commented Dec 25, 2021

Yes, this is precisely what I want. Thanks for looking into it!

joshuaulrich added a commit that referenced this issue Dec 26, 2021
I'm not sure why this was set to 'fixed = TRUE', since chart_Series()
sets 'fixed = FALSE'. This argument controls whether or not updates
to the frame/panel can change the y-axis.

See #360.
@joshuaulrich
Owner

These commits fix most cases, but they break one case in issue 216:

# Create some data
# 3 monthly data
x1 <- seq(from = as.Date('2000-01-01'), to = as.Date('2010-12-31'), by = 90)
n <- length(x1)
y1 <- (c(1:n) - n / 2)^3
xts1 <- xts(y1, order.by = x1)
# 2 monthly data
x2 <- seq(from = as.Date('2000-01-01'), to = as.Date('2010-12-31'), by = 60)
y2 <- (c(1:length(x2)) - n / 2)^3
xts2 <- xts(y2, order.by = x2)


plot(xts1)                 # works
points(xts2, col = "red")  # works
points(xts1)               # Error in xy.coords(x, y) : 'x' and 'y' lengths differ

I've spent several hours on this so far, but I still don't feel like I'm close to a complete solution. This is hard because plot.xts() has the observation.based = TRUE feature that chart_Series() doesn't. That feature gives you an equally spaced x-axis regardless of the time between each observation. That requires keeping the $Env$xycoords updated for every series on the plotting device. That functionality doesn't exist right now...
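
As a conceptual illustration only (this is not how xts computes coordinates internally, and `x_time` and `x_obs` are made-up names), the two x-axis modes can be thought of as follows: a time-based axis places each observation at its numeric timestamp, while an observation-based axis places it at its position in the series.

```r
library(xts)

# Irregularly spaced daily series
x <- xts(1:5, as.Date("2022-01-01") + c(1, 30, 500, 570, 610))

# Time-based axis: x-coordinates follow the timestamps, so gaps are preserved
x_time <- as.numeric(index(x))

# Observation-based axis: x-coordinates are just 1..n, so spacing is equal
x_obs <- seq_along(index(x))

diff(x_time)  # unequal gaps between observations, in days
diff(x_obs)   # always 1
```

With positions instead of timestamps, a new series whose timestamps fall between existing observations has no natural coordinate until the positions are recomputed over the combined index, which is why the coordinates would have to be kept up to date for every series on the device.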

joshuaulrich added a commit that referenced this issue Jan 1, 2022
Add a 'get_xcoords()' function and wait to set the x-coordinates until
the plot is drawn. That allows us to add series that have new index
values. It handles the cases when observation.based is either TRUE or
FALSE. You can pass an xts object as an argument in order to get the
x-coordinates that correspond to the observations in the object.

Add new functionality to '$update_frames()':

  1. The update_axis() function finds the unique index values across
     all the series that will be drawn, and then
  2. The update_lenv_data() function updates the series in each panel
     to make sure they all have at least one observation at every
     index value.

See #360.
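
The two steps in this commit message can be sketched with public xts functions. This is an illustration under assumptions, not the package's internal code: `all_index`, `x_full`, and `y_full` are hypothetical names, and merging with a zero-width xts object is used here as a stand-in for what `update_lenv_data()` is described as doing.

```r
library(xts)

x <- xts(1:3, as.Date("2022-01-01") + c(0, 2, 4))
y <- xts(4:6, as.Date("2022-01-01") + c(1, 2, 5))

# Step 1: unique index values across all series that will be drawn
all_index <- sort(unique(c(index(x), index(y))))

# Step 2: give every series a row at every index value;
# merging with a zero-width xts object fills the new rows with NA
x_full <- merge(x, xts(, all_index))
y_full <- merge(y, xts(, all_index))
```

After this, both series share one index, so a single set of x-coordinates can serve every series in the plot.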
joshuaulrich added a commit that referenced this issue Jan 1, 2022
We were only drawing grid lines using the index from xdata[xsubset],
but we should use the combined index from every series in the plot.

See #360.
joshuaulrich added a commit that referenced this issue Jan 1, 2022
Only the first observation would be plotted when the series had
duplicate index values. That's because merge() only matches the first
value when there are duplicate index values. The rest of the values
for each set of duplicates are set to NA.

See #360.
joshuaulrich added a commit that referenced this issue Jan 1, 2022
We should not subset the index for any series if the user doesn't
provide a subset value. Otherwise it's not possible for a new series
to extend the x-axis.

See #360.
@joshuaulrich
Owner

Okay, several more hours of work, and this seems to be working. Now one timestamp can have multiple values, whether observation.based is TRUE or FALSE.

You're also able to add a series that has index values:

  • before the first value in the original series,
  • after the last value in the original series,
  • within the original series, but at different timestamps

Some examples I used for testing/validation:

odd  <- c(1, 30, 500, 570, 610)
even <- odd + 15
even[1] <- 3
x <- xts( 1:5, Sys.Date() + odd)
y <- xts(6:10, Sys.Date() + even)
plot(x, observation.based = FALSE)
points(y)

image

# continued from the chunk above
plot(x, observation.based = TRUE)
points(y)

image

# your example with observation.based = TRUE
time0 <- as.POSIXct("2021-11-30 11:24:44.003")
tx <- xts(data.frame("c"=1:5,"c1"=2:6),  time0 + c(1,1,1,2,2))
plot(tx[c(1, 4), "c"], observation.based = TRUE)
points(tx[, "c1"], col = "red", pch = 19)

image

# your example where the last observation of the new series is
# after the last value of the original series, and
# observation.based = TRUE
time0 <- as.POSIXct("2021-11-30 11:24:44.003")
tx <- xts(data.frame("c"=1:5,"c1"=2:6),  time0 + c(1,1,1,2,3))
plot(tx[c(1, 4), "c"], observation.based = TRUE)
points(tx[, "c1"], col = "red", pch = 19)

image

@joshuaulrich
Owner

Bah, and I just noticed that the x-axis labels are wrong for a couple of those. Back to work...

joshuaulrich added a commit that referenced this issue Jan 1, 2022
We always need the POSIXct values to create and format x-axis labels,
even when 'observation.based = TRUE'. Add 'at_posix = FALSE' argument
so we can always get the POSIXct values.

See #360.
@joshuaulrich
Owner

Okay, the x-axis labels should be fixed in commit 'Optionally return POSIXct values from get_xcoords()'. @vspinu, please test and let me know if you find any issues.

@vspinu
Author

vspinu commented Jan 9, 2022

@joshuaulrich sorry for coming late on this. I have just checked and it works as expected and it's very fast. Thanks a bunch for a quick fix!

@joshuaulrich
Owner

This branch needs another update.

The current behavior of plot.xts() does not extend the x-axis when you add a new series, so the current state of this branch may break existing code.

We should still add this functionality, but it should be optional. We also need to make sure any changes to this branch keep the current behavior to address this issue and #216.

joshuaulrich added a commit that referenced this issue Feb 6, 2022
This branch adds several new features.

  1. Series added to a panel now extend the panel's y-axis

  2. All series added to any panel of a plot now update the x-axis
     of all panels. So the entire plot's x-axis will include every
     series' time index values within the original plot's time range.

  3. Adding a series can now extend the x-axis before/after the plot's
     existing time index range, so all of the new series' time index
     values are included in the plot. This is FALSE by default to
     maintain backward compatibility.

  4. Both y-values will be plotted for series with duplicate index
     values that have different data values.

Closes #360.
@joshuaulrich joshuaulrich changed the title from "points.xts does not add points with repeated index or when index does not much with the original plot" to "points.xts doesn't add points with duplicate index values or when the index does not match the original plot" Oct 5, 2022
@joshuaulrich joshuaulrich added this to the 0.12.2 milestone Oct 6, 2022