Printing more rows than default max has unexpected behavior #734

Open
szimmer opened this issue Apr 3, 2023 · 4 comments

Comments

szimmer commented Apr 3, 2023

When a summary is long, I sometimes add print(n = bignumber) to the end so that the whole summary prints, but this doesn't behave as I would expect. Below, I would expect Example 1 and Example 2 to print the same thing, but they don't. This matters more when there are more rows, as in Examples 3 and 4. It seems n is being ignored altogether.

Here's a reprex to show:

library(skimr)
#> Warning: package 'skimr' was built under R version 4.2.3
library(tidyverse)

sum_cut <- diamonds %>% group_by(cut) %>% skim(carat) %>% yank("numeric")

sum_cut_color <- diamonds %>% group_by(cut, color) %>% skim(carat) %>% yank("numeric")

sum_cut #example 1

Variable type: numeric

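| skim_variable | cut | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|---|
| carat | Fair | 0 | 1 | 1.05 | 0.52 | 0.22 | 0.70 | 1.00 | 1.20 | 5.01 | ▇▂▁▁▁ |
| carat | Good | 0 | 1 | 0.85 | 0.45 | 0.23 | 0.50 | 0.82 | 1.01 | 3.01 | ▇▆▂▁▁ |
| carat | Very Good | 0 | 1 | 0.81 | 0.46 | 0.20 | 0.41 | 0.71 | 1.02 | 4.00 | ▇▃▁▁▁ |
| carat | Premium | 0 | 1 | 0.89 | 0.52 | 0.20 | 0.41 | 0.86 | 1.20 | 4.01 | ▇▆▁▁▁ |
| carat | Ideal | 0 | 1 | 0.70 | 0.43 | 0.20 | 0.35 | 0.54 | 1.01 | 3.50 | ▇▂▁▁▁ |

sum_cut %>% print(n=50) # example 2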
#> 
#> ── Variable type: numeric ──────────────────────────────────────────────────────
#>   skim_variable cut    n_mis…¹ compl…²  mean    sd   p0  p25  p50  p75 p100 hist
#> 1 carat         Fair         0       1 1.05  0.516 0.22 0.7  1    1.2  5.01 ▇▂▁…
#> 2 carat         Good         0       1 0.849 0.454 0.23 0.5  0.82 1.01 3.01 ▇▆▂…
#> 3 carat         Very …       0       1 0.806 0.459 0.2  0.41 0.71 1.02 4    ▇▃▁…
#> 4 carat         Premi…       0       1 0.892 0.515 0.2  0.41 0.86 1.2  4.01 ▇▆▁…
#> 5 carat         Ideal        0       1 0.703 0.433 0.2  0.35 0.54 1.01 3.5  ▇▂▁…
#> # … with abbreviated variable names ¹​n_missing, ²​complete_rate

sum_cut_color # example 3

Variable type: numeric

| skim_variable | cut | color | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| carat | Fair | D | 0 | 1 | 0.92 | 0.41 | 0.25 | 0.70 | 0.90 | 1.01 | 3.40 | ▆▇▁▁▁ |
| carat | Fair | E | 0 | 1 | 0.86 | 0.36 | 0.22 | 0.55 | 0.90 | 1.01 | 2.04 | ▇▇▇▂▁ |
| carat | Fair | F | 0 | 1 | 0.90 | 0.42 | 0.25 | 0.60 | 0.90 | 1.01 | 2.58 | ▇▇▂▁▁ |
| carat | Fair | G | 0 | 1 | 1.02 | 0.49 | 0.23 | 0.70 | 0.98 | 1.07 | 2.60 | ▅▇▂▂▁ |
| carat | Fair | H | 0 | 1 | 1.22 | 0.55 | 0.33 | 0.90 | 1.01 | 1.51 | 4.13 | ▇▃▂▁▁ |
| carat | Fair | I | 0 | 1 | 1.20 | 0.52 | 0.41 | 0.88 | 1.01 | 1.50 | 3.02 | ▇▇▃▂▁ |
| carat | Fair | J | 0 | 1 | 1.34 | 0.73 | 0.30 | 0.90 | 1.03 | 1.69 | 5.01 | ▇▃▁▁▁ |
| carat | Good | D | 0 | 1 | 0.74 | 0.36 | 0.23 | 0.42 | 0.70 | 1.00 | 2.04 | ▇▅▅▁▁ |
| carat | Good | E | 0 | 1 | 0.75 | 0.38 | 0.23 | 0.41 | 0.70 | 1.00 | 3.00 | ▇▅▁▁▁ |
| carat | Good | F | 0 | 1 | 0.78 | 0.37 | 0.23 | 0.49 | 0.71 | 1.01 | 2.67 | ▇▆▁▁▁ |
| carat | Good | G | 0 | 1 | 0.85 | 0.43 | 0.23 | 0.50 | 0.90 | 1.01 | 2.80 | ▇▇▂▁▁ |
| carat | Good | H | 0 | 1 | 0.91 | 0.50 | 0.25 | 0.51 | 0.90 | 1.09 | 3.01 | ▇▇▂▁▁ |
| carat | Good | I | 0 | 1 | 1.06 | 0.58 | 0.30 | 0.70 | 1.00 | 1.50 | 3.01 | ▇▆▃▂▁ |
| carat | Good | J | 0 | 1 | 1.10 | 0.54 | 0.28 | 0.71 | 1.02 | 1.50 | 3.00 | ▇▇▅▂▁ |
| carat | Very Good | D | 0 | 1 | 0.70 | 0.37 | 0.23 | 0.40 | 0.61 | 1.00 | 2.58 | ▇▅▁▁▁ |
| carat | Very Good | E | 0 | 1 | 0.68 | 0.38 | 0.20 | 0.37 | 0.57 | 0.94 | 2.51 | ▇▆▁▁▁ |
| carat | Very Good | F | 0 | 1 | 0.74 | 0.39 | 0.23 | 0.40 | 0.70 | 1.01 | 2.48 | ▇▇▂▁▁ |
| carat | Very Good | G | 0 | 1 | 0.77 | 0.42 | 0.23 | 0.40 | 0.70 | 1.02 | 2.52 | ▇▆▂▁▁ |
| carat | Very Good | H | 0 | 1 | 0.92 | 0.50 | 0.23 | 0.47 | 0.90 | 1.20 | 3.00 | ▇▇▂▁▁ |
| carat | Very Good | I | 0 | 1 | 1.05 | 0.55 | 0.24 | 0.70 | 1.00 | 1.50 | 4.00 | ▇▆▂▁▁ |
| carat | Very Good | J | 0 | 1 | 1.13 | 0.56 | 0.24 | 0.71 | 1.06 | 1.51 | 2.74 | ▇▇▆▃▁ |
| carat | Premium | D | 0 | 1 | 0.72 | 0.40 | 0.20 | 0.40 | 0.58 | 1.01 | 2.57 | ▇▅▂▁▁ |
| carat | Premium | E | 0 | 1 | 0.72 | 0.41 | 0.20 | 0.38 | 0.58 | 1.00 | 3.05 | ▇▃▁▁▁ |
| carat | Premium | F | 0 | 1 | 0.83 | 0.42 | 0.20 | 0.43 | 0.76 | 1.04 | 3.01 | ▇▆▂▁▁ |
| carat | Premium | G | 0 | 1 | 0.84 | 0.48 | 0.23 | 0.40 | 0.76 | 1.12 | 3.01 | ▇▆▂▁▁ |
| carat | Premium | H | 0 | 1 | 1.02 | 0.54 | 0.23 | 0.51 | 1.01 | 1.30 | 3.24 | ▇▇▃▁▁ |
| carat | Premium | I | 0 | 1 | 1.14 | 0.61 | 0.23 | 0.59 | 1.14 | 1.54 | 4.01 | ▇▇▃▁▁ |
| carat | Premium | J | 0 | 1 | 1.29 | 0.61 | 0.30 | 0.81 | 1.25 | 1.70 | 4.01 | ▇▇▃▁▁ |
| carat | Ideal | D | 0 | 1 | 0.57 | 0.30 | 0.20 | 0.33 | 0.50 | 0.71 | 2.75 | ▇▂▁▁▁ |
| carat | Ideal | E | 0 | 1 | 0.58 | 0.31 | 0.20 | 0.33 | 0.50 | 0.72 | 2.28 | ▇▂▁▁▁ |
| carat | Ideal | F | 0 | 1 | 0.66 | 0.37 | 0.23 | 0.35 | 0.53 | 0.90 | 2.45 | ▇▃▁▁▁ |
| carat | Ideal | G | 0 | 1 | 0.70 | 0.41 | 0.23 | 0.34 | 0.54 | 1.03 | 2.54 | ▇▃▂▁▁ |
| carat | Ideal | H | 0 | 1 | 0.80 | 0.49 | 0.23 | 0.36 | 0.70 | 1.11 | 3.50 | ▇▅▁▁▁ |
| carat | Ideal | I | 0 | 1 | 0.91 | 0.55 | 0.23 | 0.41 | 0.74 | 1.22 | 3.22 | ▇▃▂▁▁ |
| carat | Ideal | J | 0 | 1 | 1.06 | 0.58 | 0.23 | 0.54 | 1.03 | 1.41 | 3.01 | ▇▆▃▂▁ |

sum_cut_color %>% print(n=50) # example 4
#> 
#> ── Variable type: numeric ──────────────────────────────────────────────────────
#>    skim_…¹ cut color n_mis…² compl…³  mean    sd   p0   p25   p50  p75 p100 hist
#>  1 carat   Fa… D           0       1 0.920 0.405 0.25 0.7   0.9   1.01 3.4  ▆▇▁…
#>  2 carat   Fa… E           0       1 0.857 0.365 0.22 0.552 0.9   1.01 2.04 ▇▇▇…
#>  3 carat   Fa… F           0       1 0.905 0.419 0.25 0.6   0.9   1.01 2.58 ▇▇▂…
#>  4 carat   Fa… G           0       1 1.02  0.493 0.23 0.7   0.98  1.07 2.6  ▅▇▂…
#>  5 carat   Fa… H           0       1 1.22  0.548 0.33 0.9   1.01  1.51 4.13 ▇▃▂…
#>  6 carat   Fa… I           0       1 1.20  0.522 0.41 0.885 1.01  1.50 3.02 ▇▇▃…
#>  7 carat   Fa… J           0       1 1.34  0.734 0.3  0.905 1.03  1.68 5.01 ▇▃▁…
#>  8 carat   Go… D           0       1 0.745 0.363 0.23 0.42  0.7   1    2.04 ▇▅▅…
#>  9 carat   Go… E           0       1 0.745 0.381 0.23 0.41  0.7   1    3    ▇▅▁…
#> 10 carat   Go… F           0       1 0.776 0.370 0.23 0.49  0.71  1.01 2.67 ▇▆▁…
#> 11 carat   Go… G           0       1 0.851 0.433 0.23 0.5   0.9   1.01 2.8  ▇▇▂…
#> 12 carat   Go… H           0       1 0.915 0.498 0.25 0.51  0.9   1.09 3.01 ▇▇▂…
#> 13 carat   Go… I           0       1 1.06  0.576 0.3  0.7   1     1.5  3.01 ▇▆▃…
#> 14 carat   Go… J           0       1 1.10  0.537 0.28 0.71  1.02  1.5  3    ▇▇▅…
#> 15 carat   Ve… D           0       1 0.696 0.369 0.23 0.4   0.61  1    2.58 ▇▅▁…
#> 16 carat   Ve… E           0       1 0.676 0.378 0.2  0.37  0.57  0.94 2.51 ▇▆▁…
#> 17 carat   Ve… F           0       1 0.741 0.389 0.23 0.4   0.7   1.01 2.48 ▇▇▂…
#> 18 carat   Ve… G           0       1 0.767 0.418 0.23 0.4   0.7   1.02 2.52 ▇▆▂…
#> 19 carat   Ve… H           0       1 0.916 0.503 0.23 0.467 0.9   1.2  3    ▇▇▂…
#> 20 carat   Ve… I           0       1 1.05  0.552 0.24 0.7   1.00  1.5  4    ▇▆▂…
#> 21 carat   Ve… J           0       1 1.13  0.556 0.24 0.71  1.06  1.51 2.74 ▇▇▆…
#> 22 carat   Pr… D           0       1 0.722 0.397 0.2  0.4   0.58  1.01 2.57 ▇▅▂…
#> 23 carat   Pr… E           0       1 0.718 0.410 0.2  0.38  0.58  1    3.05 ▇▃▁…
#> 24 carat   Pr… F           0       1 0.827 0.420 0.2  0.43  0.76  1.04 3.01 ▇▆▂…
#> 25 carat   Pr… G           0       1 0.841 0.480 0.23 0.4   0.755 1.12 3.01 ▇▆▂…
#> 26 carat   Pr… H           0       1 1.02  0.544 0.23 0.51  1.01  1.3  3.24 ▇▇▃…
#> 27 carat   Pr… I           0       1 1.14  0.614 0.23 0.59  1.14  1.54 4.01 ▇▇▃…
#> 28 carat   Pr… J           0       1 1.29  0.614 0.3  0.81  1.25  1.7  4.01 ▇▇▃…
#> 29 carat   Id… D           0       1 0.566 0.299 0.2  0.33  0.5   0.71 2.75 ▇▂▁…
#> 30 carat   Id… E           0       1 0.578 0.313 0.2  0.33  0.5   0.72 2.28 ▇▂▁…
#> 31 carat   Id… F           0       1 0.656 0.375 0.23 0.35  0.53  0.9  2.45 ▇▃▁…
#> 32 carat   Id… G           0       1 0.701 0.411 0.23 0.34  0.54  1.03 2.54 ▇▃▂…
#> 33 carat   Id… H           0       1 0.800 0.487 0.23 0.36  0.7   1.11 3.5  ▇▅▁…
#> 34 carat   Id… I           0       1 0.913 0.554 0.23 0.41  0.74  1.22 3.22 ▇▃▂…
#> 35 carat   Id… J           0       1 1.06  0.582 0.23 0.54  1.03  1.41 3.01 ▇▆▃…
#> # … with abbreviated variable names ¹​skim_variable, ²​n_missing, ³​complete_rate

Created on 2023-04-03 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31 ucrt)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.utf8
#>  ctype    English_United States.utf8
#>  tz       America/New_York
#>  date     2023-04-03
#>  pandoc   2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version date (UTC) lib source
#>  assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.2.2)
#>  backports       1.4.1   2021-12-13 [1] CRAN (R 4.2.0)
#>  base64enc       0.1-3   2015-07-28 [1] CRAN (R 4.2.0)
#>  broom           1.0.1   2022-08-29 [1] CRAN (R 4.2.2)
#>  cellranger      1.1.0   2016-07-27 [1] CRAN (R 4.2.2)
#>  cli             3.4.1   2022-09-23 [1] CRAN (R 4.2.2)
#>  colorspace      2.0-3   2022-02-21 [1] CRAN (R 4.2.2)
#>  crayon          1.5.2   2022-09-29 [1] CRAN (R 4.2.2)
#>  DBI             1.1.3   2022-06-18 [1] CRAN (R 4.2.2)
#>  dbplyr          2.2.1   2022-06-27 [1] CRAN (R 4.2.2)
#>  digest          0.6.30  2022-10-18 [1] CRAN (R 4.2.2)
#>  dplyr         * 1.1.0   2023-01-29 [1] CRAN (R 4.2.2)
#>  ellipsis        0.3.2   2021-04-29 [1] CRAN (R 4.2.2)
#>  evaluate        0.18    2022-11-07 [1] CRAN (R 4.2.2)
#>  fansi           1.0.3   2022-03-24 [1] CRAN (R 4.2.2)
#>  fastmap         1.1.0   2021-01-25 [1] CRAN (R 4.2.2)
#>  forcats       * 0.5.2   2022-08-19 [1] CRAN (R 4.2.2)
#>  fs              1.5.2   2021-12-08 [1] CRAN (R 4.2.2)
#>  gargle          1.2.1   2022-09-08 [1] CRAN (R 4.2.2)
#>  generics        0.1.3   2022-07-05 [1] CRAN (R 4.2.2)
#>  ggplot2       * 3.4.0   2022-11-04 [1] CRAN (R 4.2.2)
#>  glue            1.6.2   2022-02-24 [1] CRAN (R 4.2.2)
#>  googledrive     2.0.0   2021-07-08 [1] CRAN (R 4.2.2)
#>  googlesheets4   1.0.1   2022-08-13 [1] CRAN (R 4.2.2)
#>  gtable          0.3.1   2022-09-01 [1] CRAN (R 4.2.2)
#>  haven           2.5.1   2022-08-22 [1] CRAN (R 4.2.2)
#>  highr           0.9     2021-04-16 [1] CRAN (R 4.2.2)
#>  hms             1.1.2   2022-08-19 [1] CRAN (R 4.2.2)
#>  htmltools       0.5.3   2022-07-18 [1] CRAN (R 4.2.2)
#>  httr            1.4.4   2022-08-17 [1] CRAN (R 4.2.2)
#>  jsonlite        1.8.4   2022-12-06 [1] CRAN (R 4.2.2)
#>  knitr           1.40    2022-08-24 [1] CRAN (R 4.2.2)
#>  lifecycle       1.0.3   2022-10-07 [1] CRAN (R 4.2.2)
#>  lubridate       1.9.0   2022-11-06 [1] CRAN (R 4.2.2)
#>  magrittr        2.0.3   2022-03-30 [1] CRAN (R 4.2.2)
#>  modelr          0.1.9   2022-08-19 [1] CRAN (R 4.2.2)
#>  munsell         0.5.0   2018-06-12 [1] CRAN (R 4.2.2)
#>  pillar          1.8.1   2022-08-19 [1] CRAN (R 4.2.2)
#>  pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.2.2)
#>  purrr         * 1.0.1   2023-01-10 [1] CRAN (R 4.2.2)
#>  R.cache         0.16.0  2022-07-21 [1] CRAN (R 4.2.2)
#>  R.methodsS3     1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo            1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils         2.12.2  2022-11-11 [1] CRAN (R 4.2.2)
#>  R6              2.5.1   2021-08-19 [1] CRAN (R 4.2.2)
#>  readr         * 2.1.3   2022-10-01 [1] CRAN (R 4.2.2)
#>  readxl          1.4.1   2022-08-17 [1] CRAN (R 4.2.2)
#>  repr            1.1.6   2023-01-26 [1] CRAN (R 4.2.3)
#>  reprex          2.0.2   2022-08-17 [1] CRAN (R 4.2.2)
#>  rlang           1.0.6   2022-09-24 [1] CRAN (R 4.2.2)
#>  rmarkdown       2.17    2022-10-07 [1] CRAN (R 4.2.2)
#>  rstudioapi      0.14    2022-08-22 [1] CRAN (R 4.2.2)
#>  rvest           1.0.3   2022-08-19 [1] CRAN (R 4.2.2)
#>  scales          1.2.1   2022-08-20 [1] CRAN (R 4.2.2)
#>  sessioninfo     1.2.2   2021-12-06 [1] CRAN (R 4.2.2)
#>  skimr         * 2.1.5   2022-12-23 [1] CRAN (R 4.2.3)
#>  stringi         1.7.8   2022-07-11 [1] CRAN (R 4.2.1)
#>  stringr       * 1.5.0   2022-12-02 [1] CRAN (R 4.2.2)
#>  styler          1.8.1   2022-11-07 [1] CRAN (R 4.2.2)
#>  tibble        * 3.1.8   2022-07-22 [1] CRAN (R 4.2.2)
#>  tidyr         * 1.3.0   2023-01-24 [1] CRAN (R 4.2.2)
#>  tidyselect      1.2.0   2022-10-10 [1] CRAN (R 4.2.2)
#>  tidyverse     * 1.3.2   2022-07-18 [1] CRAN (R 4.2.2)
#>  timechange      0.1.1   2022-11-04 [1] CRAN (R 4.2.2)
#>  tzdb            0.3.0   2022-03-28 [1] CRAN (R 4.2.2)
#>  utf8            1.2.2   2021-07-24 [1] CRAN (R 4.2.2)
#>  vctrs           0.5.2   2023-01-23 [1] CRAN (R 4.2.2)
#>  withr           2.5.0   2022-03-03 [1] CRAN (R 4.2.2)
#>  xfun            0.34    2022-10-18 [1] CRAN (R 4.2.2)
#>  xml2            1.3.3   2021-11-30 [1] CRAN (R 4.2.2)
#>  yaml            2.3.6   2022-10-18 [1] CRAN (R 4.2.1)
#> 
#>  [1] C:/Program Files/R/R-4.2.2/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
Collaborator

elinw commented Apr 4, 2023

Thanks for this report. Can you clarify what you mean by printing "the same thing"?

sum_cut and sum_cut_color both have class:

[1] "one_skim_df" "tbl_df" "tbl"
[4] "data.frame"

sum_cut has 5 rows, while sum_cut_color has 35 rows.
sum_cut has 12 columns, while sum_cut_color has 13.

So you wouldn't expect them to be identical when they print because they are not identical objects.

Can you clarify what you mean by expecting them to print the same thing?
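
The comparison above can be checked directly. This is a minimal sketch, assuming the objects are built as in the reprex and that ggplot2 (for the diamonds data) and dplyr are installed; the row and column counts in the comments are the ones reported in this thread, not independently verified output.

```r
library(skimr)
library(dplyr)

# Rebuild the two objects from the reprex.
sum_cut <- ggplot2::diamonds %>%
  group_by(cut) %>%
  skim(carat) %>%
  yank("numeric")

sum_cut_color <- ggplot2::diamonds %>%
  group_by(cut, color) %>%
  skim(carat) %>%
  yank("numeric")

# Both objects share the same class vector.
class(sum_cut)
class(sum_cut_color)

# Dimensions differ: reported above as 5 x 12 and 35 x 13.
dim(sum_cut)
dim(sum_cut_color)
```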

Author

szimmer commented Apr 4, 2023

One prints as a nice HTML table and the other does not. The n parameter on print() also seems to be ignored.

Collaborator

elinw commented Apr 5, 2023

I think it's the use of the print() function; I don't see how the table formatting should be affected by the value of n. When I use n=3 I get 3 rows, so I don't think the parameter is ignored. However, it does seem that knit_print.one_skim_df() is not being used.

However, using knit_print() instead of print() seems to solve that.
See:

https://rpubs.com/elinw/1024699

@michaelquinn32 It seems like something is going wrong with the print dispatch? Or is calling print explicitly essentially an override?
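
The difference between the two calls discussed above can be sketched as follows. This is an illustration rather than a confirmed diagnosis: it assumes skimr 2.1.5 (as in the session info), that ggplot2 is installed for the diamonds data, and that the code runs inside a knitted document (for example via reprex or R Markdown), where the rendering difference is visible.

```r
library(skimr)
library(dplyr)
library(knitr)

sum_cut_color <- ggplot2::diamonds %>%
  group_by(cut, color) %>%
  skim(carat) %>%
  yank("numeric")

# Explicit print(): console-style (tibble) output. As noted above, n
# limits the number of rows shown, so it is not ignored.
print(sum_cut_color, n = 3)

# Explicit knit_print(): asks for the knitr rendering instead of the
# console one. In this thread it was reported to restore the table output.
knit_print(sum_cut_color)
```

Note that neither call takes the place of the other: print(n = ...) controls how many rows the console method shows, while knit_print() selects the knitr rendering.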

Collaborator

elinw commented May 19, 2023

I did notice that things seemed to go a bit wild with several hundred variables all of the same type. Maybe we need to be thinking about pagination.
