Introduce convert_chunk_header() to convert fenced chunk head to block chunk option #2149

cderv · 2022-07-11T16:52:57Z

This PR introduces a new function that lets a user convert all chunks in a document to the new chunk option syntax:
https://yihui.org/en/2022/01/knitr-news/

The first supported conversion is to multiline R chunk options; a YAML chunk header could be supported next.

To introduce the feature, I refactored a few small pieces of the existing parsing code into utility functions.

The target format is one option per line.

@yihui I am opening this PR now so that you can already share thoughts and feedback.

You mentioned getParseData() would be useful to get one option per line, but it seems I managed to get something working without it. Did I miss anything?
For the label part, do we want it implicit or explicit?

#| label = "first-chunk",
#| echo = FALSE

or

#| first-chunk,
#| echo = FALSE

For now, I did the first (explicit) form.

What behavior do we want for the function?
- Overwrite by default?
- Output in the console?
- Ask interactively whether to overwrite?

Also, for the input checks, I am targeting .Rmd and .qmd files. I think that is OK.

I need to test a bit more, and also see whether I can derive the YAML syntax from this implementation.

yihui

You mentioned getParseData() would be useful to get one option per line, but it seems I managed to get something working without it. Did I miss anything?

For the label part, do we want it implicit or explicit?

I see that you parsed the options. In that case, there is no need to use getParseData(). Originally I was thinking of simply extracting the chunk options as a single string, and then applying strwrap(prefix = '#| '). If users want one option per line, we'll need to figure out which commas are true separators, e.g., getParseData(parse(text = 'alist(foo, bar=1, baz="a,b,c")')) will tell you.
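
As a rough illustration of that last point, the sketch below uses the parse data to find only the commas that separate top-level arguments. The option string and variable names are made up for the example, and it assumes a single-line, ASCII option string; it is not code from this PR.

```r
# Illustrative option string: one comma sits inside a string, one inside c().
opts <- 'foo, bar=1, baz="a,b,c", qux=c(1, 2)'
code <- paste0("alist(", opts, ")")

# keep.source = TRUE is needed so that parse data is recorded.
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# The outermost call is the only expression whose parent is 0.
top_id <- pd$id[pd$parent == 0 & pd$token == "expr"]

# True separators are the comma tokens that are direct children of that call.
top_commas <- pd[pd$token == "','" & pd$parent == top_id, ]

# Translate comma columns back to positions in `opts` and cut there.
cut <- sort(top_commas$col1) - nchar("alist(")
pieces <- trimws(substring(opts, c(1, cut + 1), c(cut - 1, nchar(opts))))
print(pieces)
```

Commas inside the string constant or inside the nested `c()` call have a different parent (or are not tokens at all), so they are left alone.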

I feel you don't have to start with Rmd; you could make a general converter instead (Rmd does need a tiny special treatment, though). What I'd do is:

1. Read the file.
2. detect_pattern() and determine the regular expression chunk.begin from all_patterns.
3. Extract the chunk options and strwrap() them.
4. Append the wrapped #| options and write out the content.
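
The steps above can be sketched roughly as follows for the Markdown case only. `sketch_convert()` and its regular expression are hypothetical stand-ins written for this illustration; they are not knitr's `detect_pattern()` or the real `chunk.begin` pattern, and the sketch works on a character vector rather than reading a file.

```r
# Hypothetical helper: move Markdown chunk header options into `#| ` lines.
sketch_convert <- function(lines, width = 40) {
  # Simplified stand-in for a Markdown chunk.begin pattern:
  # fence + "{" | engine | options | "}"
  chunk_begin <- "^(\\s*`{3,}\\s*\\{)([a-zA-Z0-9_]+)\\s*,?\\s*(.*?)\\s*(\\})\\s*$"
  out <- character()
  for (line in lines) {
    m <- regmatches(line, regexec(chunk_begin, line, perl = TRUE))[[1]]
    if (length(m) == 0 || !nzchar(m[4])) {
      # Not a chunk header, or a header without options: keep as is.
      out <- c(out, line)
      next
    }
    # Keep only the engine in the header (the Rmd special treatment).
    header <- paste0(m[2], m[3], m[5])
    # Wrap the option string, prefixing every wrapped line with "#| ".
    body <- strwrap(m[4], width = width, prefix = "#| ")
    out <- c(out, header, body)
  }
  out
}

fence <- strrep("`", 3)
doc <- c(paste0(fence, "{r, echo = FALSE, fig.width = 5}"), "plot(1)", fence)
writeLines(sketch_convert(doc))
```

Because strwrap() breaks on whitespace, a long option value can still be split in the middle of an expression, which is the limitation discussed later in this thread.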

We can consider the two cases later in different PRs: one option per line (comma-separated), and converting to YAML.

What behavior do we want for the function:

Have an argument output = NULL, which means writing output to console. Users can specify the output file path, in which case we write to that file. Alternatively, it can take a function that takes the input value and returns a file path, e.g., output = identity means overwriting the original input file.


cderv · 2022-07-12T11:05:51Z

Thanks for your feedback. That was mainly my idea at first, but I was not sure if making it generic would be expected.

applying strwrap(prefix = '#| ')

I have reworked the solution to use that. I tried it at first, but what I don't like is that it splits lines in the middle of the code. I don't think anyone would write options like that; they would split on commas. Here is an example with a long chunk header:

```{verbatim}
#| lang = "markdown", code =
#| stringr::str_trim(knitr::knit_child("examples/py-engine.Rmd",
#| quiet = TRUE))
```

from

```{verbatim, lang = "markdown", code = stringr::str_trim(knitr::knit_child("examples/py-engine.Rmd", quiet = TRUE))}
```

Have an argument output = NULL, which means writing output to console. Users can specify the output file path, in which case we write to that file. Alternatively, it can take a function that takes the input value and returns a file path, e.g., output = identity means overwriting the original input file.

I always forget about this trick of using a function. So now output can be:

- NULL, which writes to the console
- a function, which is applied to the input path
- a character string, which is used as the output file path
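
A minimal sketch of how those three forms of `output` could be dispatched is shown below. `write_result()` is a hypothetical helper written for this illustration and is not the function added by the PR.

```r
# Hypothetical helper illustrating the three accepted forms of `output`.
write_result <- function(text, input, output = NULL) {
  if (is.null(output)) {
    # Default: print to the console and return the text invisibly.
    writeLines(text)
    return(invisible(text))
  }
  # A function maps the input path to an output path;
  # `identity` therefore means overwriting the input file.
  if (is.function(output)) output <- output(input)
  stopifnot(is.character(output), length(output) == 1)
  writeLines(text, output)
  invisible(output)
}

# Usage with a temporary file standing in for the input document.
input <- tempfile(fileext = ".Rmd")
writeLines("old content", input)
write_result("new content", input, output = identity)
readLines(input)
```

Accepting a function keeps the common "overwrite in place" case short without adding a separate logical argument.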

yihui

Looks good to me overall. Thanks!


cderv · 2022-07-15T17:08:12Z

@yihui after giving it more thought and trying the getParseData() way, I switched back to parse_params() combined with deparse(). It seems to me that it gives better formatting than trying to split the result of getParseData() on commas. I ended up selecting some commas in the middle of a function call or list, and did not like that.

This way seems cleaner to me. What do you think?

There is still some room to improve line wrapping for the multiline type, so that a very long option also gets split over several lines with the comment pipe (#|) added.

Maybe you can review once more; then I'll write the documentation on Monday and do further testing. We can add YAML support later.
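
To illustrate the parse-then-deparse idea in base R only, here is a small sketch. `one_per_line()` is a hypothetical function, not knitr's `parse_params()`, and it assumes every option is named (for example `label = "first-chunk"`), since a bare label such as `first-chunk` would be parsed as a subtraction.

```r
# Hypothetical sketch: parse the options as arguments of alist(), then
# deparse each value so that every option lands on its own `#| ` line.
one_per_line <- function(opts, prefix = "#| ") {
  args <- as.list(parse(text = paste0("alist(", opts, ")"))[[1]])[-1]
  if (length(args) == 0) return(character())
  # A large width.cutoff keeps each deparsed value on a single line.
  vals <- vapply(
    args,
    function(x) paste(deparse(x, width.cutoff = 500L), collapse = " "),
    character(1)
  )
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  items <- ifelse(nzchar(nms), paste(nms, "=", vals), vals)
  # Every line but the last keeps a trailing comma.
  paste0(prefix, items, c(rep(",", length(items) - 1), ""))
}

writeLines(one_per_line('label = "first-chunk", echo = FALSE'))
```

Because the split happens on parsed arguments rather than on comma characters, commas inside nested calls or strings stay with their option.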


cderv · 2022-07-19T13:29:14Z

@yihui I just saw that Rnw files do not have an engine, unlike Rmd. Is the engine set only for .Rmd documents?
As we are supposed to handle the different patterns with detect_pattern(), I guess we need to account for all the syntaxes.

It seems I need to take this into account:

knitr/R/parser.R

Lines 73 to 80 in ca398c2

    
    parse_block = function(code, header, params.src, markdown_mode = out_format('markdown')) {
      params = params.src
      engine = 'r'
      # consider the syntax ```{engine, opt=val} for chunk headers
      if (markdown_mode) {
        engine = sub('^([a-zA-Z0-9_]+).*$', '\\1', params)
        params = sub('^([a-zA-Z0-9_]+)', '', params)
      }
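
To make the effect of the two sub() calls in this excerpt concrete, the standalone sketch below applies the same regular expressions to made-up header strings. `split_engine()` is a hypothetical wrapper written for illustration and is not part of knitr.

```r
# Hypothetical wrapper around the two sub() calls from the excerpt.
split_engine <- function(params, markdown_mode = TRUE) {
  engine <- "r"
  if (markdown_mode) {
    # The leading word is the engine; the rest are the chunk options.
    engine <- sub("^([a-zA-Z0-9_]+).*$", "\\1", params)
    params <- sub("^([a-zA-Z0-9_]+)", "", params)
  }
  list(engine = engine, params = params)
}

# Markdown header: the engine is split off the front.
md <- split_engine("python, echo = FALSE")
# Non-Markdown header: nothing is split off and the engine stays 'r'.
rnw <- split_engine("my-label, echo = FALSE", markdown_mode = FALSE)
str(md)
str(rnw)
```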

Probably doing

knitr/R/parser.R

Line 15 in ca398c2

markdown_mode = identical(patterns, all_patterns$md)

right?

But if there is no engine, what are we supposed to use as the comment prefix for non-Markdown documents?
For Rnw, this is %| I believe (comment_chars$tikz), but what are the other non-Rmd extensions supported by knitr?
Rhtml?

I guess we have to support all of these:

names(knitr:::all_patterns)
#> [1] "rnw"      "brew"     "tex"      "html"     "md"       "rst"      "asciidoc"
#> [8] "textile"

Thanks


cderv · 2022-07-19T14:09:33Z

OK. In fact, based on partition_chunk(), I think I got it right by assuming no engine for non-Markdown documents and assigning engine = 'r' to get the associated comment character.
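
The idea can be sketched as below. The lookup table is a hypothetical stand-in containing only the two engines named in this thread, not knitr's internal comment_chars, and the fallback to `#` for unknown engines is an assumption of this sketch.

```r
# Hypothetical lookup table; only the two entries named in this thread.
comment_prefix <- c(r = "#", tikz = "%")

option_prefix <- function(engine = NULL, markdown_mode = TRUE) {
  # Non-Markdown chunks carry no engine in the header, so assume R.
  if (!markdown_mode || is.null(engine)) engine <- "r"
  char <- comment_prefix[engine]
  # Assumption of this sketch: unknown engines fall back to "#".
  if (is.na(char)) char <- "#"
  paste0(unname(char), "| ")
}

option_prefix("tikz")                 # Markdown chunk with the tikz engine
option_prefix(markdown_mode = FALSE)  # e.g. an Rnw chunk
```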

cderv · 2022-07-19T15:06:37Z

Tests in knitr-examples went well:

files <- fs::dir_ls(regexp = "[0-9].*\\.(brew|R(nw|md|tex|html|rst|asciidoc|textile))")
purrr::walk(files, ~ { message(.x) ; convert_chunk_header(.x, output = identity, type = "multiline")})

and the same for type = 'wrap'.

I added a PowerShell script to knitr-examples to help me test everything: yihui/knitr-examples@f2cc2eb

yihui

I made some cosmetic changes, and I think it's ready to merge now. Thanks!

I might give type = "yaml" a try later.

cderv · 2022-07-21T18:47:31Z

Thanks !

cderv added 2 commits July 11, 2022 15:02

Introduce convert_chunk_header() to convert to multiline R options

daead33

Correctly write expression to character

273a11a

cderv linked an issue Jul 11, 2022 that may be closed by this pull request

Feature request: function to convert chunk options to YAML-style #2082

Closed

3 tasks

yihui reviewed Jul 11, 2022


rewrite to take into feedback

0f5c541

* detect pattern on file extension * simplify for now and use strwrap() * correctly handle output feature: default to console and allow identity for overwriting

yihui approved these changes Jul 12, 2022


yihui reviewed Jul 12, 2022


cderv added 8 commits July 15, 2022 17:25

detect_pattern() uses lowercase extensions

db027c4

Use getParseData() to split by commas

767b45d

remove duplicates

072f09b

Keep indentation

ee084b0

Go back to use parse_params()

3cbe1fd

as splitting parsed_data on commas does not get nice results. It will also split on comma within R code used.

Merge branch 'master' into convert-chunk-header

07577d5

Add width parameter for type 'wrap'

5ae11c4

missing comma

66c6436

cderv added 4 commits July 15, 2022 19:11

correct calculation of indentation

fd75f9a

Also wrap each line in mutiline mode when a single option is too long

da86fca

`wrap_width` can be set to FALSE to deactivate wrapping when `type = "multiline"`

Document the new function

821c775

Add examples

0d7e945

cderv added 3 commits July 19, 2022 16:03

Support each formats

240e790

Revert "Support each formats"

3129fa5

This reverts commit 240e790.

Support non Rmd format

a17e102

Chunk does not have an engine. They defaults to `r` engine for the comment char. (based on `partition_chunk` implementation.

ignore brew files

236dc26

cderv marked this pull request as ready for review July 19, 2022 15:13

cderv requested a review from yihui July 19, 2022 15:19

yihui added 2 commits July 21, 2022 09:39

default to yaml for quarto

bf9a3a3

mostly cosmetic changes

ea6c089

yihui approved these changes Jul 21, 2022


yihui merged commit 553b692 into master Jul 21, 2022

yihui deleted the convert-chunk-header branch July 21, 2022 15:30

IndrajeetPatil mentioned this pull request Jul 22, 2022

Using in-body syntax for knitr chunk options creates problems for pkgdown in presence of child documents r-lib/pkgdown#2168

Closed

cderv mentioned this pull request Jul 25, 2022

Convert chunk option to YAML #2151

Merged

zkamvar mentioned this pull request Dec 6, 2022

Add support for chunk options via knitr 1.35 ropensci/tinkr#55

Open

3 tasks

github-actions bot locked as resolved and limited conversation to collaborators Jan 18, 2023
