Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mergeable rustdoc cross-crate info #3662

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

EtomicBomb
Copy link

@EtomicBomb EtomicBomb commented Jun 20, 2024

This RFC is to enable merging cross-crate information (like the search index, index.html, source files index) from separate output directories, so that rustdoc can run in parallel in non-cargo build systems.

Currently, rustdoc updates these cross-crate files with the information from a new crate by reading the old version, adding the information for the current crate, and writing the same file back. This causes issues in build systems that disallow multiple build actions targeting the same output files. Cargo supports cross-crate information because it doesn't have this restriction, Buck2 and Bazel do not support cross-crate information, and Fuchsia has a doc step that runs rustdoc serially on 2700 crates.

Supporting parallel rustdoc with CCI as a native step in Buck2, Bazel, and GN (Fuchsia + Chrome) would require mergable cross-crate information, which is what this proposal is about. It adds new flags, --include-info-json, --write-info-json, --merge=none|read-write|write-only and --include-rendered-docs, to opt-in to these new features.

Rendered

@GuillaumeGomez
Copy link
Member

cc @rust-lang/rustdoc

@ehuss ehuss added the T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC. label Jun 20, 2024
@aDotInTheVoid
Copy link
Member

text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
@jsha
Copy link

jsha commented Jun 22, 2024

I'm very in favor of this! In addition to the parallel building use case, I think it will make it much more feasible to support documentation of namespaced crates as a group. We discussed this a bit on Zulip last month. And merging the parts after doc generation is generally a nicer design.

For implementation details: I'd like for the cross crate info to be written in JSON, not JS, and for the merge step to generate the JS. The JS as it stands is just a very thin callback wrapper around JSON. We do this so docs can be loaded from file:// URLs (where fetching JSON files is not supported). But we don't need the merge step to understand the JS.

Some bikeshedding about the flags: I don't like the names --write-rendered-cci and --read-rendered-cci because they're long, because they contain an acronym that will be opaque to all but rustdoc implementers (cci), and because it's not clear to me that all four combinations of (true, false) values are valid or useful.

It seems to me that the merge step is effectively a subcommand, except that since rustdoc doesn't have subcommands it's triggered by a combination of --write-rendered-cci and --fetch-parts. Is that right?

Could you add a section to the RFC about compatibility? For compatibility on disk, I'm assuming we won't support merging cross crate info that was generated by disparate versions of rustdoc.

As an alternate design, what about this:

  • unconditionally start generating a crate-info.json in the root of any crate's doc build.
  • crate-info contains all the different kinds of potentially cross crate info (src files, implementers, etc)
  • merge step reads crate-info.json from N different crates and generates .js in the appropriate places
  • New flag: --merge=[auto|none|<list of paths>]
  • auto is the default value, and causes the equivalent of the current behavior. At the end of each run, rustdoc takes a lock and reads/writes from the out dir to do merge.
  • none is the future default. Individual rustdoc invocations do no merging; it is up to the build system to do a final merge step.
    - <list of paths> causes the merge step to be invoked and no other work to be done. Causes most other rustdoc flags to be ignored.

@notriddle
Copy link
Contributor

notriddle commented Jun 22, 2024

I would prefer not to put large files in the target/doc folder that aren't actually used by the frontend. People should be able to upload that folder to their webhost without having to filter out subfolders to avoid wasting space.

Instead, I'd design the CLI with two parameters:

  • --write-info-json=PATH: When supplied, the crate info json (search index, search desc shards, search type param names, foreign trait impls, type aliases, and source code file tree) gets written to a JSON file supplied by PATH. Cargo will use a path that's somewhere in target/, but not in the target/doc folder, so publishers don't accidentally upload files to their web host that don't actually need to be there. Other build systems can use a location that happens to be convenient.
  • --include-info-json=PATH1,PATH2,...: When --write-info-json isn't supplied, the info files are written the same way they are now. If --include-info-json is supplied, rustdoc will additionally include the data from the supplied paths.

@EtomicBomb
Copy link
Author

EtomicBomb commented Jun 22, 2024

@jsha, I love that there are more use cases than we initially anticipated!

I agree that the intermediate cross-crate info should be written in JSON, and for the merge step to generate JS (and HTML where applicable)! The current proposal / WIP writes intermediate files as JSON (src-files-js, search-index-js, search-desc, all-crates, crates-index, type-impl, trait-impl) with --parts-out-dir=PATH. It renders the JS + HTML versions upon --write-rendered-cci (default). It reads to and appends to CCI in the doc root upon --read-rendered-cci (default), to match the current behavior of rustdoc. I'll put the JSON files into a single crates-info.json and change the names of the options so they make more sense.

Here are what the combinations of --read-rendered-cci and --write-rendered-cci meant. Yes, only some of them are useful, so I agree that a --merge=[] will make more sense.

  • --read-rendered-cci=false --write-rendered-cci=false --parts-out-dir=target/doc.parts: write the pre-rendered cross-crate information parts files. Merge these with another crates version with another invocation of rustdoc --fetch-parts <...>
  • --read-rendered-cci=false --write-rendered-cci=false (leaving out --parts-out-dir): Does not read or write any cross-crate information. The user is only interested in item docs.
  • --read-rendered-cci=false --write-rendered-cci=true: write the cross-crate information for the current crate to src-files.js, search-index.js, type.impl/**/*.js, ignoring the files that are already there. This user may have a hermetic build system that monitors reads.
  • --read-rendered-cci=true --write-rendered-cci=false: not useful - we only read CCI to append to existing CCI
  • --read-rendered-cci=true --write-rendered-cci=true: rustdoc's current behavior

I initially conceptualized merging as a subcommand, but you could also see it as an incremental enhancement: if you're calling rustdoc on one of your binaries or libraries, you still might want to link-in the crate-info.json for other crates, without an additional rustdoc invocation. For pure merges though, we could make something that looks more like a merge subcommand: maybe a third party tool that effectively merges into a dummy crate with an empty lib.rs?

It seems like --include-info-json=PATH1,PATH2,... is similar to what this proposal means by --fetch-parts. And --write-info-json=PATH is like --parts-out-dir, and that crate-info.json is like the concatenation of {src-files-js, search-index-js, search-desc, all-crates, crates-index, type-impl, trait-impl}.

Based off of @notriddle and @jsha's feedback, I will update the RFC + WIP with these items:

  • Replace {src-files-js, search-index-js, search-desc, all-crates, crates-index, type-impl, trait-impl} into a single file crate-info.json
  • Rename --fetch-parts to the suggested --include-info-json
  • Rename --parts-out-dir to the suggested --write-info-json
  • Create a flag --merge=auto to replace --read-rendered-cci=true --write-rendered-cci=true
  • Create a flag --merge=none to replace --read-rendered-cci=false --write-rendered-cci=false
  • Create a flag --merge=no-read to replace --read-rendered-cci=false --write-rendered-cci=true
  • Describe the compatibility promises: crate-info.json is only stable within a version of rustdoc. The CCI after this proposal is not byte-for-byte identical with those produced by previous versions of rustdoc,but should have the same effect when executed by a browser (modulo sorting order).

I'll have to work on these suggestions on Monday, but definitely give me more feedback. Thank you for engaging with this proposal!

@EtomicBomb
Copy link
Author

EtomicBomb commented Jun 25, 2024

Updated the RFC with the items mentioned in my previous comment. I chose the name --merge=write-only instead of --merge=no-read

rustdoc -Z unstable-options --crate-name=i --crate-type=lib --edition=2021 --enable-index-page --out-dir=i/target/doc --extern-html-root-url s=$(MERGED) --extern-html-root-url t=$(MERGED) --extern-html-root-url i=$(MERGED) --merge=write-only --include-info-json t=t/target/doc.parts --include-info-json s=s/target/doc.parts --extern t=t/target/libt.rmeta --extern s=s/target/libs.rmeta -L t/target i/src/lib.rs
```

Merge the docs with `cp`. This can be avoided if `--out-dir=$(MERGED)` is used for all of the rustdoc calls. We copy here to illustrate that documenting `s` is independent of documenting `t`, and could happen on separate machines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for the rustdoc merge invocation to do all copying too? This would allow the per-crate rustdoc invocations to avoid copying all shared resources in their own doc dirs, saving disk space. And in general it seems like it would make things easier for the build system.

Copy link
Author

@EtomicBomb EtomicBomb Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cp is the stand-in for the step that takes docs from external crates and collects them in a single web root. Depending on the location of the external crates, this step may look very different:

  • It may do nothing; Cargo generates the docs in a single doc/ folder.
  • cp or mv files from one location on the filesystem to another. This is shown in the example in the RFC.
  • scp, rsync, wget --recursive, a proprietary protocol, etc. This RFC is intended to enable CCI in crates that have been documented on separate machines.

Because of the wide variety of locations that external docs may be, I believe that the build system is best positioned to collect them into a single location.

When you pass the --merge=none flag to document the external crates, the shared resources are not generated. Only crate-info.json (--write-info-json) and the item docs will be generated for these crates. The shared resources are only generated when you pass --merge=auto (default) or --merge=write-only, which is only provided when documenting the index or root crate. There are not any redundant files to bloat the disk space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then please update so that it is mentioned that rustdoc will handle the copying internally as we don't want to rely on external tools.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will update it so that rustdoc does the copying.

@EtomicBomb
Copy link
Author

EtomicBomb commented Jun 28, 2024

I updated the RFC + WIP implementation so that --include-info-json and --write-info-json take a path directly to a crate-info.json file.

The existing tests/rustdoc* suite passes my WIP implementation. That hopefully means that no changes will need to be made to tools that use rustdoc.

I hacked compiletest to pass --merge=none + --write-info-json to the auxiliary crates and --merge=write-only + --include-info-json to the main crate for every test. The tests/rustdoc* (except rustdoc-gui - have not run these) tests pass under this mode too. These tests were for my confidence, as I'm not looking to make any changes to compiletest.

@djkoloski
Copy link

If there are no further blockers for this, can we put this RFC into FCP?


# Guide-level explanation

In this example, there is a crate `t` which defines a trait `T`, and a crate `s` which defines a struct `S` that implements `T`. Our goal in this demo is for `S` to appear as in implementer in `T`'s docs, even if `s` and `t` are documented independently. This guide will be assuming that we want a crate `i` that serves as our documentation index. See the Unresolved questions section for ideas that do not require an index crate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this example, there is a crate `t` which defines a trait `T`, and a crate `s` which defines a struct `S` that implements `T`. Our goal in this demo is for `S` to appear as in implementer in `T`'s docs, even if `s` and `t` are documented independently. This guide will be assuming that we want a crate `i` that serves as our documentation index. See the Unresolved questions section for ideas that do not require an index crate.
In this example, there is a crate `t` which defines a trait `T`, and a crate `s` which defines a struct `S` that implements `T`. Our goal in this demo is for `S` to appear as an implementer in `T`'s docs, even if `s` and `t` are documented independently. This guide will be assuming that we want a crate `i` that serves as our documentation index. See the Unresolved questions section for ideas that do not require an index crate.

│ └── lib.rs.html
├── doc.parts
│ └── t
│ │ └── crate-info.json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it bugs me. 😆

Suggested change
│ └── crate-info.json
│ └── crate-info.json


When `read_rendered_cci` is active, rustdoc will look in the `--out-dir` for rendered cross-crate info files. These files will be used as the base. Any new parts that rustdoc generates with its current invocation and any parts fetched with `include-info-json` will be appended to these base files. When it is disabled, the cross-crate info files start empty and are populated with the current crate's info and any crates fetched with `--include-info-json`.

* `--merge=auto` (`read_rendered_cci && write_rendered_cci`) is the default, and reflects the current behavior of rustdoc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of auto. Why not read-write instead? Would be more explicit (took me a while to get why it was different to write-only 😅).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good!


This RFC does not alter previous compatibility guarantees made about the output of rustdoc. In particular it does not stabilize the presence of the rendered cross-crate information files, their content, or the HTML generated by rustdoc.

In the same way that the [rustdoc HTML output is unstable](https://rust-lang.github.io/rfcs/2963-rustdoc-json.html#:~:text=The%20HTML%20output%20of%20rustdoc,into%20a%20different%20format%20impractical), the contents of the `crate-info.json` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `crate-info.json` should be expected. Only the presence of a `crate-info.json` file is promised, under `--write-info-json`. Merging cross-crate information generated by disparate versions of rustdoc is not supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In the same way that the [rustdoc HTML output is unstable](https://rust-lang.github.io/rfcs/2963-rustdoc-json.html#:~:text=The%20HTML%20output%20of%20rustdoc,into%20a%20different%20format%20impractical), the contents of the `crate-info.json` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `crate-info.json` should be expected. Only the presence of a `crate-info.json` file is promised, under `--write-info-json`. Merging cross-crate information generated by disparate versions of rustdoc is not supported.
In the same way that the [rustdoc HTML output is unstable](https://rust-lang.github.io/rfcs/2963-rustdoc-json.html#:~:text=The%20HTML%20output%20of%20rustdoc,into%20a%20different%20format%20impractical), the content of the `crate-info.json` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `crate-info.json` should be expected. Only the presence of a `crate-info.json` file is promised, under `--write-info-json`. Merging cross-crate information generated by disparate versions of rustdoc is not supported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say:

Merging cross-crate information generated by disparate versions of rustdoc is not supported.

But in the description of the content of the file, you don't mention that the version of rustdoc is stored or how it's stored.

Copy link
Author

@EtomicBomb EtomicBomb Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a suggestion about how to version the crate-info.json files? We could put the semver as a key in the JSON and accept an --info-json-version flag. Rustdoc would fail if it is passed an --include-info-json with an incompatible file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a tricky question. Currently the rustdoc JSON output has a simple counter but it's something we're thinking about changing, but we didn't find a new versioning system yet (you can read about it here).

So really not sure what to pick, but this is definitely a blocker for this RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that an incremental single-digit version number makes the most sense. The RFC only discusses the case where crate-info.json files are written by rustdoc, and consumed by the same version rustdoc, so it doesn't seem important to detail what constitutes a breaking change. Also the content of the files is tied to the internal details of rustdoc (including the details of the search index). I'll make rustdoc write an incremental single-digit version to the crate-info.json file, and rustdoc will fail if it's called with an --include-info-json pointing to a crate-info.json with a version that does not match its internal version.

@jsha
Copy link

jsha commented Aug 12, 2024

First off, thanks for this well written and thorough doc. I found myself nodding along as I read it.

I realized I left an unstated assumption in my initial comment: I think this style of docs generation should definitely be adopted by Cargo. I think doing so may help with come of @camelid's complexity concerns, which I share. I think if we integrate this into Cargo and deprecate the old mode, we will actually reduce rustdoc complexity overall.

Specifically, as @camelid says:

  1. Document any crate using a shared directory with global mutable CCI that is continuously updated. The CCI file(s) contain all crates' data in a flat structure. This is exactly how rustdoc currently works.

Our goal should be to support this mode temporarily, while we work on updating Cargo to support a "merge" step for cargo doc. Once that's in, we can deprecate this mode and eventually remove it.

One implication of Cargo adopting this mode is that we need rustdoc to be able to merge without creating an index crate, so that cargo doc in a workspace can work correctly. This is mentioned under Unresolved questions: Index crate?. I think it should be resolve in favor of "no index crate."

Note that there is prior art for a rustdoc mode that isn't crate specific: --emit=unversioned-shared-resources, which is used by docs.rs to avoid having a copy of main.js, rustdoc.css, and friends for each crate. Incidentally, I think we should modify Cargo to use this mode as well, so we have more consistency across different use cases. So Cargo would build each crate separately, then run two finishing steps, one to merge the cross-crate info and one to emit the shared resources. Am I right in assuming Buck2 and Bazel use --emit=unversioned-shared-resources already?

I proposed that rustdoc should unconditionally emit the cross-crate info into each crate's documentation directory. @notriddle replied:

I would prefer not to put large files in the target/doc folder that aren't actually used by the frontend. People should be able to upload that folder to their webhost without having to filter out subfolders to avoid wasting space.

I think the cross crate info file will always be small relative to the overall size of a crate's docs, and it's okay if it's stored alongside the rest of the docs. The benefit we get is that we don't need a new flag --parts-out-dir. And build systems have correspondingly less complexity integrating with rustdoc.


This mode only renders the HTML item documentation for the current crate. It does not produce a search index, cross-crate trait implementations, or an index page. It is expected that users follow this mode with 'Document a final crate' if these cross-crate features are desired.

In this mode, a user may specify a different `--out-dir` to every invocation of rustdoc. Additionally, a user will provide `--parts-out-dir=<path to crate-specific directory>` and `--merge=none` when documenting every crate.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this mode, a user may specify a different `--out-dir` to every invocation of rustdoc. Additionally, a user will provide `--parts-out-dir=<path to crate-specific directory>` and `--merge=none` when documenting every crate.
In this mode, a user must specify a different `--out-dir` to each crate's rustdoc invocation. Additionally, a user will provide `--parts-out-dir=<path to crate-specific directory>` and `--merge=none` when documenting every crate.

I think this is "must", not "may", right?

Relatedly, in the --merge=shared mode, which is the equivalent of current behavior, it's worth mentioning that the build system "must" ensure two rustdoc invocations do not run concurrently with the same --out-dir.

Copy link
Author

@EtomicBomb EtomicBomb Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In many contexts, it's okay to run multiple invocations of rustdoc concurrently with the same --out-dir. Most of the files do not overlap (the item docs are separate). The files that are overwritten by multiple rustdoc invocations are written inside write_shared, which acquires an flock. In my understanding, cargo doc uses this when it launches multiple rustdoc processes at once.

text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
@notriddle
Copy link
Contributor

I think the cross crate info file will always be small relative to the overall size of a crate's docs, and it's okay if it's stored alongside the rest of the docs. The benefit we get is that we don't need a new flag --parts-out-dir. And build systems have correspondingly less complexity integrating with rustdoc.

If you don't expect the size overhead to be a problem, then this makes sense, and I support making the CLI simpler.

@EtomicBomb
Copy link
Author

EtomicBomb commented Aug 13, 2024

@jsha: I proposed that rustdoc should unconditionally emit the cross-crate info into each crate's documentation directory.

We should expect the size of crate-info to grow in proportion with the size of the crate's docs, since it includes all cross-crate trait implementations and the entire search index. For doc indexes with many crates, we should expect this file to be very large. @dtolnay describes how a search index for all of the crates in the doc index maintained by his team would be several gigabytes in size.

I think that build systems would strongly prefer for intermediate files to be separate from the output files. In Bazel, the rustdoc --out-dir is currently represented with ctx.actions.declare_directory. If we force doc.parts into the --out-dir, we can't tell Bazel that the --out-dir is actually a mix of intermediates and outputs. The names of the children of the --out-dir aren't known prior to rustdoc being run, because of #![crate_name = "..."]. The --out-dir has to be entirely opaque to the build system.

I think --parts-out-dir and --include-parts-dir are required in order to separate the intermediate state and the outputs.

* New directory: `doc.parts/<crate-name>/crate-info` -> New directory: `doc.parts`.
* Number of times `--parts-out-dir`, `--merge`, and `--include-parts-dir` can be provided are mentioned.
* Describe why `doc.parts` is a directory.
* Description of multiple parallel invocations of rustdoc.
* Describe why to run rustc before rustdoc.
* Fix typos.
@EtomicBomb
Copy link
Author

EtomicBomb commented Aug 13, 2024

Thank you for the thoughtful feedback, @jsha!

I changed the following in response:

  • New directory: doc.parts/<crate-name>/crate-info -> New directory: doc.parts.
  • Number of times --parts-out-dir, --merge, and --include-parts-dir can be provided are mentioned.
  • Describe why doc.parts is a directory.
  • Description of multiple parallel invocations of rustdoc.
  • Describe why to run rustc before rustdoc.
  • Fix typos.

Tomorrow, I will:

  • Resolve in favor of removing the requirement for an index crate. The biggest obstacle here is actually --enable-index-page, because the scaffold + markdown rendering aspect of the index page depends on the html::render::Context, the edition, and some other info gathered from the compiler and having a target crate. There are some implementation details to work out but I think it should be fine.
  • I will also describe that this RFC is intended to enable the future deprecation of --mode=shared.

* remove 'Unresolved: index crate?' from the unresolved questions section
* remove the requirement on the index crate from various sections
@EtomicBomb
Copy link
Author

I updated the RFC to resolve in favor of not needing an index crate to merge docs in a workspace, as explained above.

EtomicBomb added a commit to EtomicBomb/rust that referenced this pull request Aug 16, 2024
EtomicBomb added a commit to EtomicBomb/rust that referenced this pull request Aug 17, 2024
* Adds `--merge=shared|none|finalize` flags
* Adds `--parts-out-dir=<crate specific directory>` for `--merge=none`
  to write cross-crate info file for a single crate
* Adds `--include-parts-dir=<previously specified directory>` for
  `--merge=finalize` to write cross-crate info files
* Longer crate names in previous cross-crate-info tests
* Unit tests for above
@jsha
Copy link

jsha commented Aug 20, 2024

We should expect the size of crate-info to grow in proportion with the size of the crate's docs, since it includes all cross-crate trait implementations and the entire search index. For doc indexes with many crates, we should expect this file to be very large. @dtolnay describes how a search index for all of the crates in the doc index maintained by his team would be several gigabytes in size.

That's the merged search index, though - not each crate's search index.

I think that build systems would strongly prefer for intermediate files to be separate from the output files.

This is a good point. It would perhaps make sense to consider all of the regular rustdoc outputs to be intermediates, which are merged in the merge step. But maybe that introduces an excessive number of filesystem operations? At any rate, in the interests of unblocking the work I'm willing to concede this point and say the extra flag is okay.

Copy link

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added my approval on the FCP comment. Also one editorial nit below.

@@ -33,26 +35,20 @@ More details are in the Reference-level explanation.
* `--include-parts-dir=path/to/doc.parts/<crate-name>`: Include cross-crate information from this previously written `doc.parts` directories into a collection that will be written by the current invocation of rustdoc. May only be provided with `--merge=finalize`. May be provided any number of times.
* `--merge=none`: Do not write cross-crate information to the `--out-dir`. The flag `--parts-out-dir` may instead be provided with the destination of the current crate's cross-crate information parts.
* `--merge=shared` (default): Append information from the current crate to any info files found in the `--out-dir`.
* `--merge=finalize`: Write cross-crate information from the current crate and any crates included via `--include-parts-dir` to the `--out-dir`, overwriting conflicting files.
* `--merge=finalize`: Write cross-crate information from the current crate and any crates included via `--include-parts-dir` to the `--out-dir`, overwriting conflicting files. This flag may be used with or without an input crate root, as it can links existing docs.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* `--merge=finalize`: Write cross-crate information from the current crate and any crates included via `--include-parts-dir` to the `--out-dir`, overwriting conflicting files. This flag may be used with or without an input crate root, as it can links existing docs.
* `--merge=finalize`: Write cross-crate information from the current crate and any crates included via `--include-parts-dir` to the `--out-dir`, overwriting conflicting files. This flag may be used with or without an input crate root, as it can link existing docs.

typo in --merge=finalize description: can links
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 20, 2024
modularize rustdoc's write_shared

Refactor src/librustdoc/html/render/write_shared.rs to reduce code duplication, adding unit tests

* Extract + unit test code for sorting and rendering JSON, which is duplicated 9 times in the current impl
* Extract + unit test code for encoding JSON as single quoted strings, which is duplicated twice in the current impl
* Unit tests for cross-crate information file formats
* Generic interface to add new kinds of cross-crate information files in the future
* Intended to match current behavior exactly, except for a merge info comment it adds to the bottom of cci files
* This PR is intended to reduce the review burden from my [mergeable rustdoc rfc](rust-lang/rfcs#3662) implementation PR, which is a [small commit based on this branch](https://github.com/EtomicBomb/rust/tree/rfc). This code is agnostic to the RFC and does not include any of the flags discussed there, but cleanly enables the addition of these flags in the future because it is more modular
Copy link
Member

@aDotInTheVoid aDotInTheVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked my FCP box, but have a few editorial comments and requests for clarification (where the answer doesn't make a difference to weather I'm happy with this, as long as it get's written down in the RFC itself)

text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved

In the same way that the [rustdoc HTML output is unstable](https://rust-lang.github.io/rfcs/2963-rustdoc-json.html#:~:text=The%20HTML%20output%20of%20rustdoc,into%20a%20different%20format%20impractical), the content of `doc.parts` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `doc.parts` should be expected. Only the presence of a `doc.parts` directory is promised, under `--parts-out-dir`. Merging cross-crate information generated by disparate versions of rustdoc is not supported. To detect whether `doc.parts` is compatible, rustdoc includes a version number in these files (see New directory: `doc.parts`).

The implementation of the RFC itself is designed to produce only minimal changes to cross-crate info files and the HTML output of rustdoc. Exhaustively, the implementation is allowed to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial: What is this talking about? There was a linked draft implementation in the PR, but that was on the master branch of your fork, which now doesn't have it. It's worth clairifying that this is your WIP implementation (unless it isn't), and linking to it.

Copy link
Author

@EtomicBomb EtomicBomb Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the reference. It was to a different RFC (#2963), but I don't think the reference is relevant anymore.

* full sentences in intro
* Clarify Cargo's expected use of `--merge=none|shared|finalize`
* Explain why you would use multiple rustdoc invocations with the same --out-dir
  and merge=none
* Remove justification for why rustdoc output can be unstable in general
* Permalink specific fuchsia commit
Copy link
Member

@camelid camelid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rfcbot reviewed
@rfcbot resolve complexity

Thanks! I really like the way this has turned out. I just have one non-essential suggestion.

* `--merge=none`: Do not write cross-crate information to the `--out-dir`. The flag `--parts-out-dir` may instead be provided with the destination of the current crate's cross-crate information parts.
* `--parts-out-dir=path/to/doc.parts/<crate-name>`: Write cross-crate linking information to the given directory (only usable with the `--merge=none` mode). This information allows linking the current crate's documentation with other documentation at a later rustdoc invocation.
* `--include-parts-dir=path/to/doc.parts/<crate-name>`: Include cross-crate information from this previously written `doc.parts` directories into a collection that will be written by the current invocation of rustdoc. May only be provided with `--merge=finalize`. May be provided any number of times.
* `--merge=shared` (default): Append information from the current crate to any info files found in the `--out-dir`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we plan to deprecate --merge=shared anyway, what do you think about not adding it in the first place? --merge already has to be an optional flag to avoid breaking changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uniformity and ease of documentation. It's the same reason you're allowed to write edition = 2015, even though that's the default if you don't specify anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I guess we won't ever be able to remove the shared implementation anyway, so it doesn't matter. And if we somehow did find a way to remove it while maintaining the outward behavior, that wouldn't be a breaking change anyway since it's an implementation detail.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you say we won't ever be able to remove the shared implementation? I've been assuming that after updating Cargo and docs.rs, and providing a decently long deprecation period we could remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm trying to remember what I was thinking ^^. I think I was worried that there might be existing build systems or other tools that were depending on the shared representation, but we never guaranteed stability of it. So perhaps it would be fine to remove after all.

text/0000-mergeable-rustdoc-cross-crate-info.md Outdated Show resolved Hide resolved
@camelid
Copy link
Member

camelid commented Aug 21, 2024

@rfcbot reviewed
@rfcbot resolve complexity

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Aug 21, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Aug 21, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Aug 21, 2024
Change sentence reading "The WIP may change the sorting order..." to "The implementation may change the sorting order..."

Co-authored-by: Noah Lev <camelidcamel@gmail.com>
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Aug 26, 2024
modularize rustdoc's write_shared

Refactor src/librustdoc/html/render/write_shared.rs to reduce code duplication, adding unit tests

* Extract + unit test code for sorting and rendering JSON, which is duplicated 9 times in the current impl
* Extract + unit test code for encoding JSON as single quoted strings, which is duplicated twice in the current impl
* Unit tests for cross-crate information file formats
* Generic interface to add new kinds of cross-crate information files in the future
* Intended to match current behavior exactly, except for a merge info comment it adds to the bottom of cci files
* This PR is intended to reduce the review burden from my [mergeable rustdoc rfc](rust-lang/rfcs#3662) implementation PR, which is a [small commit based on this branch](https://github.com/EtomicBomb/rust/tree/rfc). This code is agnostic to the RFC and does not include any of the flags discussed there, but cleanly enables the addition of these flags in the future because it is more modular
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Aug 29, 2024
modularize rustdoc's write_shared

Refactor src/librustdoc/html/render/write_shared.rs to reduce code duplication, adding unit tests

* Extract + unit test code for sorting and rendering JSON, which is duplicated 9 times in the current impl
* Extract + unit test code for encoding JSON as single quoted strings, which is duplicated twice in the current impl
* Unit tests for cross-crate information file formats
* Generic interface to add new kinds of cross-crate information files in the future
* Intended to match current behavior exactly, except for a merge info comment it adds to the bottom of cci files
* This PR is intended to reduce the review burden from my [mergeable rustdoc rfc](rust-lang/rfcs#3662) implementation PR, which is a [small commit based on this branch](https://github.com/EtomicBomb/rust/tree/rfc). This code is agnostic to the RFC and does not include any of the flags discussed there, but cleanly enables the addition of these flags in the future because it is more modular
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Aug 31, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Aug 31, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@camelid
Copy link
Member

camelid commented Sep 21, 2024

We should discuss this before stabilization. In the meantime, this RFC needs to get actually merged, so I'll do that now. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.