Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: simplify URLs #35020

Closed
wants to merge 2 commits into from
Closed

rustdoc: simplify URLs #35020

wants to merge 2 commits into from

Conversation

nrc
Copy link
Member

@nrc nrc commented Jul 25, 2016

Changes rustdoc URLs to name.namespace.html, e.g., Foo.t.html. These are easier for clients to guess since they only need to know the namespace, not the kind of item.

Old URLs are preserved as redirects to the new ones.

I also add redirects for modules, e.g.,
foo/bar/baz.t.html to foo/bar/baz/index.html, so modules are not an exception to the URL rule.

And removes the ! from macro URLs.

Closes #34271

r? @alexcrichton

cc @steveklabnik

@GuillaumeGomez
Copy link
Member

Nice work! However he'll have to do a huuuuuuuuuuuuuuuuuge update in doc urls. Poor us. :'(

@steveklabnik
Copy link
Member

A few questions:

  1. So it's always t, v, or m for "type" "value" and "macro"?
  2. I'm finding these then a bit more cryptic to read. I'm assuming it's t and not type just to keep it short?
  3. At first, my lack-of-caffinated-brain went "why are functions values and not types", which is, after a moment of reflection, obvious. But while this makes it easier for tools, I wonder how much humans will be a bit unsure of how to do it.

Overall, 👍

@GuillaumeGomez
Copy link
Member

So it's always t, v, or m for "type" "value" and "macro"?

And what happens if it's a module? :-/

@nrc
Copy link
Member Author

nrc commented Jul 25, 2016

@GuillaumeGomez a module is kind of a special case in that it gets foo/bar/index.html, however, to keep it uniform I added a redirect from foo/bar.t.html since modules are in the type namespace.

@nrc
Copy link
Member Author

nrc commented Jul 25, 2016

@GuillaumeGomez also, there are redirects in place, so you the update shouldn't be too bad

@nrc
Copy link
Member Author

nrc commented Jul 25, 2016

@steveklabnik

1: yes
2: yes. If there is consensus that type, value, macro are better, then I don't mind changing to that.
3: yeah, there are also a few places where items are in both namespaces where I just picked one. I'd be pretty surprised if more than a few people write their own rustdoc URLs manually. I imagine that the overlap between people who do that and the people who understand namespaces is pretty high. And the old scheme was kind of weird too - e.g., when is a function a function vs a method vs a typemethod?

@alexcrichton
Copy link
Member

Awesome, thanks @nrc! Some thoughts of mine:

  • We probably, sooner rather than later, want to turn off the redirects, right? In doing this we probably want to disable generation of the redirects, fix all linkchecker errors in our repo, then re-enable the redirects at least.
  • Ideally we want Foo.html to be the "main page", I wonder if we could possibly get that to work? If there's no collision in the namespaces we could do that, and we could also emit Foo.t.html as a redirect to Foo.html in that case. If there was a collision, maybe Foo.html could have a descriptive "which do you want" page with Foo.t.html being the actual page. That being said, this sounds pretty complicated.
  • Can you upload a copy of the docs somewhere so others can poke around as well? Would be good to just double check things like search/links work!

Overall I very much like the direction this is going!

@nrc
Copy link
Member Author

nrc commented Jul 25, 2016

The Travis test is probably spurious, apparently.

@nrc
Copy link
Member Author

nrc commented Jul 25, 2016

@alexcrichton

  • I wasn't sure how long we want to leave the redirects around - I don't see much of a downside for keeping them, except that it feels bad. How do I run linkchecker? I passed make check without the redirects, but I assume this was not running linkchecker.
  • I'm kind of hesitant about eliding namespace where possible - it makes it unpredictable to synthesise the URL and means URLs are not back compat in the same way as code. (The former problem is solved by redirects, the latter isn't, though I guess if you want a permanent URL you could use the explicit namespace and the namespace-less one is always a redirect or choice). I can file an issue for this and we can maybe do it as a followup.
  • Sure, will do so a bit later.

@alexcrichton
Copy link
Member

@nrc the linkchecker lives in src/tools/linkchecker and you can just compile it and run ./linkchecker path/to/doc I believe.

Also yeah I wasn't thinking of eliding namespaces aggressively, we'd still always have <name>.<namespace>.<html>. I was just thing that aesthetically we could get it so you almost never interact with that.

@ollie27
Copy link
Member

ollie27 commented Jul 25, 2016

A few problems I've noticed so far:

  • There is a name conflict between the docs for primitive types and the modules sharing the same name. For example u32 is documented at std/u32.t.html but it's overwritten by the redirect for std::u32.
  • There isn't any backwards compatibility for link fragments. Links like /std/vec/struct.Vec.html#method.new need to keep working.
  • The ids for enum variants and struct fields aren't updated yet so the search results for those are broken.
  • There's a name conflict for redirects in some rare cases like:
pub const v: u32 = 0;
pub fn constant() {}
  • rustdoc\inline_local\issue-32343.rs fails on Windows because of the case insensitive file system. Specifically // @!has issue_32343/Bar.t.html fails because of the redirect issue_32343/bar.t.html for the module.

@nrc
Copy link
Member Author

nrc commented Jul 26, 2016

re the case sensitivity issue, what does Rustdoc do at the moment? E.g., if there are two structs in the same module called foo and Foo does it just silently fail on Windows?

@ollie27
Copy link
Member

ollie27 commented Jul 26, 2016

Yeah it just silently fails: #25879. This change does make that problem a bit worse of course.

@nrc
Copy link
Member Author

nrc commented Jul 27, 2016

@alexcrichton what do you think about "v/t" vs "value/type" ?

@alexcrichton
Copy link
Member

@nrc I prefer "v/t" as it's not something we really want to draw attention to in the URL, but I don't feel too strongly

@liigo
Copy link
Contributor

liigo commented Jul 28, 2016

struct.Foo.html -> Foo.t.html

These are easier for clients to guess since they only need to know the namespace, not the kind of item.

They still need a guess, I don't think it becomes easier, it just becomes more confusing (struct/fn VS magic t/v).

The ideal way is to make Foo.html as a disambiguation page (like this), if necessary.

@nrc
Copy link
Member Author

nrc commented Jul 28, 2016

They still need a guess, I don't think it becomes easier

It does. The compiler knows the namespace of every name, so any tool which uses the compiler does too. In fact the namespace is part of the language so any tool which wants to correctly interpret a Rust program must know the namespace of a name. Rustdoc's categorisation on the other hand is totally ad hoc - the compiler has no concept of the distinctions, any tool must basically repeat Rustdoc's analysis to work out which category an item belongs too, which means you must have access to the definition of the item, as well as the use (c.f., the namespace which is inferable from the context of the use).

@ollie27
Copy link
Member

ollie27 commented Jul 28, 2016

If this is only for tools then what about one of the following less disruptive alternatives:

  1. Add some form of redirect service so maybe something like /std/index.html?gototype=std::vec::Vec would redirect to the correct page.
  2. Add some form of mapping from path and namespace to url in a file at a fixed location like /std/urls.json. This could then be downloaded by the tool and used to generate links.

@bors
Copy link
Contributor

bors commented Jul 28, 2016

☔ The latest upstream changes (presumably #35086) made this pull request unmergeable. Please resolve the merge conflicts.

@eddyb
Copy link
Member

eddyb commented Jul 28, 2016

Has anyone considered always combining the various namespaces into one page to clean up the URL?
I much prefer doc.rust-lang.org/std/foo/bar#fn to disambiguate. I can easily type those URLs by hand, or read them, and it probably helps with search engines as well.

@GuillaumeGomez
Copy link
Member

GuillaumeGomez commented Jul 28, 2016

@eddyb: That sounds a bit problematic however (but I really love the idea!). It means rewriting the current architecture and increase folders number (after all each '/' is going inside a directory except if you added rewrite rules or wrote the server program).

@nrc
Copy link
Member Author

nrc commented Jul 28, 2016

I think we should continue to improve via @alexcrichton's or @eddyb's proposals, but as a follow-up. This PR reduces the number of options for a name from 10-ish to 3, in the future we can hopefully reduce that to 1, but for now this actually solves painful problems.

@eddyb
Copy link
Member

eddyb commented Jul 28, 2016

@GuillaumeGomez .html would be fine for now, rewrite rules later.

@eddyb
Copy link
Member

eddyb commented Jul 28, 2016

@nrc The danger is that if this gets into stable, it will have permanent effects.

@nrc
Copy link
Member Author

nrc commented Jul 28, 2016

@eddyb nothing that can't be fixed with a redirect, right?

@eddyb
Copy link
Member

eddyb commented Jul 28, 2016

@nrc Sure, it just means that you either keep the redirects there forever or you end up with links from an interim period. At least rewrite rules can handle both status quo -> this PR and this PR -> my proposal.

@nrc
Copy link
Member Author

nrc commented Jul 28, 2016

@ollie27 why do you think this PR is particularly disruptive. Nearly all existing URLs will continue to work, but you get better URLs too. Seems like a win to me.

To be clear, I hate the current URL scheme, I see no reason not to change it other than backwards compatability, which I believe this PR addresses (or will address when the bugs are rolled out).

@ollie27
Copy link
Member

ollie27 commented Jul 29, 2016

It's disruptive because people will need to learn a new scheme. I also don't think Vec.t.html#new.v is much if any better than struct.Vec.html#method.new.

The current scheme is far from perfect but we can't keep changing it and creating redirects. We'll have to support them basically forever and they can create file name conflicts like I've pointed out unless we're very careful. Adding over 2,000 files to the std docs isn't something we should take lightly. If we are going to change the URLs we should do it once and for all.

Do either of my suggestions satisfy your use case?

@nrc
Copy link
Member Author

nrc commented Aug 1, 2016

Too subjective

Also, the new URLs reflect naming as it actually is in Rust, this is how the language is defined (or would be if we had a proper definition) and how the compiler classifies names. The current URLs are totally arbitrary, just some classification that only Rustdoc knows about and isn't defined anywhere, isn't documented, and isn't used by anything else.

// page. If a page exists already, do not overwrite it. We
// don't want to replace a real rustdoc page with a redirect.
// In particular, the redirect for a primitive module would
// overwrite the doc page for the primitive itself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these redirects don't necessarily exist, is there much point in them existing at all?

The problem with primitive types is not only with module redirects. The following documents fine currently but doesn't with this PR.

pub struct bool;
#[doc(primitive = "bool")]
mod _bool {}

I think it might make sense to put primitives in their own "namespace" because they're special anyway so presumably can't be treated like other types anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these redirects don't necessarily exist, is there much point in them existing at all?

I would have thought that 2% of links breaking is better than 100% of links breaking.

The following documents fine currently but doesn't with this PR.

This seems to be a totally hypothetical question - is there any reason for anyone to actually write this?

I think it might make sense to put primitives in their own "namespace" because they're special

I thought about this, but it kind of ruins the whole idea - you can't just look up a type, you need to know if that type is a primitive or not - how do you know that? There is no way to tell, you need to look at a magic list that will go out of date as soon as we add i128 or whatever.

Furthermore, they're not special, they're only special to Rustdoc, which has an ugly hack to support documenting them, that should be an implementation detail, not exposed in the URLs.

@ollie27
Copy link
Member

ollie27 commented Aug 2, 2016

People still have to synthesise the URLs when writing documentation. Neither of my suggestions were to not change the URLs, my point was that we don't need to change the canonical URLs if the only reason is so that tools can generate links. In fact both of the suggestions make it easier to change them later on without breaking tools perhaps to one of the nicer looking suggestions above. The old URLs have to be kept around for backwards compatibility anyway so the maintenance burden of them is exactly the same as my suggestions.

I think a change like this deserves a proper RFC. It's essentially adding a new public API which will need to be supported. We'll also need a document outlining the new scheme in detail which is required if it's supposed to be synthesisable.

A problem with the current URLs is that the same page can have multiple methods with the same name on the same page. Currently they get different ids e.g. primitive.bool.html#method.fmt and primitive.bool.html#method.fmt-1 but these can't be guessed nor are they stable (if you change their order in the source code the ids change). This change actually makes the problem worse by adding struct fields and enum variants to the same namespace so now for example std::ops::Range has two #start.vs which need to be disambiguated somehow.

Other issues with the current PR:

  • Macro pages are now missing the "!" in the page title. It's now std::println and not std::println!.
  • The search-index.js file now contains the full URLs which seems unnecessary and increases the size from about 689 KiB to about 1080 KiB for the std docs. This goes against rustdoc's search-index.js file is huge for large projects #31387.
  • Some rustdoc tests still haven't been updated. While they pass they're not testing the same things any more unless they use the new URLs. For example test/rustdoc/issue-34473.rs.
  • The old fragment links don't highlight the item they link to.

I don't know if you're aware of it but #34023 is an issue that I think should be fixed with a new URL scheme.

Also cc. #25226

@steveklabnik
Copy link
Member

/cc @rust-lang/tools

@brson
Copy link
Contributor

brson commented Aug 2, 2016

I am sympathetic to the notion that the current scheme is easier for humans to understand than the proposed. People think about structs and enums, not the type and value namespaces. I'd imagine that a large portion of Rust programmers don't even know what a namespace is in Rust.

Some of the other critiques from @ollie27 I'm not so up to speed on, but I appreciate that they are thinking hard about this.

@nrc
Copy link
Member Author

nrc commented Aug 2, 2016

Macro pages are now missing the "!" in the page title. It's now std::println and not std::println!

This was unintentional, but I liked the change so I left it. The ! is not part of the macro name, it is part of the usage syntax - you don't write macro_rules! foo! you write macro_rules! foo, likewise with macro_use.

Thanks for the other points, I'll look at getting them fixed.

@alexcrichton
Copy link
Member

@nrc what do you think of taking a perhaps more conservative route for now along the lines of what @ollie27 mentioned where there's a page which has JS to hyperlink everywhere else. That way we could leave URLs as they are today to perhaps have an RFC to get improved, but tools could immediately benefit by having much more machine-friendly generated links?

@nrc
Copy link
Member Author

nrc commented Aug 2, 2016

@alexcrichton I think this is kind of a bad idea - we're adding a redirect and thus slowing things down for users, whilst preserving a crappy URL scheme for eternity (and adding a bunch of annoying code, which probably won't be kept up to date, rather than improving existing code).

The current URLs are bad - they are exposing Rustdoc's internals in a misguided attempt at being easy to use, rather than doing things properly. We're locking down implementation decisions for Rustdoc with them since they tie us to current architectural decisions which should be purely implementation details rather than exposed in the interface.

Writing an RFC for this seems like overkill, it will get bogged down in endless bike-shedding and we'll probably end up with something that no-one will ever implement.

@alexcrichton
Copy link
Member

Hm I don't think we can objectively say that the current url scheme is "bad" because from an end-user perspective I agree with @ollie27 and @brson that struct.Foo.html is more descriptive than Foo.t.html. I also agree with @brson that I barely know about namespaces so if I were to generate a link I wouldn't actually know whether a struct/trait is in the value or type namespace.

I also don't think we need to preserve the current scheme for eternity, we just need a transition plan no matter what. I'd imagine that regardless of what we do we'll end up removing this scheme after a few months or a year.

The current URLs I agree though are not great for machines to generate links to, but @ollie27's suggestion should solve that problem (easy to generate links), and we're not really worried about browsing speed when looking at these docs, it's not like you're downloading megabytes of JS to render a page.

It seems like this is contentious enough that we at least want to have a broader discussion about what rustdoc URLs ideally should be. @ollie27 has a good point that changing it twice may be a good point, so if we're gonna change it we may as well choose a scheme that we all agree on at the current point in time. In the meantime though we can fix the machine-generated link problem in a relatively non-intrusive fashion. I agree though that an RFC probably isn't worth it, maybe just a discuss thread to get feedback.

@nrc
Copy link
Member Author

nrc commented Aug 2, 2016

The trouble with the more conservative route is that it is easy to translate from the old URLs to the new ones, but hard to translate the new ones to the old, because there is missing context. To do so, the redirect page has to download a mapping from every single item (including nested items like fields and methods) to rustdoc type. That would be a huge file and thus a slow redirect (and of course it changes, so the client can't cache it, at least not easily). Forcing tools to download that mapping is a massive headache (How often should they do it? What if they don't have internet access? How to manage the memory?) I don't want to do that for IDEs, and no-one else will do it for any other tool, so you are pretty much guaranteeing that no tool will ever link to Rustdoc.

@nrc
Copy link
Member Author

nrc commented Aug 2, 2016

I guess I can just add redirects for every one of the new URLs

nrc added a commit to nrc/rust that referenced this pull request Aug 3, 2016
@nrc nrc mentioned this pull request Aug 3, 2016
@nrc
Copy link
Member Author

nrc commented Aug 3, 2016

Closing in favour of #35236 (sadly).

@nrc nrc closed this Aug 3, 2016
nrc added a commit to nrc/rust that referenced this pull request Aug 3, 2016
nrc added a commit to nrc/rust that referenced this pull request Aug 17, 2016
nrc added a commit to nrc/rust that referenced this pull request Aug 17, 2016
bors added a commit that referenced this pull request Aug 17, 2016
rustdoc: redirect URLs

cc #35020 which does this properly

r? @alexcrichton
sophiajt pushed a commit to sophiajt/rust that referenced this pull request Aug 20, 2016
rustdoc: remove the `!` from macro URLs and titles

Because the `!` is part of a macro use, not the macro's name. E.g., you write `macro_rules! foo` not `macro_rules! foo!`, also `#[macro_import(foo)]`.

(Pulled out of rust-lang#35020).
sophiajt pushed a commit to sophiajt/rust that referenced this pull request Aug 20, 2016
rustdoc: remove the `!` from macro URLs and titles

Because the `!` is part of a macro use, not the macro's name. E.g., you write `macro_rules! foo` not `macro_rules! foo!`, also `#[macro_import(foo)]`.

(Pulled out of rust-lang#35020).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants