diff --git a/text/2603-symbol-name-mangling-v2.md b/text/2603-symbol-name-mangling-v2.md
new file mode 100644
index 00000000000..e082ced2adf
--- /dev/null
+++ b/text/2603-symbol-name-mangling-v2.md
@@ -0,0 +1,1156 @@
+- Feature Name: `symbol_name_mangling_v2`
+- Start Date: 2018-11-27
+- RFC PR: [rust-lang/rfcs#2603](https://github.com/rust-lang/rfcs/pull/2603)
+- Rust Issue: [rust-lang/rust#60705](https://github.com/rust-lang/rust/issues/60705)
+
+# Summary
+[summary]: #summary
+
+This RFC proposes a new mangling scheme that describes what the symbol
+names generated by the Rust compiler look like. This new scheme has a number of
+advantages over the existing one which has grown over time without a
+clear direction. The new scheme is consistent, depends less on
+compiler internals, and the information it stores in symbol names can
+be decoded again which provides an improved experience for users of
+external tools that work with Rust symbol names.
+
+Note that, at this point, the new mangling scheme would not be part of
+the language specification or the specification of a stable Rust ABI.
+In the future it _could_ be part of both and it is designed to be
+stable and extensible; but for the time being it would still be an
+implementation detail of the Rust compiler.
+
+# Motivation
+[motivation]: #motivation
+
+Due to its ad-hoc nature, the compiler's current name mangling scheme
+has a number of drawbacks:
+
+- Information about generic parameters and other things is lost in the
+  mangling process. One cannot extract the type arguments of a
+  monomorphized function from its symbol name.
+
+- The current scheme is inconsistent: most paths use
+  [Itanium ABI][itanium-mangling] style encoding, but some don't.
+
+- The symbol names it generates can contain `.` characters which is
+  not generally supported on all platforms. \[[1][gas]\]
+  \[[2][lld-windows-bug]\] \[[3][thin-lto-bug]\]
+
+- It depends on compiler internals and its results cannot be replicated
+  by another compiler implementation or external tool.
+
+[gas]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Symbol-Names
+[lld-windows-bug]: https://github.com/rust-lang/rust/issues/54190
+[thin-lto-bug]: https://github.com/rust-lang/rust/issues/53912
+
+The proposed scheme solves these problems:
+
+- It encodes information about generic parameters in a reversible way.
+- It has a consistent definition that does not rely on pretty-printing
+  certain language constructs.
+- It generates symbols that only consist of the characters `A-Z`, `a-z`,
+  `0-9`, and `_`.
+- While the proposed scheme still contains things that are implementation
+  defined it has a clearer path towards full name predictability in future.
+
+These properties should make it easier for third party tools to work
+with Rust binaries.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+The following section will lay out the requirements for a name mangling
+scheme and then introduce the actual scheme through a series of ever
+more complex examples.
+
+## Requirements for a Symbol Mangling Scheme
+
+A symbol mangling scheme has a few goals, one of them essential,
+the rest of them desirable. The essential one is:
+
+- The scheme must provide an unambiguous string encoding for
+  everything that can end up in a binary's symbol table.
+
+"Unambiguous" means that no two distinct compiler-generated entities
+(that is, mostly object code for functions) must be mapped to the same
+symbol name. This disambiguation is the main purpose of the hash-suffix
+in the current, legacy mangling scheme. The scheme proposed here, on
+the other hand, achieves it in a way that allows to also satisfy a
+number of additional desirable properties of a mangling scheme:
+
+ - A mangled symbol should be *decodable* to some degree. That is, it
+   is desirable to be able to tell which exact concrete instance of e.g.
+   a polymorphic function a given symbol identifies. This is true for
+   external tools, backtraces, or just people only having the binary
+   representation of some piece of code available to them. With the
+   current scheme, this kind of information gets lost in the magical
+   hash-suffix.
+
+ - A mangling scheme should be platform-independent. This is mainly
+   achieved by restricting the character set to `A-Z`, `a-z`, `0-9`,
+   `_`. All other characters might have special meaning in some
+   context (e.g. `.` for MSVC `DEF` files) or are simply not
+   supported (e.g. Unicode).
+
+ - The scheme should be efficient, meaning that the symbols it
+   produces are not unnecessarily long (because that takes up space
+   in object files and means more work for the compiler and the linker).
+   In addition, generating or demangling a symbol name should not be
+   too computationally expensive.
+
+ - When used as part of a stable ABI, it should be possible to predict
+   the symbol name for a given source-level construct. For example,
+   given the definition `fn foo<T>() { ... }`, the scheme should allow
+   to construct, by hand, the symbol names for e.g. `foo<u32>` or
+   `foo<extern fn(i32, &mut SomeStruct<(char, &str)>, ...) -> !>()`.
+   Since the current scheme generates its hash from the values of
+   various compiler internal data structures, an alternative compiler
+   implementation could not predict the symbol name, even for
+   simple cases. Note that the scheme proposed here does not fulfill
+   this requirement either (yet) as some things are still left to
+   the compiler implementation.
+
+The RFC also has a couple of non-goals:
+
+ - The mangling scheme does not try to be compatible with an existing
+   (e.g. C++) mangling scheme. While it might sound tempting to encode Rust
+   symbols with an existing scheme, it is the author's opinion that
+   the actual benefits are small (C++ tools would not demangle to Rust
+   syntax, demanglings would be hard to read) and at the same time
+   supporting a Rust-specific scheme in existing tools seems quite
+   feasible (many tools like GDB, LLDB, binutils, and valgrind already
+   have specialized code paths for Rust symbols).
+
+ - The RFC does not try to define a standardized _demangled_ form for
+   symbol names. It defines the mangled form and makes sure it can be
+   demangled in an efficient manner but different demanglers still
+   have some degree of freedom regarding how symbol names are presented
+   to the user.
+
+
+## The Mangling Scheme by Example
+
+This section will develop an overview of the mangling scheme by walking
+through a number of examples. We'll start with the simplest case -- and
+will see how that already involves things that might be surprising.
+
+### Free-standing Functions and Statics
+
+A free-standing function is fully identified via its absolute path.
+For example, the following function
+
+```rust
+mod foo {
+  fn bar() {}
+}
+```
+
+has the path `foo::bar` and `NN3foo3bar` is a possible mangling of that path
+that complies to the character set we are restricted to. Why this format with
+numbers embedded in it? It is a run-length encoding, similar to what the
+[Itanium C++ ABI][itanium-mangling] name mangling scheme uses for
+identifiers. The scheme proposed here will also use this
+format because it does not need termination tokens for identifiers
+(which are hard to come by with our limited character set).
+
+Note that each component in the path (i.e. `foo` and `bar`) also has an
+accompanying _start-tag_ (here `N`) at the beginning. This start-tag is
+needed in order for the syntax to be able to represent complex, nested
+structures as we will see later.
+
+The symbol name above, unfortunately, does not unambiguously identify the
+function in every context. It is perfectly valid for another crate
+to also define `mod foo { fn bar() {} }` somewhere. So in order to
+avoid conflicts in such cases, the absolute path must always include
+the crate-id, as in `NNC7mycrate3foo3bar`. The crate-id has a `C`
+start-tag.
+
+There is another possible ambiguity that we have to take care of.
+Rust has two distinct namespaces: the type and the value namespace.
+This leads to a path of the form `crate_id::foo::bar` not uniquely
+identifying the item `bar` because the following snippet is legal
+Rust code:
+
+```rust
+fn foo() {
+    fn bar() {}
+}
+
+mod foo {
+    fn bar() {}
+}
+```
+
+The function `foo` lives in the value namespace while the module `foo`
+lives in the type namespace. They don't interfere. In order to make the
+symbol names for the two distinct `bar` functions unique, we thus add a
+namespace identifier to the start-tag of components where necessary, as in
+`NvNvC7mycrate3foo3bar` for the first case and `NvNtC7mycrate3foo3bar`
+second case (notice the difference: `NvNv...` vs `NvNt...`).
+
+There is one final case of name ambiguity that we have to take care of.
+Because of macro hygiene, multiple items with the same name can appear in
+the same context. The compiler internally disambiguates such names by
+augmenting them with a numeric index. For example, the first occurrence
+of the name `foo` within its parent is actually treated as `foo'0`, the
+second occurrence would be `foo'1`, the next `foo'2`, and so one. The
+mangling scheme will adopt this setup by prepending a disambiguation
+prefix to each identifier with a non-zero index. So if macro expansion
+would result in the following code:
+
+```rust
+mod foo {
+    fn bar'0() {}
+    // The second `bar` function was introduced by macro expansion.
+    fn bar'1() {}
+}
+```
+
+then we would encode the two functions symbols as `NvNtC7mycrate3foo3bar`
+and `NvNtC7mycrate3foos_3bar` respectively (note the `s_` prefix in the
+second case). A very similar disambiguation is needed for avoiding
+conflicts between crates of the same name but different versions. The
+same syntactic prefix is thus used for crate-id where we encode the
+crate disambiguator as in `NtNvCs1234_7mycrate3foo3bar`. Details on
+the shape of this prefix are provided in the reference-level description.
+
+As opposed to C++ and other languages that support function overloading,
+we don't need to include function parameter types in the symbol name.
+Rust does not allow two functions of the same name but different arguments.
+
+The final symbol name for the function would also include the prefix
+`_R` that is common to all symbol names generated by this scheme:
+
+```
+  _RNvNtCs1234_7mycrate3foo3bar
+  <>^^^^^<----><------><--><-->
+   ||||||   |      |     |   |
+   ||||||   |      |     |   +--- "bar" identifier
+   ||||||   |      |     +------- "foo" identifier
+   ||||||   |      +------------- "mycrate" identifier
+   ||||||   +-------------------- disambiguator for "mycrate"
+   |||||+------------------------ start-tag for "mycrate"
+   ||||+------------------------- namespace tag for "foo"
+   |||+-------------------------- start-tag for "foo"
+   ||+--------------------------- namespace tag for "bar"
+   |+---------------------------- start-tag for "bar"
+   +----------------------------- common Rust symbol prefix
+```
+
+
+### Generic Functions
+
+Each monomorphization of a generic function has its own symbol name.
+The monomorphizations are disambiguated by the list of concrete generic
+arguments. These arguments are added to the symbol name by a pair of `I`
+start-tag at the beginning and a list of the actual arguments at the end.
+So the instance
+
+```rust
+std::mem::align_of::<f64>
+```
+
+would be mangled to
+
+```
+_RINvNtC3std3mem8align_ofdE
+  ^                      ^^
+  |                      ||
+  |                      |+--- end of argument list
+  |                      +--- f64
+  +--- start-tag
+```
+
+where `I` precedes the thing the arguments belong to, `d` designates `f64`
+and `E` ends the argument list. As we can see, we need to be able to
+represent all kinds of types that can be part of such an argument list.
+(In the future, when const generics get added to the language, we will
+also need to represent *values*) These kinds of types are:
+
+ - basic types (`char`, `()`, `str`, `!`, `i8`, `i16`, ...)
+ - reference and pointers types, shared, `mut` and `const`
+ - tuples
+ - arrays, with and without fixed size (e.g. `[u8]`, `[u8; 17]`)
+ - structs, enums, closures, and other named types, possibly with their
+   own set of type arguments
+ - function types such as `fn(&i32) -> u16`
+ - `dyn` traits
+
+Basic types are all encoded via a single lower-case letter, like in the
+Itanium scheme. Named types are encoded as their absolute path
+(including arguments) like is done for function symbols. Composites like
+references, tuples, and function types all follow a simple grammar given
+in the reference-level explanation below. Here are some example manglings
+to get a general feel of what they look like:
+
+ - `std::mem::align_of::<usize>`: `_RINvNtC3std3mem8align_ofjE`
+ - `std::mem::align_of::<&char>`: `_RINvNtC3std3mem8align_ofRcE`
+ - `std::mem::align_of::<std::mem::Discriminant>`:
+   `_RINvNtC3std3mem8align_ofNtNtC3std3mem12DiscriminantE`
+ - `std::mem::align_of::<&mut (&str,())>`: `_RINvNtC3std3mem8align_ofQTReuEE`
+
+There's one more thing we have to take into account for generic functions:
+The compiler may produce "crate-local" copies of a monomorphization.
+That is, if there is a function `foo<T>` which gets used as `foo<u32>`
+in two different crates, the compiler (depending on the optimization level)
+might generate two distinct functions at the LLVM IR level, each with it's
+own symbol. In order to support this without running into conflicts, symbol
+names for monomorphizations must include the id of the crate they are
+instantiated for. This scheme does this by appending an `<crate-id>` suffix
+to the symbol. So for example the mangling for `std::mem::align_of::<usize>`
+would actually look like this:
+
+```
+_RINvNtC3std3mem8align_ofjEC3foo (for crate "foo")
+_RINvNtC3std3mem8align_ofjEC3bar (for crate "bar")
+```
+
+
+### Closures and Closure Environments
+
+The scheme needs to be able to generate symbol names for the function
+containing the code of a closure and it needs to be able to refer to
+the type of a closure if it occurs as a type argument. As closures
+don't have a name, we need to generate one. The scheme proposes to
+use the namespace and disambiguation mechanisms already introduced
+above for this purpose. Closures get their own "namespace" (i.e.
+they are neither in the type nor the value namespace), and each closure
+has an empty name with a disambiguation index (like for macro hygiene)
+identifying them within their parent. The full name of a closure is
+then constructed like for any other named item:
+
+```rust
+mod foo {
+  fn bar(x: u32) {
+    let a = |x| { x + 1 }; // local name: NC<...>0
+    let b = |x| { x + 2 }; // local name: NC<...>s_0
+
+    a(b(x))
+  }
+}
+
+```
+
+In the above example we have two closures, the one assigned to `a`
+and the one assigned to `b`. The first one would get the local name
+`NC<...>0` and the second one the name `NC<...>s_0`. The `0` signifies
+the length of their (empty) name. The `<...>` part is the path of the
+parent. The `C` is the namespace tag, analogous to the `v` tag for
+the value namespace. The `s_` for the second closure is the
+disambiguation index (index `0` is, again, encoded by not prepending
+a prefix). Their full names would then be `NCNvNtC7mycrate3foo3bar0`
+and `NCNvNtC7mycrate3foo3bars_0` respectively.
+
+
+### Methods
+
+Methods are nested within `impl` or `trait` items. As such it would be
+possible to construct their symbol names as paths like
+`my_crate::foo::{{impl}}::some_method` where `{{impl}}` somehow identifies
+the the `impl` in question. Since `impl`s don't have names, we'd have to
+use an indexing scheme like the one used for closures (and indeed, this is
+what the compiler does internally). Adding in generic arguments to, this
+would lead to symbol names looking like
+`my_crate::foo::impl'17::<u32, char>::some_method`.
+
+However, in the opinion of the author these symbols are very hard to map
+back to the method they represent. Consider a module containing dozens of
+types, each with multiple `impl` blocks generated via `#[derive(...)]`.
+In order to find out which method a symbol maps to, one would have to count
+the number of handwritten _and_ macro generated `impl` blocks in the module,
+and hope that one correctly guessed the number of `impl` blocks introduced
+by the given derive-macro (each macro invocation can introduce `0..n` such
+blocks). The name of the method might give a hint, but there are still
+likely to be dozens of methods named `clone`, `hash`, `eq`, et cetera.
+
+The RFC therefore proposes to keep symbol names close to how methods are
+represented in error messages, that is:
+
+- `<Foo<u32, char>>::some_method` for inherent methods, and
+- `<Foo<u32, char> as SomeTrait<Quux>>::some_method` for trait methods.
+
+This can be achieved by extending the definition of paths that we have
+used so far. Instead of the path root always being a crate-id, we now
+also allow a path to start with an `impl` production that contains the
+self-type and (for trait methods) the name of the trait being implemented.
+
+Thus, this extended form of paths would have the following syntax:
+
+```
+<path> = C <identifier>                      // crate-id root
+       | M <type>                            // inherent impl root
+       | X <type> <path>                     // trait impl root
+       | N <namespace> <path> <identifier>   // nested path
+       | I <path> {<generic-arg>} E          // generic arguments
+```
+
+Here are some examples for complete symbol names:
+
+```
+<mycrate::Foo<u32>>::foo => _RNvMINtC7mycrate3FoomE3foo
+<u32 as mycrate::Foo>::foo => _RNvXmNtC7mycrate3Foo3foo
+<mycrate::Foo<u32> as mycrate::Bar<u64>>::foo => _RNvXINtC7mycrate3FoomEINtC7mycrate3BaryE3foo
+```
+
+
+### Items Within Generic Impls
+
+In Rust one can define items within generic items, e.g. functions or
+impls, like in the following example:
+
+```rust
+struct Foo<T>(T);
+
+impl<T> From<T> for Foo<T> {
+  fn from(x: T) -> Self {
+    static MSG: &str = "...";
+    panic!("{}", MSG)
+  }
+}
+```
+
+The `MSG` here (or any other such nested definition) does not inherit
+the generic context from the `impl`. `MSG` is non-generic, and a
+function defined in its place would be too. The fully qualified name
+of `MSG`, according to our examples so far, is thus
+`<mycrate::Foo<_> as std::convert::From<_>>::from::MSG` and its symbol name:
+
+```
+_RNvNvXINtC7mycrate3FoopEINtNtC3std7convert4FrompE4from3MSG
+```
+
+However, with trait specialization, this symbol can be ambiguous.
+Consider the following piece of code:
+
+```rust
+struct Foo<T>(T);
+
+impl<T> From<T> for Foo<T> {
+  default fn from(x: T) -> Self {
+    static MSG: &str = "...";
+    panic!("{}", MSG)
+  }
+}
+
+impl<T: Default> From<T> for Foo<T> {
+  fn from(x: T) -> Self {
+    static MSG: &str = "123";
+    panic!("{}", MSG)
+  }
+}
+```
+
+Notice that both `MSG` statics have the path `<Foo<_> as From<_>>::foo::MSG`.
+We somehow have to disambiguate the `impls`. We do so by adding the path of
+the `impl` to the symbol name.
+
+```
+<path> = C <identifier>                      // crate-id root
+       | M <impl-path> <type>                // inherent impl root
+       | X <impl-path> <type> <path>         // trait impl root
+       | N <namespace> <path> <identifier>   // nested path
+       | I <path> {<generic-arg>} E          // generic arguments
+
+<impl-path> = [<disambiguator>] <path>
+```
+
+The two symbol names would then look something like:
+
+```
+_RNvNvXs2_C7mycrateINtC7mycrate3FoopEINtNtC3std7convert4FrompE4from3MSG
+_RNvNvXs3_C7mycrateINtC7mycrate3FoopEINtNtC3std7convert4FrompE4from3MSG
+       <----------><----------------><----------------------->
+        impl-path      self-type            trait-name
+```
+
+Like other disambiguation information, this path would usually not actually
+be shown by demanglers.
+
+### Unicode Identifiers
+
+Rust allows Unicode identifiers but our character set is restricted
+to ASCII alphanumerics, and `_`. In order to transcode the former to
+the latter, we use the same approach as Swift, which is: encode all
+non-ASCII identifiers via [Punycode][punycode], a standardized and
+efficient encoding that keeps encoded strings in a rather
+human-readable format. So for example, the string
+
+```
+"Gödel, Escher, Bach"
+```
+
+is encoded as
+
+```
+"Gdel, Escher, Bach-d3b"
+```
+
+which, as opposed to something like _Base64_, still gives a pretty
+good idea of what the original string looked like.
+
+Each component of a name, i.e. anything that starts with the number
+of bytes to read in the examples above, is encoded individually.
+Components encoded this way are augmented with a `u` prefix so that
+demanglers know that the identifier needs further decoding. As an
+example, the function:
+
+```rust
+mod gödel {
+  mod escher {
+    fn bach() {}
+  }
+}
+```
+
+would be mangled as:
+
+```
+_RNvNtNtC7mycrateu8gdel_Fqa6escher4bach
+                 <-------->
+              Unicode component
+```
+
+### Compression/Substitution
+
+The length of symbol names has an influence on how much work the compiler,
+linker, and loader have to perform. The shorter the names, the better.
+At the same time, Rust's generics can lead to rather long names (which
+are often not visible in the code because of type inference and
+`impl Trait`). For example, the return type of the following function:
+
+```rust
+fn quux(s: Vec<u32>) -> impl Iterator<Item = (u32, usize)> {
+    s.into_iter()
+     .map(|x| x + 1)
+     .filter(|&x| x > 10)
+     .zip(0..)
+     .chain(iter::once((0, 0)))
+}
+```
+
+is
+
+```rust
+std::iter::Chain<
+  std::iter::Zip<
+    std::iter::Filter<
+      std::iter::Map<
+        std::vec::IntoIter<u32>,
+        [closure@src/main.rs:16:11: 16:18]>,
+      [closure@src/main.rs:17:14: 17:25]>,
+    std::ops::RangeFrom<usize>>,
+  std::iter::Once<(u32, usize)>>
+```
+
+It would make for a long symbol name if this type is used (maybe
+repeatedly) as a generic argument somewhere. C++ has the same problem
+with its templates; which is why the Itanium mangling introduces the
+concept of compression. If a component of a definition occurs more than
+once, it will not be repeated and instead be emitted as a substitution
+marker that allows to reconstruct which component it refers to. The
+scheme proposed here will use the same approach (but with a simpler
+definition).
+
+The exact scheme will be described in detail in the reference level
+explanation below but it roughly works as follows: As a mangled symbol
+name is being built, we remember the position of every substitutable item
+in the output string, that is, we keep track of things a subsequent
+occurrence of which could be replaced by a back reference.
+
+The things that are eligible for substitution are (1) all prefixes of
+paths (including the entire path itself), (2) all types except for
+basic types, and (3) type-level constants (array lengths and values
+passed to const generic params).
+
+Here's an example in order to illustrate the concept. The name
+
+```
+std::iter::Chain<std::iter::Zip<std::vec::IntoIter<u32>, std::vec::IntoIter<u32>>>
+```
+
+is mangled to the following uncompressed string. The lines below show parts
+of the mangled string that already occurred before and can thus be replaced
+by a back reference. The number of at the beginning of each span given
+the 0-based byte position of where it occurred the first time.
+
+```
+  0         10        20        30        40        50        60        70        80        90
+_RINtNtC3std4iter5ChainINtNtC3std4iter3ZipINtNtC3std3vec8IntoItermEINtNtC3std3vec8IntoItermEEE
+                            5----              5----                    5----
+                          3-----------                                43---------
+                                                                    41--------------------
+                                                                   40-----------------------
+```
+
+The compiler is always supposed to use the longest replacement possible
+in order to achieve the best compression. The compressed symbol looks
+as follows:
+
+```
+_RINtNtC3std4iter5ChainINtB2_3ZipINtNtB4_3vec8IntoItermEBt_EE
+                          ^^^         ^^^               ^^^     back references
+```
+
+Back references have the form `B<base-62-number>_`.
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+The reference-level explanation consists of three parts:
+
+1. A specification of the syntax mangled names conform to.
+2. A specification of the compression scheme.
+3. A mapping of Rust entities to the mangling syntax.
+
+For implementing a demangler, only the first two sections are of
+interest, that is, a demangler only needs to understand syntax and
+compression of names, but it does not have to care about how the
+compiler generates mangled names.
+
+
+## Syntax Of Mangled Names
+
+The syntax of mangled names is given in extended Backus-Naur form:
+
+ - Non-terminals are within angle brackets (as in `<name-prefix>`)
+ - Terminals are within quotes (as in `"_R"`),
+ - Optional parts are in brackets (as in `[<decimal>]`),
+ - Repetition (zero or more times) is signified by curly braces (as in `{<name-prefix>}`)
+ - Comments are marked with `//`.
+
+Mangled names conform to the following grammar:
+
+```
+// The <decimal-number> specifies the encoding version.
+<symbol-name> = "_R" [<decimal-number>] <path> [<instantiating-crate>]
+
+<path> = "C" <identifier>               // crate root
+       | "M" <impl-path> <type>         // <T> (inherent impl)
+       | "X" <impl-path> <type> <path>  // <T as Trait> (trait impl)
+       | "Y" <type> <path>              // <T as Trait> (trait definition)
+       | "N" <ns> <path> <identifier>   // ...::ident (nested path)
+       | "I" <path> {<generic-arg>} "E" // ...<T, U> (generic args)
+       | <backref>
+
+// Path to an impl (without the Self type or the trait).
+// The <path> is the parent, while the <disambiguator> distinguishes
+// between impls in that same parent (e.g. multiple impls in a mod).
+// This exists as a simple way of ensure uniqueness, and demanglers
+// don't need to show it (unless the location of the impl is desired).
+<impl-path> = [<disambiguator>] <path>
+
+// The <decimal-number> is the length of the identifier in bytes.
+// <bytes> is the identifier itself and must not start with a decimal digit.
+// If the "u" is present then <bytes> is Punycode-encoded.
+<identifier> = [<disambiguator>] <undisambiguated-identifier>
+<disambiguator> = "s" <base-62-number>
+<undisambiguated-identifier> = ["u"] <decimal-number> <bytes>
+
+// Namespace of the identifier in a (nested) path.
+// It's an a-zA-Z character, with a-z reserved for implementation-internal
+// disambiguation categories (and demanglers should never show them), while
+// A-Z are used for special namespaces (e.g. closures), which the demangler
+// can show in a special way (e.g. `NC...` as `...::{closure}`), or just
+// default to showing the uppercase character.
+<ns> = "C"      // closure
+     | "S"      // shim
+     | <A-Z>    // other special namespaces
+     | <a-z>    // internal namespaces
+
+<generic-arg> = <lifetime>
+              | <type>
+              | "K" <const> // forward-compat for const generics
+
+// An anonymous (numbered) lifetime, either erased or higher-ranked.
+// Index 0 is always erased (can show as '_, if at all), while indices
+// starting from 1 refer (as de Bruijn indices) to a higher-ranked
+// lifetime bound by one of the enclosing <binder>s.
+<lifetime> = "L" <base-62-number>
+
+// Specify the number of higher-ranked (for<...>) lifetimes to bound.
+// <lifetime> can then later refer to them, with lowest indices for
+// innermost lifetimes, e.g. in `for<'a, 'b> fn(for<'c> fn(...))`,
+// any <lifetime>s in ... (but not inside more binders) will observe
+// the indices 1, 2, and 3 refer to 'c, 'b, and 'a, respectively.
+<binder> = "G" <base-62-number>
+
+<type> = <basic-type>
+       | <path>                      // named type
+       | "A" <type> <const>          // [T; N]
+       | "S" <type>                  // [T]
+       | "T" {<type>} "E"            // (T1, T2, T3, ...)
+       | "R" [<lifetime>] <type>     // &T
+       | "Q" [<lifetime>] <type>     // &mut T
+       | "P" <type>                  // *const T
+       | "O" <type>                  // *mut T
+       | "F" <fn-sig>                // fn(...) -> ...
+       | "D" <dyn-bounds> <lifetime> // dyn Trait<Assoc = X> + Send + 'a
+       | <backref>
+
+<basic-type> = "a"      // i8
+             | "b"      // bool
+             | "c"      // char
+             | "d"      // f64
+             | "e"      // str
+             | "f"      // f32
+             | "h"      // u8
+             | "i"      // isize
+             | "j"      // usize
+             | "l"      // i32
+             | "m"      // u32
+             | "n"      // i128
+             | "o"      // u128
+             | "s"      // i16
+             | "t"      // u16
+             | "u"      // ()
+             | "v"      // ...
+             | "x"      // i64
+             | "y"      // u64
+             | "z"      // !
+             | "p"      // placeholder (e.g. for generic params), shown as _
+
+// If the "U" is present then the function is `unsafe`.
+// The return type is always present, but demanglers can
+// choose to omit the ` -> ()` by special-casing "u".
+<fn-sig> := <binder> ["U"] ["K" <abi>] {<type>} "E" <type>
+
+<abi> = "C"
+      | <undisambiguated-identifier>
+
+<dyn-bounds> = <binder> {<dyn-trait>} "E"
+<dyn-trait> = <path> {<dyn-trait-assoc-binding>}
+<dyn-trait-assoc-binding> = "p" <undisambiguated-identifier> <type>
+<const> = <type> <const-data>
+        | <type> "p" // placeholder (e.g. for polymorphic constants), shown as _: T
+        | <backref>
+
+// The encoding of a constant depends on its type, currently only
+// unsigned integers (mainly usize, for arrays) are supported, and they
+// use their value, in base 16 (0-9a-f), not their memory representation.
+<const-data> = {<hex-digit>} "_"
+
+// <base-62-number> uses 0-9-a-z-A-Z as digits, i.e. 'a' is decimal 10 and
+// 'Z' is decimal 61.
+// "_" with no digits indicates the value 0, while any other value is offset
+// by 1, e.g. "0_" is 1, "Z_" is 62, "10_" is 63, etc.
+<base-62-number> = {<0-9a-zA-Z>} "_"
+
+<backref> = "B" <base-62-number>
+
+// We use <path> here, so that we don't have to add a special rule for
+// compression. In practice, only a crate root is expected.
+<instantiating-crate> = <path>
+```
+
+### Namespace Tags
+
+Namespaces are identified by an implementation defined single character tag
+(the `<ns>` production). Only closures (`C`) and shims (`S`) have a
+specific character assigned to them so that demanglers can reliable
+adjust their output accordingly. Other namespace tags have to be omitted
+or shown verbatim during demangling.
+
+This is a concession to the compiler's current implementation. While the
+language only knows two namespaces (the type and the value namespace), the
+compiler uses many more in some important data structures and disambiguation
+indices are assigned according to these internal data structures. So, in
+order not to force the compiler to waste processing time on re-constructing
+different disambiguation indices, the internal unspecified "namespaces" are
+used. This may change in the future.
+
+### Type-Level Constants
+
+As described above, the grammar encodes constant values via the
+`<const-data> = {<hex-digit>} "_"` production, where `{<hex-digit>}` is
+the numeric value of the constant, not its representation as bytes. Using
+the numeric value is platform independent but does not easily scale to
+non-integer data types.
+
+In the future it is likely that Rust will support complex type-level
+constants (i.e. not just integers). This RFC suggests to develop a
+proper mangling for these as part of the future const-generics work,
+and, for now, only define a mangling for integer values.
+
+### Punycode Identifiers
+
+Punycode generates strings of the form `([[:ascii:]]+-)?[[:alnum:]]+`.
+This is problematic for two reasons:
+
+- Generated strings can contain a `-` character; which is not in the
+  supported character set.
+- Generated strings can start with a digit; which makes them clash
+  with the byte-count prefix of the `<identifier>` production.
+
+For these reasons, vanilla Punycode string are further encoded during mangling:
+
+- The `-` character is simply replaced by a `_` character.
+- The part of the Punycode string that encodes the non-ASCII characters
+  is a base-36 number, using `[a-z0-9]` as its "digits". We want to get
+  rid of the decimal digits in there, so we simply remap `0-9` to `A-J`.
+
+Here are some examples:
+
+| Original        | Punycode        | Punycode + Encoding |
+|-----------------|-----------------|---------------------|
+| føø             | f-5gaa          | f_Fgaa              |
+| α_ω             | _-ylb7e         | __ylbHe             |
+| 铁锈             | n84amf          | nIEamf              |
+| 🤦              | fq9h            | fqJh                |
+| ρυστ            | 2xaedc          | Cxaedc              |
+
+With this post-processing in place the Punycode strings can be treated
+like regular identifiers and need no further special handling.
+
+
+## Compression
+
+Symbol name compression works by substituting parts of the mangled
+name that have already been seen for a back reference. Compression
+is directly built into the mangling algorithm, as shown by the
+following piece of pseudocode:
+
+```rust
+fn mangle(node, output_string, substitution_dictionary) {
+    if let Some(backref) = substitution_dictionary.get(node) {
+        // Emit the backref instead of the node's contents
+        mangle(backref, output_string)
+    } else {
+        // Remember where the current node starts in the output
+        let start_position = output_string.len()
+
+        // Do the actual mangling, including recursive mangling of child nodes
+
+        // Add the current node to the substitution dictionary
+        if node.is_substitutable() {
+            substitution_dictionary.insert(node, start_position)
+        }
+    }
+}
+```
+
+This algorithm automatically chooses the best compression because
+parent nodes (which are always larger) are visited before child
+nodes.
+
+Note that this kind of compression relies on the fact that all
+substitutable AST nodes have a self-terminating mangled form,
+that is, given the start position of the encoded node, the
+grammar guarantees that it is always unambiguous where the
+node ends. This is ensured by not allowing optional or
+repeating elements at the end of substitutable productions.
+
+
+### Decompression
+
+Decompression too is built directly into demangling/parsing. When a back
+reference is encountered, we decode the referenced position and use a
+temporary demangler/parser to do the decoding of the node's actual content:
+
+```rust
+fn demangle_at(&mut pos, mangled, output_string) {
+  if is_backref(*pos, mangled) {
+    // Read the byte offset of the referenced node and
+    // advance `pos` past the backref.
+    let mut referenced_pos = decode(pos, mangled);
+    demangle_at(&mut referenced_pos, mangled, output_string)
+  } else {
+    // do regular demangling
+  }
+}
+```
+
+Using byte offsets as backref keys (as this RFC does) instead of post-order
+traversal indices (as Itanium mangling does) has the advantage that the
+demangler does not need to duplicate the mangler's substitution indexing logic,
+something that can become quite complex (as demonstrated by the compression
+scheme proposed in the initial version of this RFC).
+
+### A Note On Implementing Efficient Demanglers
+
+The mangling syntax is constructed in a way that allows for implementing
+efficient demanglers:
+
+ - Mangled names contain information in the same order as unmangled
+   names are expected to contain it. Therefore, a demangler can directly
+   generate its output while parsing the mangled form. There is no need
+   to explicitly instantiate the AST in memory.
+
+ - The same is true for decompression. Decompression can be done without
+   allocating memory outside of the stack. Alternatively the demangler
+   can keep a simple array that maps back-ref indices to ranges in the
+   already generated output. When it encounters a `<backref>` in need
+   of expansion, it can just look up corresponding range and do a
+   simple `memcpy`.
+
+Parsing, decompression, and demangling can thus be done in a single pass
+over the mangled name without the need for complex data structures, which
+is useful when having to implement `#[no_std]` or C demanglers. (Note that
+Punycode can complicate decoding slightly because it needs dynamic memory
+allocation in the general case but it can be implemented with an on-stack
+buffer for a reasonable maximum supported length).
+
+## Mapping Rust Language Entities to Symbol Names
+
+This RFC suggests the following mapping of Rust entities to mangled names:
+
+- Named functions, methods, and statics shall be represented by a
+  `<path>` production.
+
+- Paths should be rooted at the inner-most entity that can act
+  as a path root. Roots can be crate-ids, inherent impls, trait impls, and
+  (for items within default methods) trait definitions.
+
+- The compiler is free to choose disambiguation indices and namespace tags from
+  the reserved ranges as long as it ascertains identifier unambiguity.
+
+- Generic arguments that are equal to the default should not be encoded in
+  order to save space.
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Why should we *not* do this?
+
+- The current/legacy scheme based on symbol-hashes is flexible in that
+  hashes can be changed at will. That is, the unstable part of the
+  current mangling scheme is nicely contained and does not keep breaking
+  external tools. The danger of breakage is greater with the scheme
+  proposed here because it exposes more information.
+
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+The alternatives considered are:
+
+ 1. Keeping the current scheme. It does meet the minimum requirements
+    after all. However, the general consensus seems to be that this
+    leads to situations where people are unpleasantly surprised when
+    they come across (demangled) symbol names in backtraces or profilers.
+
+ 2. Keeping the current scheme but cleaning it up by making the non-hash
+    part more consistent and more expressive. Keep the hash part as a
+    safeguard against symbol conflicts and the rest as something just
+    for demangling. The downside of this is that the hash would still
+    not be predictable, and symbols would get rather long if they should
+    contain more human-readable information about generic arguments.
+
+ 3. Define a standardized pretty-printing format for things that end up
+    as symbols, and then encode that via Punycode in order to meet the
+    character set restrictions. This would be rather simple. Symbol names
+    would remain somewhat human-readable (but not very, because all
+    separators would be stripped out). But without some kind of additional
+    compression, symbol names would become rather long.
+
+ 4. Use the scheme from the previous bullet point but apply the compression
+    scheme described above. We could do this but it wouldn't really be less
+    complex than the scheme proposed by the RFC.
+
+ 5. Define a standardized pretty-printing format for things that end up as
+    symbols, compress with `zstd` (specially trained for Rust symbols) and
+    encode the result as `base63`. This is rather simple but loses all
+    human-readability. It's unclear how well this would compress. It would
+    pull the `zstd` specification into the mangling scheme specification,
+    as well as the pre-trained dictionary.
+
+# Prior art
+[prior-art]: #prior-art
+
+One of the major modern mangling schemes with a public specification is the
+[Itanium C++ ABI][itanium-mangling] scheme for C++ which is used by the GCC
+toolchain. An initial version of this RFC sticked closely to Itanium mangling,
+however, the latest version only retains the run-length encoding for
+identifiers and some literals for tagging things like basic types. The
+Itanium scheme has been criticized for being overly complex, due to its
+extensive grammar and two separate compression schemes.
+
+The idea of using [Punycode][punycode] for handling of unicode identifiers
+is taken from the [Swift][swift-gh] programming language's
+[mangling scheme][swift-mangling].
+
+[punycode]: https://tools.ietf.org/html/rfc3492
+[itanium-mangling]: http://refspecs.linuxbase.org/cxxabi-1.86.html#mangling
+[swift-gh]: https://github.com/apple/swift
+[swift-mangling]: https://github.com/apple/swift/blob/master/docs/ABI/Mangling.rst#identifiers
+
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+### Punycode vs UTF-8
+During the pre-RFC phase, it has been suggested that Unicode identifiers
+should be encoded as UTF-8 instead of Punycode on platforms that allow it.
+GCC, Clang, and MSVC seem to do this. The author of the RFC has a hard
+time making up their mind about this issue. Here are some interesting
+points that might influence the final decision:
+
+- Using UTF-8 instead of Punycode would make mangled strings containing
+  non-ASCII identifiers a bit more human-readable. For demangled strings,
+  there would be no difference.
+
+- Punycode support is non-optional since some platforms only allow a very
+  limited character set for symbol names. Thus, we would be using UTF-8 on
+  some platforms and Punycode on others, making it harder to predict what a
+  symbol name for a given item looks like.
+
+- Punycode encoding and decoding is more runtime effort for the mangler
+  and demangler.
+
+- Once a demangler supports Punycode, it is not much effort to support
+  both encodings. The `u` identifier prefix tells the demangler whether
+  it is Punycode. Otherwise it can just assume UTF-8 which already
+  subsumes ASCII.
+
+**UPDATE**: This RFC recommends that Punycode encoded identifiers must
+be supported by demanglers but that it is up to the compiler implementation
+(for now) to decide whether to use it for a given platform. This question
+will have to be revisited if Rust ever wants to define a stable ABI.
+
+### Encoding parameter types for function symbols
+
+It has been suggested that parameter types for functions and methods should
+be encoded in mangled form too. This is not necessary for symbol name
+uniqueness but it would provide an additional safeguard against silent
+ABI-related errors where definition and callers of some function make
+different assumptions about what parameters a function takes. The RFC
+does not propose to do this because:
+
+ - Rust makes sure this cannot happen via crate metadata,
+ - it would make symbol names longer, and
+ - only some but not all ABI related errors are caught by the safeguard.
+
+However, a final decision on the topic has not been made yet.
+
+**UPDATE**: This RFC suggests that parameter types are *not* encoded into
+function and method symbols. Symbol names will already get significantly
+longer due to encoding additional information and the additional
+safeguard provided against ABI mismatches is less relevant for Rust
+than it is for other languages that don't have a concept of
+library/crate metadata.
+
+
+# Appendix A - Suggested Demangling
+
+This RFC suggests that names are demangled to a form that matches Rust
+syntax as it is used in source code, compiler error messages and `rustdoc`:
+
+ - Path components should be separated by `::`.
+
+ - If the path root is a `<crate-id>` it should be printed as the crate name.
+   If the context requires it for correctness, the crate disambiguator can be
+   printed too, as in, for example, `std[a0b1c2d3]::collections::HashMap`.
+   In this case `a0b1c2d3` would be the disambiguator. Usually, the
+   disambiguator can be omitted for better readability.
+
+ - If the path root is an impl, it should be printed as `<SelfType>` (for
+   inherent impls) or `<SelfType as Trait>` (for trait impls), like the
+   compiler does in error messages. The `<impl-path>` also contained in the
+   AST node should usually be omitted.
+
+ - The list of generic arguments should be demangled as `<T1, T2, T3>`.
+
+ - Identifiers can have a numeric disambiguator
+   (the `<disambiguator>` production). The syntactic version of the numeric
+   disambiguator maps to a numeric index. If the disambiguator is not
+   present, this index is 0. If it is of the form `s_` then the index is 1.
+   If it is of the form `s<base-62-digit>_` then the index is
+   `<base-62-digit> + 2`. The suggested demangling of a disambiguator is
+   `[<index>]`. However, for better readability, these disambiguators
+   should usually be omitted in the demangling altogether. Disambiguators
+   with index zero can _always_ be omitted.
+
+   The exception here are closures. Since these do not have a name, the
+   disambiguator is the only thing identifying them. The suggested
+   demangling for closures is thus `{closure}[<index>]`.
+
+# Appendix B - Examples
+
+We assume that all examples are defined in a crate named `mycrate[1234]`.
+
+
+### Free-standing Item
+
+```rust
+mod foo {
+  mod bar {
+    fn baz() {}
+  }
+}
+```
+- unmangled: `mycrate::foo::bar::baz`
+- mangled: `_RNvNtNtCs1234_7mycrate3foo3bar3baz`
+
+
+### Item Defined In Inherent Method
+
+```rust
+struct Foo<T>(T);
+
+impl<T> Foo<T> {
+  pub fn bar<U>(_: U) {
+    static QUUX: u32 = 0;
+    // ...
+  }
+}
+```
+
+- unmangled: `<mycrate::Foo<_>>::bar::QUUX`
+- mangled: `_RNvNvMCs1234_7mycrateINtCs1234_7mycrate3FoopE3bar4QUUX`
+
+
+### Item Defined In Trait Method
+
+```rust
+struct Foo<T>(T);
+
+impl<T> Clone for Foo<T> {
+  fn clone(&self) -> Self {
+    static QUUX: u32 = 0;
+    // ...
+  }
+}
+```
+- unmangled: `<mycrate::Foo<_> as std::clone::Clone>::clone::QUUX`
+- mangled: `_RNvNvXCs1234_7mycrateINtCs1234_7mycrate3FoopENtNtC3std5clone5Clone5clone4QUUX`
+
+
+### Item Defined In Initializer Of A Static
+
+```rust
+pub static QUUX: u32 = {
+  static FOO: u32 = 1;
+  FOO + FOO
+};
+```
+- unmangled: `mycrate::QUUX::FOO`
+- mangled: `_RNvNvCs1234_7mycrate4QUUX3FOO`
+
+
+### Compressed Prefix Constructed From Prefix That Contains A Substitution Itself - TODO
+- unmangled: `mycrate::foo<mycrate::bar,mycrate::bar::baz>`
+- mangled: `_RINvCs1234_7mycrate3fooNvB4_3barNvBn_3bazE`
+
+
+### Progressive type compression
+- unmangled: `std::foo<(std::Bar,std::Bar),(std::Bar,std::Bar)>`
+- mangled: `_RINxC3std3fooTNyB4_3BarBe_EBd_E`
+
+
+# Appendix C - Change LOG
+- Removed mention of Itanium mangling in introduction.
+- Weakened "predictability" goal.
+- Removed non-goal of not providing a mangling for lifetimes.
+- Added non-goal for not trying to standardize the demangled form.
+- Updated specification and examples to new grammar as proposed by eddyb.
+- `impl` disambiguation strategy changed to using the impl path instead of param bounds.
+- Updated prior art section to not say this RFC is an adaptation of Itanium mangling.
+- Updated compiler's expected assignment of disambiguation indices and namespace tags.
+- Removed "complexity" drawback since the scheme is not very complex anymore.
+- Removed unresolved question "Re-use `<disambiguator>` for crate disambiguator".
+- Added note about default generic arguments to reference-level-explanation.
+- Added note about Punycode making decoding more complicated.
+- Resolve question of complex constant data.
+- Add a recommended resolution for open question around Punycode identifiers.
+- Add a recommended resolution for open question around encoding function parameter types.