From 220e2e728aa9d8c22e7c34bd7fb2a55a0208a714 Mon Sep 17 00:00:00 2001 From: Joshua Chen Date: Sun, 20 Nov 2022 21:33:35 -0500 Subject: [PATCH] Fix --- .../deprecated_and_obsolete_features/index.md | 2 +- .../regular_expressions/backreference/index.md | 2 +- .../regular_expressions/capturing_group/index.md | 2 +- .../regular_expressions/character_class/index.md | 8 ++++---- .../character_class_escape/index.md | 4 ++-- .../{escape_character => character_escape}/index.md | 11 ++++++----- .../regular_expressions/lookahead_assertion/index.md | 10 +++++----- .../regular_expressions/lookbehind_assertion/index.md | 6 +++--- .../regular_expressions/named_backreference/index.md | 2 +- .../named_capturing_group/index.md | 2 +- .../reference/regular_expressions/quantifier/index.md | 6 ++---- .../unicode_character_class_escape/index.md | 4 ++-- .../reference/regular_expressions/wildcard/index.md | 2 +- 13 files changed, 30 insertions(+), 31 deletions(-) rename files/en-us/web/javascript/reference/regular_expressions/{escape_character => character_escape}/index.md (87%) diff --git a/files/en-us/web/javascript/reference/deprecated_and_obsolete_features/index.md b/files/en-us/web/javascript/reference/deprecated_and_obsolete_features/index.md index 8bf16bdc9b81786..2ba507f9a21af6f 100644 --- a/files/en-us/web/javascript/reference/deprecated_and_obsolete_features/index.md +++ b/files/en-us/web/javascript/reference/deprecated_and_obsolete_features/index.md @@ -67,7 +67,7 @@ The following regex syntaxes are deprecated and only available in non-[unicode]( - [Lookahead assertions](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion) are [quantifiable](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Quantifier). - [Backreferences](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Backreference) that do not refer to an existing capturing group become [legacy octal escapes](#escape_sequences). 
- In [character classes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class), character ranges where one boundary is a character class makes the `-` become a literal character. -- An escape sequence that's not one of the recognized kinds becomes an ["identity escape"](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Escape_character). +- An escape sequence that's not one of the recognized kinds becomes an ["identity escape"](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape). - Escape sequences within [character classes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class) of the form `\cX` where `X` is a number or `_` are decoded in the same way as those with ASCII letters: `\c0` is the same as `\cP` when taken modulo 32. In addition, if the form `\cX` is encountered anywhere where `X` is not one of the recognized characters, then the backslash is treated as a literal character. - The sequence `\k` within a regex that doesn't have any [named capturing groups](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Named_capturing_group) is treated as an identity escape. - The syntax characters `]`, `{`, and `}` may appear literally without escaping if they cannot be interpreted as the end of a character class or quantifier delimiters. diff --git a/files/en-us/web/javascript/reference/regular_expressions/backreference/index.md b/files/en-us/web/javascript/reference/regular_expressions/backreference/index.md index 285f00af65218c2..210fb2bc56b805a 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/backreference/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/backreference/index.md @@ -35,7 +35,7 @@ The backreference must refer to an existent capturing group. 
If the number it sp /(a)\2/u; // SyntaxError: Invalid regular expression: Invalid escape ``` -In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, invalid backreferences become a [legacy octal escape](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#escape_sequences) sequence. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, invalid backreferences become a [legacy octal escape](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#escape_sequences) sequence. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ```js /(a)\2/.test("a\x02"); // true diff --git a/files/en-us/web/javascript/reference/regular_expressions/capturing_group/index.md b/files/en-us/web/javascript/reference/regular_expressions/capturing_group/index.md index 9aa3ecd3697e8c9..c9d08a82cf728c4 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/capturing_group/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/capturing_group/index.md @@ -49,7 +49,7 @@ Capturing groups can be used in [lookahead](/en-US/docs/Web/JavaScript/Reference ```js /c(?=(ab))/.exec("cab"); // ['', 'ab'] /(?<=(a)(b))c/.exec("abc"); // ['', 'a', 'b'] -/(?<=([ab])+)c/.exec("abc"); // ['', 'a']; because "a" is seen by the lookbehind after it's seen "b" +/(?<=([ab])+)c/.exec("abc"); // ['', 'a']; because "a" is seen by the lookbehind after the lookbehind has seen "b" ``` Capturing groups can be nested, in which case the outer group is numbered first, then the inner group, because they are ordered by their opening parentheses. 
If a nested group is repeated by a quantifier, then each time the group matches, the subgroups' results are all overwritten, sometimes with `undefined`. diff --git a/files/en-us/web/javascript/reference/regular_expressions/character_class/index.md b/files/en-us/web/javascript/reference/regular_expressions/character_class/index.md index 81d084baa9f7402..a935865fa8ce92a 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/character_class/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/character_class/index.md @@ -21,7 +21,7 @@ A **character class** matches any character in or not in a custom set of charact ## Description -A character class specifies a list of characters between square bracket and matches any character in the list. The following syntaxes are available: +A character class specifies a list of characters between square brackets and matches any character in the list. The following syntaxes are available: - A single character: matches the character itself. - A range of characters: matches any character in the inclusive range. The range is specified by two characters separated by a dash (`-`). The first character must be smaller in character value than the second character. @@ -35,15 +35,15 @@ Unlike other parts of the regex, character classes interpret most character lite - The `]` character indicates the end of the character class. To use it literally, escape it as `\]`. - The dash (`-`) character, when used between two characters, indicates a range. When it appears at the start or end of a character class, it is a literal character. It's also a literal character when it's used in the boundary of a range. For example, `[a-]` matches the characters `a` and `-`, `[!--]` matches the characters `!` to `-`, and `[--9]` matches the characters `-` to `9`. You can also escape it as `\-` if you want to use it literally anywhere. 
-The [lexical grammar](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#regular_expression_literals) does a very rough parse of regex literals, so that it does not end the regex literal at a `/` character that appears within a character class. This means `/[/]/` is valid without needing to escape the `/`. +The [lexical grammar](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#regular_expression_literals) does a very rough parse of regex literals, so that it does not end the regex literal at a `/` character which appears within a character class. This means `/[/]/` is valid without needing to escape the `/`. -The boundaries of a character range must not specify more than one character, which happens if you use a [character class escape](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class_escape). For example, +The boundaries of a character range must not specify more than one character, which happens if you use a [character class escape](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class_escape). For example: ```js /[\s-9]/u; // SyntaxError: Invalid regular expression: Invalid character class ``` -In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, character ranges where one boundary is a character class makes the `-` become a literal character. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, character ranges where one boundary is a character class makes the `-` become a literal character. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. 
```js /[\s-9]/.test("-"); // true diff --git a/files/en-us/web/javascript/reference/regular_expressions/character_class_escape/index.md b/files/en-us/web/javascript/reference/regular_expressions/character_class_escape/index.md index 423c7bf16749812..83393288e71d5ef 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/character_class_escape/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/character_class_escape/index.md @@ -22,7 +22,7 @@ Unlike [character escapes](/en-US/docs/Web/JavaScript/Reference/Regular_expressi - `\d` - : Matches any digit character. Equivalent to `[0-9]`. - `\w` - - : Matches any word character. + - : Matches any word character, where a word character includes letters (A–Z, a–z), numbers (0–9), and underscore (_). If the [`u`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) and [`i`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/ignoreCase) flags are both set, it also matches other Unicode characters that get canonicalized to one of the characters above through [case folding](https://unicode.org/Public/UCD/latest/ucd/CaseFolding.txt). - `\s` - : Matches any [whitespace](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#white_space) or [line terminator](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#line_terminators) character. @@ -30,7 +30,7 @@ The uppercase forms `\D`, `\W`, and `\S` negates the match or `\d`, `\w`, and `\ [Unicode character class escapes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Unicode_character_class_escape) start with `\p` and `\P`, but they are only supported in [unicode mode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode). In non-unicode mode, they are [identity escapes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape) for the `p` or `P` character. 
-Character class escapes can be used in [character classes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class). However, using them as boundaries of character ranges is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +Character class escapes can be used in [character classes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class). However, they cannot be used as boundaries of character ranges. This is only allowed as a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ## See also diff --git a/files/en-us/web/javascript/reference/regular_expressions/escape_character/index.md b/files/en-us/web/javascript/reference/regular_expressions/character_escape/index.md similarity index 87% rename from files/en-us/web/javascript/reference/regular_expressions/escape_character/index.md rename to files/en-us/web/javascript/reference/regular_expressions/character_escape/index.md index 731cbd9cbf4baa0..5fba96da1066eb1 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/escape_character/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/character_escape/index.md @@ -27,10 +27,10 @@ A **character escape** represents a character that may not be able to be conveni ## Description -The following escape characters are recognized in regular expressions: +The following character escapes are recognized in regular expressions: - `\f`, `\n`, `\r`, `\t`, `\v` - - : Same as those in [string literals](/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#escape_sequences), except `\b` represents a [word boundary](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Word_boundary_assertion) in regexes unless in a [character class](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class). 
+ - : Same as those in [string literals](/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#escape_sequences), except `\b`, which represents a [word boundary](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Word_boundary_assertion) in regexes unless in a [character class](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class). - `\c` followed by a letter from `A` to `Z` or `a` to `z` - : Represents the control character with value equal to the letter's character value modulo 32. For example, `\cJ` represents line break (`\n`), because the code point of `J` is 74, and 74 modulo 32 is 10, which is the code point of line break. Because an uppercase letter and its lowercase form differ by 32, `\cJ` and `\cj` are equivalent. You can represent control characters from 1 to 26 in this form. - `\0` @@ -46,15 +46,16 @@ The following escape characters are recognized in regular expressions: In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, escape sequences that are not one of the above become _identity escapes_: they represent the character that follows the backslash. For example, `\a` represents the character `a`. This behavior limits the ability to introduce new escape sequences without causing backward compatibility issues, and is therefore forbidden in unicode mode. -In non-unicode mode, `]`, `{`, and `}` may appear literally if it's not possible to parse them as the end of a character class, or quantifier delimiters. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +In non-unicode mode, `]`, `{`, and `}` may appear literally if it's not possible to parse them as the end of a character class or quantifier delimiters. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. 
In non-unicode mode, escape sequences within [character classes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class) of the form `\cX` where `X` is a number or `_` are decoded in the same way as those with ASCII letters: `\c0` is the same as `\cP` when taken modulo 32. In addition, if the form `\cX` is encountered anywhere where `X` is not one of the recognized characters, then the backslash is treated as a literal character. These syntaxes are also deprecated. ```js -/\c/.test("\\c"); // true /[\c0]/.test("\x10"); // true /[\c_]/.test("\x1f"); // true -/\c0/.test("\\c0"); // true +/[\c*]/.test("\\"); // true +/\c/.test("\\c"); // true +/\c0/.test("\\c0"); // true (the \c0 syntax is only supported in character classes) ``` ## Examples diff --git a/files/en-us/web/javascript/reference/regular_expressions/lookahead_assertion/index.md b/files/en-us/web/javascript/reference/regular_expressions/lookahead_assertion/index.md index 4a4b477c6556c7e..bd5bf44efd1a8ea 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/lookahead_assertion/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/lookahead_assertion/index.md @@ -5,7 +5,7 @@ slug: Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion {{JsSidebar}} -A **lookahead assertion** "looks ahead": it attempts to match the subsequent input with the given pattern, but it does not consume any of the input — if the match is successful, the current position in the input is not advanced. +A **lookahead assertion** "looks ahead": it attempts to match the subsequent input with the given pattern, but it does not consume any of the input — if the match is successful, the current position in the input stays the same. ## Syntax @@ -23,7 +23,7 @@ A **lookahead assertion** "looks ahead": it attempts to match the subsequent inp A regular expression generally matches from left to right. 
This is why lookahead and [lookbehind](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookbehind_assertion) assertions are called as such — lookahead asserts what's on the right, and lookbehind asserts what's on the left. -In order for a `(?=pattern)` assertion to succeed, the `pattern` must match at the current position, but the current position is not advanced before matching the subsequent input. The `(?!pattern)` form negates the assertion — it succeeds if the `pattern` does not match at the current position. +In order for a `(?=pattern)` assertion to succeed, the `pattern` must match the text after the current position, but the current position is not changed. The `(?!pattern)` form negates the assertion — it succeeds if the `pattern` does not match at the current position. The `pattern` can contain [capturing groups](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Capturing_group). See the capturing groups page for more information on the behavior in this case. @@ -41,9 +41,9 @@ The matching of the pattern above happens as follows: 3. `\1` does not match the following string, because that requires 2 `"a"`s, but only 1 is available. So the matcher backtracks, but it doesn't go into the lookahead, so the capturing group cannot be reduced to 1 `"a"`, and the entire match fails at this point. 4. `exec()` re-attempts matching at the next position — before the second `"a"`. This time, the lookahead matches `"a"`, and `a*b` matches `"b"`. The backreference `\1` matches the captured `"a"`, and the match succeeds. -If the regex is able to backtrack into the lookahead and revise the choice made in there, then the match would succeed at step 3 by `(a+)` matching the first `"a"` (instead of the first two `"a"`s) and `a*b` matching `"ab"`, without even attempting the next input position. 
+If the regex is able to backtrack into the lookahead and revise the choice made in there, then the match would succeed at step 3 by `(a+)` matching the first `"a"` (instead of the first two `"a"`s) and `a*b` matching `"ab"`, without even re-attempting the next input position. -Negative lookaheads can contain capturing groups as well, but backreferences only make sense within the `pattern`, because if matching continues, `pattern` would necessarily be unmatched (otherwise the assertion fails). This means outside of the `pattern`, backreferences to capturing groups in negative lookaheads always succeed. For example: +Negative lookaheads can contain capturing groups as well, but backreferences only make sense within the `pattern`, because if matching continues, `pattern` would necessarily be unmatched (otherwise the assertion fails). This means outside of the `pattern`, backreferences to those capturing groups in negative lookaheads always succeed. For example: ```js /(.*?)a(?!(a+)b\1c)\1(.*)/.exec("baaabaac"); // ['baaabaac', 'ba', undefined, 'abaac'] @@ -58,7 +58,7 @@ The matching of the pattern above happens as follows: 5. At this position, the lookahead fails to match, because the remaining input does not follow the pattern "any number of `"a"`s, a `"b"`, the same number of `"a"`s, a `c`". This causes the assertion to succeed. 6. However, because nothing was matched within the assertion, the `\1` backreference has no value, so it matches the empty string. This causes the rest of the input to be consumed by the `(.*)` at the end. -Normally, assertions cannot be [quantified](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Quantifier). However, in non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, lookahead assertions) can be quantified. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. 
+Normally, assertions cannot be [quantified](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Quantifier). However, in non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, lookahead assertions can be quantified. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ```js /(?=a)?b/.test("b"); // true; the lookahead is matched 0 time diff --git a/files/en-us/web/javascript/reference/regular_expressions/lookbehind_assertion/index.md b/files/en-us/web/javascript/reference/regular_expressions/lookbehind_assertion/index.md index b3a61c53099eb50..4ca4f6601c50ca1 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/lookbehind_assertion/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/lookbehind_assertion/index.md @@ -5,7 +5,7 @@ slug: Web/JavaScript/Reference/Regular_expressions/Lookbehind_assertion {{JsSidebar}} -A **lookbehind assertion** "looks behind": it attempts to match the previous input with the given pattern, but it does not consume any of the input — if the match is successful, the current position in the input is not advanced. It matches each atom in its pattern in the reverse order. +A **lookbehind assertion** "looks behind": it attempts to match the previous input with the given pattern, but it does not consume any of the input — if the match is successful, the current position in the input stays the same. It matches each atom in its pattern in the reverse order. ## Syntax @@ -32,11 +32,11 @@ Lookbehind generally has the same semantics as lookahead — however, within a l // Not ['', 'ab', 'c'] ``` -If the lookbehind matches from left to right, it should first greedily match `[ab]+`, which makes the first group capture `"ab"`, and the remaining `"c"` is captured by `[bc]+`. 
However, because `[bc]+` is matched first, it greedily grabs the `"b"`, leaving only `"a"` for `[ab]+`. +If the lookbehind matches from left to right, it should first greedily match `[ab]+`, which makes the first group capture `"ab"`, and the remaining `"c"` is captured by `[bc]+`. However, because `[bc]+` is matched first, it greedily grabs `"bc"`, leaving only `"a"` for `[ab]+`. This behavior is reasonable — the matcher does not know where to _start_ the match (because the lookbehind may not be fixed-length), but it does know where to _end_ (at the current position). Therefore, it starts from the current position and works backwards. (Regexes in some other languages forbid non-fixed-length lookbehind to avoid this issue.) -For [quantified](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Quantifier) [capturing groups](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Capturing_group) inside the lookbehind, the match furthest to the left — instead of the one on the right — is captured because of backward matching. See the capturing groups page for more information. [Backreferences](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Backreference) inside the lookbehind must appear on the _left_ of the group it's referring to. However, [disjunctions](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Disjunction) are still attempted left-to-right. +For [quantified](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Quantifier) [capturing groups](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Capturing_group) inside the lookbehind, the match furthest to the left of the input string — instead of the one on the right — is captured because of backward matching. See the capturing groups page for more information. [Backreferences](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Backreference) inside the lookbehind must appear on the _left_ of the group it's referring to, also due to backward matching. 
However, [disjunctions](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Disjunction) are still attempted left-to-right. ## Examples diff --git a/files/en-us/web/javascript/reference/regular_expressions/named_backreference/index.md b/files/en-us/web/javascript/reference/regular_expressions/named_backreference/index.md index 55a55f56a8203b1..99488cd872121ba 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/named_backreference/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/named_backreference/index.md @@ -22,7 +22,7 @@ A **named backreference** refers to the submatch of a previous [named capturing Named backreferences are very similar to normal backreferences: it refers to the text matched by a capturing group and matches the same text. The difference is that you refer to the capturing group by name instead of by number. This makes the regular expression more readable and easier to refactor and maintain. -In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, the sequence `\k` only starts a named backreference if the regex contains at least one named capturing group. Otherwise, it is an [identity escape](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape) and is the same as the literal character `k`. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, the sequence `\k` only starts a named backreference if the regex contains at least one named capturing group. Otherwise, it is an [identity escape](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape) and is the same as the literal character `k`. 
This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ```js /\k/.test("k"); // true diff --git a/files/en-us/web/javascript/reference/regular_expressions/named_capturing_group/index.md b/files/en-us/web/javascript/reference/regular_expressions/named_capturing_group/index.md index 9ead98cc4835314..bc37b914e2c62e7 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/named_capturing_group/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/named_capturing_group/index.md @@ -40,7 +40,7 @@ Named capturing groups will all be present in the result. If a named capturing g /(?<ab>ab)|(?<cd>cd)/.exec("cd").groups; // [Object: null prototype] { ab: undefined, cd: 'cd' } ``` -You can get the start and end indices of each named capturing group in the input string by using the [`d`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/hasIndices) flag. In addition to accessing them on the `indices` property on the array returned by `exec()`, you can also access them by their names through `indices.groups`. +You can get the start and end indices of each named capturing group in the input string by using the [`d`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/hasIndices) flag. In addition to accessing them on the `indices` property on the array returned by `exec()`, you can also access them by their names on `indices.groups`. 
Compared to unnamed capturing groups, named capturing groups have the following advantages: diff --git a/files/en-us/web/javascript/reference/regular_expressions/quantifier/index.md b/files/en-us/web/javascript/reference/regular_expressions/quantifier/index.md index 65662cf516702f6..67b29d563b0106a 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/quantifier/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/quantifier/index.md @@ -51,8 +51,6 @@ A quantifier is placed after an [atom](/en-US/docs/Web/JavaScript/Reference/Regu | `{min,}` | `min` | Infinity | | `{min,max}` | `min` | `max` | -The `?`, `{count}`, and `{min,max}` syntaxes all match for finite times, meaning they are equivalent to enumerating all possibilities in a [disjunction](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Disjunction). However, using quantifiers makes the pattern shorter and more readable. - For the `{count}`, `{min,}`, and `{min,max}` syntaxes, there cannot be white spaces around the numbers — otherwise, it becomes a literal pattern. ```js example-bad @@ -61,7 +59,7 @@ re.test("aa"); // false re.test("a{1, 3}"); // true ``` -This behavior is fixed in the [`u`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, where braces cannot appear literally without [escaping](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape). The ability to use `{` and `}` literally without escaping is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +This behavior is fixed in the [`u`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, where braces cannot appear literally without [escaping](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape). 
The ability to use `{` and `}` literally without escaping is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ```js /a{1, 3}/u; // SyntaxError: Invalid regular expression: Incomplete quantifier @@ -107,7 +105,7 @@ Quantifiers apply to a single atom. If you want to quantify a longer pattern or /^*/; // SyntaxError: Invalid regular expression: nothing to repeat ``` -In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, [lookahead assertions](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion) can be quantified. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp) and you should not rely on it. +In non-[unicode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) mode, [lookahead assertions](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion) can be quantified. This is a [deprecated syntax for web compatibility](/en-US/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it. ```js /(?=a)?b/.test("b"); // true; the lookahead is matched 0 time diff --git a/files/en-us/web/javascript/reference/regular_expressions/unicode_character_class_escape/index.md b/files/en-us/web/javascript/reference/regular_expressions/unicode_character_class_escape/index.md index eb52b40250d807c..9234177129f406f 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/unicode_character_class_escape/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/unicode_character_class_escape/index.md @@ -30,11 +30,11 @@ A **unicode character class escape** is a kind of [character class escape](/en-U `\p` and `\P` are only supported in [unicode mode](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode). 
In non-unicode mode, they are [identity escapes](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape) for the `p` or `P` character. -Every Unicode character has a set of properties that describe it. For example, the character [`a`](https://util.unicode.org/UnicodeJsps/character.jsp?a=0061) has the `General_Category` property with value `Lowercase_Letter`, and the `Script` property with value `Latn`. The `\p` and `\P` escape sequences allow you to match a character based on its properties. +Every Unicode character has a set of properties that describe it. For example, the character [`a`](https://util.unicode.org/UnicodeJsps/character.jsp?a=0061) has the `General_Category` property with value `Lowercase_Letter`, and the `Script` property with value `Latn`. The `\p` and `\P` escape sequences allow you to match a character based on its properties. For example, `a` can be matched by `\p{Lowercase_Letter}` (the `General_Category` property name is optional) as well as `\p{Script=Latn}`. To compose multiple properties, see [pattern subtraction and intersection](/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion#pattern_subtraction_and_intersection). - + ## Examples diff --git a/files/en-us/web/javascript/reference/regular_expressions/wildcard/index.md b/files/en-us/web/javascript/reference/regular_expressions/wildcard/index.md index 0fd51399a4dd0a1..da7070356ae8002 100644 --- a/files/en-us/web/javascript/reference/regular_expressions/wildcard/index.md +++ b/files/en-us/web/javascript/reference/regular_expressions/wildcard/index.md @@ -15,7 +15,7 @@ A **wildcard** matches all characters except line terminators. It also matches l ## Description -`.` matches any character except [line terminators](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#line_terminators). If the [`s`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/dotAll) flag is set, it also matches line terminators. 
+`.` matches any character except [line terminators](/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#line_terminators). If the [`s`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/dotAll) flag is set, `.` also matches line terminators. The exact character set matched by `.` depends on whether the [`u`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) flag is set. If the `u` flag is set, `.` matches any Unicode codepoint; otherwise, it matches any UTF-16 code unit. For example: