From 817b3f3fd392f9ff28cfc5c668df52d773388de2 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Mon, 31 Aug 2020 21:54:15 -0300 Subject: [PATCH 01/14] Doc: Fix the array.fromfile method doc The check about the f argument type was removed in this commit: https://github.com/python/cpython/commit/2c94aa567e525c82041ad68a3174d8c3acbf37e2 Thanks to Pedro Arthur Duarte (pedroarthur.jedi at gmail.com) for the help with this bug. --- Doc/library/array.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/Doc/library/array.rst b/Doc/library/array.rst index ad622627724217..4a4b657032bca1 100644 --- a/Doc/library/array.rst +++ b/Doc/library/array.rst @@ -266,4 +266,3 @@ Examples:: `NumPy `_ The NumPy package defines another array type. - From 6b53456da97586af6c199c3c3026637e22a3f7a7 Mon Sep 17 00:00:00 2001 From: Victor Stinner Date: Sun, 2 Jul 2023 18:11:45 +0200 Subject: [PATCH 02/14] gh-106320: Remove private _PyInterpreterState functions (#106335) Remove private _PyThreadState and _PyInterpreterState C API functions: move them to the internal C API (pycore_pystate.h and pycore_interp.h). Don't export most of these functions anymore, but still export functions used by tests. Remove _PyThreadState_Prealloc() and _PyThreadState_Init() from the C API, but keep them in the stable API. From 1b4d15292142ea1d2b08e7c220a4c442d8848ef5 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Sat, 20 Jan 2024 20:42:10 +0000 Subject: [PATCH 03/14] [Doc] Divide RE Syntax into subsections --- Doc/library/re.rst | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 0a8c88b50cdeec..e71dc7e023e97b 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -83,6 +83,12 @@ characters, so ``last`` matches the string ``'last'``. (In the rest of this section, we'll write RE's in ``this special style``, usually without quotes, and strings to be matched ``'in single quotes'``.) + +.. 
_re-special-characters: + +Special characters +^^^^^^^^^^^^^^^^^^ + Some characters, like ``'|'`` or ``'('``, are special. Special characters either stand for classes of ordinary characters, or affect how the regular expressions around them are interpreted. @@ -93,7 +99,6 @@ directly nested. This avoids ambiguity with the non-greedy modifier suffix repetition to an inner repetition, parentheses may be used. For example, the expression ``(?:a{6})*`` matches any multiple of six ``'a'`` characters. - The special characters are: .. index:: single: . (dot); in regular expressions @@ -514,6 +519,9 @@ The special characters are: .. _re-special-sequences: +Special sequences +^^^^^^^^^^^^^^^^^ + The special sequences consist of ``'\'`` and a character from the list below. If the ordinary character is not an ASCII digit or an ASCII letter, then the resulting RE will match the second character. For example, ``\$`` matches the @@ -655,6 +663,12 @@ character ``'$'``. ``\Z`` Matches only at the end of the string. + +.. _re-escape-sequences: + +Escape sequences +^^^^^^^^^^^^^^^^^ + .. index:: single: \a; in regular expressions single: \b; in regular expressions From 6ad009c23e27911eb29369679c32f496acb19183 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Sat, 20 Jan 2024 20:57:36 +0000 Subject: [PATCH 04/14] [DOC] Add backticks around some RE-matched words --- Doc/library/re.rst | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index e71dc7e023e97b..17d08841000bfc 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -119,9 +119,11 @@ The special characters are: ``$`` Matches the end of the string or just before the newline at the end of the string, and in :const:`MULTILINE` mode also matches before a newline. 
``foo`` - matches both 'foo' and 'foobar', while the regular expression ``foo$`` matches + matches both ``'foo'`` and ``'foobar'``, while the regular expression ``foo$`` + matches only 'foo'. More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'`` - matches 'foo2' normally, but 'foo1' in :const:`MULTILINE` mode; searching for + matches 'foo2' normally, but ``'foo1'`` in :const:`MULTILINE` mode; searching + for a single ``$`` in ``'foo\n'`` will find two (empty) matches: one just before the newline, and one at the end of the string. @@ -129,21 +131,21 @@ The special characters are: ``*`` Causes the resulting RE to match 0 or more repetitions of the preceding RE, as - many repetitions as are possible. ``ab*`` will match 'a', 'ab', or 'a' followed - by any number of 'b's. + many repetitions as are possible. ``ab*`` will match ``'a'``, ``'ab'``, or + ``'a'`` followed by any number of ``'b'`` s. .. index:: single: + (plus); in regular expressions ``+`` Causes the resulting RE to match 1 or more repetitions of the preceding RE. - ``ab+`` will match 'a' followed by any non-zero number of 'b's; it will not - match just 'a'. + ``ab+`` will match ``'a'`` followed by any non-zero number of ``'b'`` s; it + will not match just ``'a'``. .. index:: single: ? (question mark); in regular expressions ``?`` Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. - ``ab?`` will match either 'a' or 'ab'. + ``ab?`` will match either ``'a'`` or ``'ab'``. .. 
index:: single: *?; in regular expressions From 94f765fa27296a61f2b53826055d515a25acc852 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Sat, 20 Jan 2024 21:02:40 +0000 Subject: [PATCH 05/14] [DOC] Make clearer what will be matched with a RE --- Doc/library/re.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 17d08841000bfc..bd9d6c90e6b301 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -590,7 +590,7 @@ character ``'$'``. (that is, any character in Unicode character category `[Nd]`__). This includes ``[0-9]``, and also many other digit characters. - Matches ``[0-9]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[0-9]`` if the :py:const:`~re.ASCII` flag is used. __ https://www.unicode.org/versions/Unicode15.0.0/ch04.pdf#G134153 @@ -604,7 +604,7 @@ character ``'$'``. Matches any character which is not a decimal digit. This is the opposite of ``\d``. - Matches ``[^0-9]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[^0-9]`` if the :py:const:`~re.ASCII` flag is used. .. index:: single: \s; in regular expressions @@ -615,7 +615,7 @@ character ``'$'``. non-breaking spaces mandated by typography rules in many languages). - Matches ``[ \t\n\r\f\v]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[ \t\n\r\f\v]`` if the :py:const:`~re.ASCII` flag is used. For 8-bit (bytes) patterns: Matches characters considered whitespace in the ASCII character set; @@ -627,7 +627,7 @@ character ``'$'``. Matches any character which is not a whitespace character. This is the opposite of ``\s``. - Matches ``[^ \t\n\r\f\v]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[^ \t\n\r\f\v]`` if the :py:const:`~re.ASCII` flag is used. .. index:: single: \w; in regular expressions @@ -638,7 +638,7 @@ character ``'$'``. (as defined by :py:meth:`str.isalnum`), as well as the underscore (``_``). 
- Matches ``[a-zA-Z0-9_]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[a-zA-Z0-9_]`` if the :py:const:`~re.ASCII` flag is used. For 8-bit (bytes) patterns: Matches characters considered alphanumeric in the ASCII character set; @@ -654,7 +654,7 @@ character ``'$'``. By default, matches non-underscore (``_``) characters for which :py:meth:`str.isalnum` returns ``False``. - Matches ``[^a-zA-Z0-9_]`` if the :py:const:`~re.ASCII` flag is used. + Matches only ``[^a-zA-Z0-9_]`` if the :py:const:`~re.ASCII` flag is used. If the :py:const:`~re.LOCALE` flag is used, matches characters which are neither alphanumeric in the current locale From 292672b315e7bf6ff05757097b68b69e071bf5d2 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Sat, 30 Dec 2023 16:00:03 +0000 Subject: [PATCH 06/14] Doc: minor change --- Doc/library/array.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/Doc/library/array.rst b/Doc/library/array.rst index 4a4b657032bca1..ad622627724217 100644 --- a/Doc/library/array.rst +++ b/Doc/library/array.rst @@ -266,3 +266,4 @@ Examples:: `NumPy `_ The NumPy package defines another array type. + From e2023e05bbda091a32074aaf8837bcb27a0de06a Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Mon, 5 Feb 2024 09:49:07 +0000 Subject: [PATCH 07/14] Doc: Put PatternError's attributes inside a table instead of regular paragraph --- Doc/library/re.rst | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index bd9d6c90e6b301..18903398fa528f 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -1170,25 +1170,26 @@ Exceptions error if a string contains no match for a pattern. The ``PatternError`` instance has the following additional attributes: - .. attribute:: msg + .. list-table:: + :header-rows: 1 + + * - Attribute + - Meaning - The unformatted error message. + * - .. attribute:: msg1 + - The unformatted error message. - .. attribute:: pattern + * - .. 
attribute:: pattern1 + - The regular expression pattern. - The regular expression pattern. + * - .. attribute:: pos1 + - The index in *pattern* where compilation failed (may be ``None``). - .. attribute:: pos + * - .. attribute:: lineno1 + - The line corresponding to *pos* (may be ``None``). - The index in *pattern* where compilation failed (may be ``None``). - - .. attribute:: lineno - - The line corresponding to *pos* (may be ``None``). - - .. attribute:: colno - - The column corresponding to *pos* (may be ``None``). + * - .. attribute:: colno1 + - The column corresponding to *pos* (may be ``None``). .. versionchanged:: 3.5 Added additional attributes. From cdaa9ae9c496e39af80713fad8b32fda57f03ff1 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Mon, 5 Feb 2024 09:51:37 +0000 Subject: [PATCH 08/14] Doc: Fix PatternError's attributes --- Doc/library/re.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 18903398fa528f..1dfe291cc60b88 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -1176,19 +1176,19 @@ Exceptions * - Attribute - Meaning - * - .. attribute:: msg1 + * - .. attribute:: msg - The unformatted error message. - * - .. attribute:: pattern1 + * - .. attribute:: pattern - The regular expression pattern. - * - .. attribute:: pos1 + * - .. attribute:: pos - The index in *pattern* where compilation failed (may be ``None``). - * - .. attribute:: lineno1 + * - .. attribute:: lineno - The line corresponding to *pos* (may be ``None``). - * - .. attribute:: colno1 + * - .. attribute:: colno - The column corresponding to *pos* (may be ``None``). .. 
versionchanged:: 3.5 From bb98dad681b4c3fe6bb9b6eeb8cc742c829710b8 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Mon, 5 Feb 2024 11:01:56 +0000 Subject: [PATCH 09/14] Doc: fix lint issue --- Doc/library/re.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 1dfe291cc60b88..7d8cf702461633 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -1172,7 +1172,7 @@ Exceptions .. list-table:: :header-rows: 1 - + * - Attribute - Meaning From 6b357afa34068c37f984b3fb9b8e0bd7b22360d6 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Wed, 25 Sep 2024 10:17:36 +0100 Subject: [PATCH 10/14] Doc: Add extension notation header --- Doc/library/re.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index b55d8df25acf95..583bba47a3fb32 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -322,6 +322,12 @@ The special characters are: special sequence, described below. To match the literals ``'('`` or ``')'``, use ``\(`` or ``\)``, or enclose them inside a character class: ``[(]``, ``[)]``. + +.. _re_extension_notation + +Extension notation +^^^^^^^^^^^^^^^^^^ + .. index:: single: (?; in regular expressions ``(?...)`` From 8f7356defb602811ea518c762ca59785404d3361 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Wed, 25 Sep 2024 10:44:50 +0100 Subject: [PATCH 11/14] Doc: Add some more backticks --- Doc/library/re.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 583bba47a3fb32..7ed4fb0796f33c 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -121,8 +121,8 @@ The special characters are: string, and in :const:`MULTILINE` mode also matches before a newline. ``foo`` matches both ``'foo'`` and ``'foobar'``, while the regular expression ``foo$`` matches - only 'foo'. 
More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'`` - matches 'foo2' normally, but ``'foo1'`` in :const:`MULTILINE` mode; searching + only ``'foo'``. More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'`` + matches ``'foo2'`` normally, but ``'foo1'`` in :const:`MULTILINE` mode; searching for a single ``$`` in ``'foo\n'`` will find two (empty) matches: one just before the newline, and one at the end of the string. From 9c17aa8fc3107331f96830fb8f62022c186f3c6d Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Thu, 26 Sep 2024 09:00:00 +0100 Subject: [PATCH 12/14] Doc: Fix malformed hyperlink target --- Doc/library/re.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 7ed4fb0796f33c..d49b6b9d06ea24 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -323,7 +323,7 @@ The special characters are: use ``\(`` or ``\)``, or enclose them inside a character class: ``[(]``, ``[)]``. -.. _re_extension_notation +.. _re_extension_notation: Extension notation ^^^^^^^^^^^^^^^^^^ From 17baf98fe5e8504be39322e3f2121fd220480711 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Thu, 3 Oct 2024 22:57:12 +0100 Subject: [PATCH 13/14] Docs: add a 'also' for $ special character and RE examples reference labels --- Doc/library/re.rst | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index d49b6b9d06ea24..f54c4303596c22 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -122,7 +122,7 @@ The special characters are: matches both ``'foo'`` and ``'foobar'``, while the regular expression ``foo$`` matches only ``'foo'``. 
More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'`` - matches ``'foo2'`` normally, but ``'foo1'`` in :const:`MULTILINE` mode; searching + matches ``'foo2'`` normally, but also ``'foo1'`` in :const:`MULTILINE` mode; searching for a single ``$`` in ``'foo\n'`` will find two (empty) matches: one just before the newline, and one at the end of the string. @@ -1601,6 +1601,8 @@ Regular Expression Examples --------------------------- +.. _checking-for-a-pair: + Checking for a Pair ^^^^^^^^^^^^^^^^^^^ @@ -1655,6 +1657,8 @@ To find out what card the pair consists of, one could use the 'a' +.. _simulating-scanf: + Simulating scanf() ^^^^^^^^^^^^^^^^^^ @@ -1742,6 +1746,8 @@ beginning with ``'^'`` will match at the beginning of each line. :: +.. _making-a-phonebook: + Making a Phonebook ^^^^^^^^^^^^^^^^^^ @@ -1803,6 +1809,8 @@ house number from the street name: ['Heather', 'Albrecht', '548.326.4584', '919', 'Park Place']] +.. _text-munging: + Text Munging ^^^^^^^^^^^^ @@ -1823,6 +1831,8 @@ in each word of a sentence except for the first and last characters:: 'Pofsroser Aodlambelk, plasee reoprt yuor asnebces potlmrpy.' +.. _finding-all-adverbs: + Finding all Adverbs ^^^^^^^^^^^^^^^^^^^ @@ -1836,6 +1846,8 @@ the following manner:: ['carefully', 'quickly'] +.. _finding-all-adverbs-and-their-positions: + Finding all Adverbs and their Positions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -1852,6 +1864,8 @@ to find all of the adverbs *and their positions* in some text, they would use 40-47: quickly +.. _raw-string-notation: + Raw String Notation ^^^^^^^^^^^^^^^^^^^ @@ -1876,6 +1890,8 @@ functionally identical:: +.. 
_writing-a-tokenizer: + Writing a Tokenizer ^^^^^^^^^^^^^^^^^^^ From 4e12f7cea1bd3a06cb169db8297a1c6840e88d86 Mon Sep 17 00:00:00 2001 From: Adorilson Bezerra Date: Thu, 3 Oct 2024 23:14:35 +0100 Subject: [PATCH 14/14] Docs: add some RE raw string notation references --- Doc/library/re.rst | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index f54c4303596c22..e8e546cef8938f 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -33,7 +33,8 @@ usage of the backslash in string literals now generate a :exc:`SyntaxWarning` and in the future this will become a :exc:`SyntaxError`. This behaviour will happen even if it is a valid escape sequence for a regular expression. -The solution is to use Python's raw string notation for regular expression +The solution is to use Python's :ref:`raw string notation <raw-string-notation>` +for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with ``'r'``. So ``r"\n"`` is a two-character string containing ``'\'`` and ``'n'``, while ``"\n"`` is a one-character string containing a @@ -231,7 +232,8 @@ The special characters are: ``'*'``, ``'?'``, and so forth), or signals a special sequence; special sequences are discussed below. - If you're not using a raw string to express the pattern, remember that Python + If you're not using a :ref:`raw string to express the + pattern<raw-string-notation>`, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn't recognized by Python's parser, the backslash and subsequent character are included in the resulting string. However, if Python would