Improvements in regular expression doc #114357

Open
wants to merge 23 commits into
base: main


adorilson
Contributor

@adorilson adorilson commented Jan 20, 2024

adorilson and others added 6 commits January 20, 2024 20:23
The check about the f argument type was removed in this commit:
python@2c94aa5

Thanks to Pedro Arthur Duarte (pedroarthur.jedi at gmail.com) for the help with
this bug.
…#106335)

Remove private _PyThreadState and _PyInterpreterState C API
functions: move them to the internal C API (pycore_pystate.h and
pycore_interp.h). Don't export most of these functions anymore, but
still export functions used by tests.

Remove _PyThreadState_Prealloc() and _PyThreadState_Init() from the C
API, but keep it in the stable API.
@bedevere-app bedevere-app bot added awaiting review docs Documentation in the Doc dir skip news labels Jan 20, 2024
Member

@terryjreedy terryjreedy left a comment

This PR does 3 things.

  1. Add headers. I have thought of proposing the same. Please add one more at line 320, something like:
.. _re_extension_notation

Extension notation
^^^^^^^^^^^^^^^^^^

CHANGE

  2. Add double backticks, either new or extending single backticks. The existing text always put backticks on REs and sometimes on text matched. The PR makes that (nearly, 2 exceptions noted) always on matches. Defensible since this seems the majority of existing cases. CHANGE

  3. Add 'only' in several places. I am not sure these are needed, but I see existing similar uses.

@serhiy-storchaka I want to finish this RE doc change. Any additional comments from you?

Comment on lines 124 to 125
  only 'foo'. More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'``
- matches 'foo2' normally, but 'foo1' in :const:`MULTILINE` mode; searching for
+ matches 'foo2' normally, but ``'foo1'`` in :const:`MULTILINE` mode; searching
Member

To be consistent with other additions, 'foo' above and 'foo2' here should be backticked. But see review summary.

Contributor Author

Done.
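
For readers following this thread, the behavior described in the reviewed lines can be checked with a minimal sketch. It only uses the patterns and the sample string quoted above; the variable names are illustrative and not part of the PR.

```python
import re

text = 'foo1\nfoo2\n'

# Default mode: '$' matches at the end of the string and just before a
# newline that ends the string, so only the last line can satisfy 'foo.$'.
default_match = re.search(r'foo.$', text)

# MULTILINE mode: '$' also matches just before every newline, so the
# first line already satisfies the pattern.
multiline_match = re.search(r'foo.$', text, re.MULTILINE)

print(default_match.group())    # foo2
print(multiline_match.group())  # foo1

# 'foo$' matches 'foo' but finds nothing in 'foobar'.
foo_only = re.search(r'foo$', 'foo')
foobar = re.search(r'foo$', 'foobar')
print(foo_only is not None, foobar is None)  # True True
```

The quoting question under discussion (whether `'foo1'` and `'foo2'` get double backticks) does not affect this behavior; it concerns only how the matched strings are rendered in the docs.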

@bedevere-app

bedevere-app bot commented Feb 25, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@serhiy-storchaka
Member

I am not sure that these changes are needed.

  1. New headers and anchors. They are perhaps harmless, but there are some problems in the text (re: documentation claim that special characters lose their special meaning inside […] seems wrong #106482) which requires more serious rewriting, so some parts of the text may be moved and headers and anchors may change.
  2. I used double backquotes to highlight regular expressions. Single quotes are used for strings; they are not fragments of Python code, they are just strings in quotes. If we use the same style in both cases, it will be more difficult to distinguish REs from strings. Maybe you have a better solution?
  3. I have no opinion about "only"; I leave that decision to native English speakers.

@adorilson adorilson marked this pull request as draft September 25, 2024 09:45
@adorilson
Contributor Author

Hi, @terryjreedy. Thank you for your review and comments.

Items 1 and 2 are done.

Concern 3: the idea is to make the effect of re.ASCII more explicit.

Without re.ASCII, [0-9] is matched, but other digits can be matched too, for example:

>>> import re
>>> re.findall(r'\d+', '567abc123٠١٢٣٤٥٦٧٨٩')
['567', '123٠١٢٣٤٥٦٧٨٩']

However, with re.ASCII, only [0-9] is matched, for example:

>>> import re
>>> re.findall(r'\d+', '567abc123٠١٢٣٤٥٦٧٨٩', re.ASCII)
['567', '123']
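
A further sketch of the same point, reusing the sample string from the sessions above: with `re.ASCII`, `\d` gives the same result as an explicit `[0-9]` class, and the extra characters matched without the flag are the ones Unicode classifies as decimal digits (category `Nd`). The variable names here are illustrative only.

```python
import re
import unicodedata

sample = '567abc123٠١٢٣٤٥٦٧٨٩'

# Default (Unicode) matching: \d accepts any Unicode decimal digit.
unicode_digits = re.findall(r'\d+', sample)

# With re.ASCII, \d is restricted to the ASCII digits 0-9 ...
ascii_digits = re.findall(r'\d+', sample, re.ASCII)

# ... which is the same result an explicit character class gives.
explicit_class = re.findall(r'[0-9]+', sample)

# The Arabic-Indic digits all belong to the Unicode category Nd
# (decimal number), which is why \d matches them by default.
categories = {unicodedata.category(ch) for ch in '٠١٢٣٤٥٦٧٨٩'}

print(unicode_digits)
print(ascii_digits == explicit_class)
print(categories)
```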

@adorilson
Contributor Author

  1. requires more serious rewriting

This can start by adding more inline examples, as is in progress for strings (#119445).

@adorilson adorilson marked this pull request as ready for review September 26, 2024 08:57
@adorilson
Contributor Author

I have made the requested changes; please review again

@bedevere-app

bedevere-app bot commented Sep 26, 2024

Thanks for making the requested changes!

@terryjreedy: please review the changes made to this pull request.
