GH-101362: Omit path anchor from pathlib.PurePath()._parts #102476

Merged

@barneygale barneygale commented Mar 6, 2023

Improve performance of path construction by skipping the addition of the path anchor (drive + root) to the internal _parts list. Rename this attribute to _tail for clarity.

Also:

  • Simplify the implementations of name, parent, and parents a little.
  • Optimize _make_child_relpath() by building the new string path with naive string concatenation rather than _format_parsed_parts().
  • Optimize __hash__() and __eq__() by referring to the unsplit case-normalized path.
  • Optimize __lt__() etc. by referring to the naively split case-normalized path (i.e. split on sep without consideration of the anchor).
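
The comparison optimizations in the list above should leave observable behaviour unchanged. As a rough sketch using only the public API (the example paths are made up for illustration), Windows-flavoured paths still compare, hash, and sort case-insensitively, while POSIX-flavoured paths stay case-sensitive:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Windows-flavoured paths are compared on their case-normalized form.
a = PureWindowsPath("C:/Foo/Bar.py")
b = PureWindowsPath("c:/foo/bar.PY")
windows_equal = a == b
windows_same_hash = hash(a) == hash(b)

# POSIX-flavoured paths are compared case-sensitively.
posix_equal = PurePosixPath("/Foo") == PurePosixPath("/foo")

# Ordering (__lt__ and friends) also uses the case-normalized form,
# so "c:/A" sorts before "C:/b" despite the differing drive-letter case.
ordered = sorted(
    [PureWindowsPath("C:/b"), PureWindowsPath("c:/A"), PureWindowsPath("C:/c")]
)

print(windows_equal, windows_same_hash, posix_equal)
print([p.name for p in ordered])
```

This only exercises the public behaviour; it says nothing about how the private attributes are laid out in any given Python version.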

The public parts tuple is unaffected.
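
To illustrate the distinction, here is a minimal sketch built on the public API only. `tail_of` is a hypothetical helper, not part of pathlib; it assumes, based on the description above, that the tail is simply the public `parts` with the anchor (drive + root) left out.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def tail_of(path: PurePath) -> tuple[str, ...]:
    """Hypothetical helper: the public parts minus the anchor, if any."""
    parts = path.parts
    # When a path has an anchor, it is always the first element of `parts`.
    return parts[1:] if path.anchor else parts


p = PureWindowsPath("C:/foo/bar.py")
print(repr(p.drive), repr(p.root), repr(p.anchor))
print(p.parts)      # public tuple, anchor included
print(tail_of(p))   # anchor omitted

# Relative paths have no anchor, so nothing is dropped.
print(tail_of(PureWindowsPath("foo/bar")))
print(tail_of(PurePosixPath("/usr/bin")))
```

Code outside pathlib should keep using `parts`, `drive`, `root`, and `anchor` rather than the private attribute, whose name and contents can change between releases.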

$ ./python -m timeit -s 'from pathlib import PureWindowsPath as P' 'P("C:/foo/bar.py")'
100000 loops, best of 5: 1.98 usec per loop  # before
200000 loops, best of 5: 1.75 usec per loop  # after
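
For anyone wanting to repeat the measurement from inside Python instead of the shell, the following is a rough equivalent using the standard `timeit` module. The loop count is an arbitrary choice for this sketch, and absolute numbers will differ by machine and interpreter build.

```python
import timeit
from pathlib import PureWindowsPath

number = 100_000  # loops per repetition; arbitrary choice for this sketch

# Time path construction only, mirroring the command-line invocation above.
timings = timeit.repeat(
    stmt='P("C:/foo/bar.py")',
    globals={"P": PureWindowsPath},
    repeat=5,
    number=number,
)

# Report the best repetition, as `python -m timeit` does.
best_usec = min(timings) / number * 1e6
print(f"{number} loops, best of 5: {best_usec:.2f} usec per loop")
```

Run it once on a build without the change and once on a build with it to get a before/after pair.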

Improve performance of path construction by skipping the addition of the
path anchor (`drive + root`) to the internal `_parts` list. This change
allows us to simplify the implementations of `joinpath()`, `name`,
`parent`, and `parents` a little. The public `parts` tuple is unaffected.

barneygale commented Mar 6, 2023

I'm not sure about this change - I don't like how it makes _parts and parts have different contents. Too subtle for future maintainers to pick up on, perhaps?

edit: addressed by renaming _parts to _tail.

@barneygale barneygale added performance Performance or resource usage topic-pathlib labels Mar 6, 2023

barneygale commented Mar 12, 2023

This change is necessary in order to efficiently fix #65238, which is required before we can return unnormalized paths from __fspath__() in #102783.


barneygale commented Apr 3, 2023

Updated timings:

$ ./python -m timeit -s 'from pathlib import PureWindowsPath as P' 'str(P("C:/foo/bar.py"))'
100000 loops, best of 5: 3.92 usec per loop  # before
100000 loops, best of 5: 3.57 usec per loop  # after

Roughly a 9% speedup (3.92 → 3.57 usec per loop).

@barneygale barneygale merged commit 2c673d5 into python:main Apr 9, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023
…ythonGH-102476)

Improve performance of path construction by skipping the addition of the path anchor (`drive + root`) to the internal `_parts` list. Rename this attribute to `_tail` for clarity.
anjakefala added a commit to saulpw/visidata that referenced this pull request Jun 27, 2023
`pathlib.Path._parts` has been removed in Python 3.12:
python/cpython#102476

Switch to pathlib.Path.parts which is a tuple.

Closes #1934
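
For downstream code hit by the same removal, the migration described in this commit message might look roughly like the sketch below. The example path is made up and this is not the actual visidata change; it only shows replacing the private list with the public tuple.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/tmp/data/sheet.csv")  # made-up example path

# Before: `path._parts` (a private list; no longer available in Python 3.12).
# After: the public `parts` attribute, which is a tuple and includes the anchor.
components = path.parts

# If the old code relied on list behaviour (e.g. mutation), copy into a list.
mutable_components = list(path.parts)
mutable_components.append("backup")

print(components)
print(mutable_components)
```

Because `parts` is a tuple, any in-place modification the old code performed on `_parts` has to happen on a copy, as above.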