Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-89727: Improve os.walk complexity and speed #100671

Merged

Conversation

zmievsa
Copy link
Contributor

@zmievsa zmievsa commented Jan 1, 2023

Use isinstance for checking the top of the stack in os.walk to simplify it and speed it up.

A small benchmark below. "." is this repository.
image

@AlexWaygood AlexWaygood added the performance Performance or resource usage label Jan 2, 2023
@zmievsa
Copy link
Contributor Author

zmievsa commented Jan 2, 2023

@JelleZijlstra I can't assign a reviewer but you merged the prior PR related to os.walk so I decided that it makes sense to ping you here. I hope that's ok :))

@JelleZijlstra JelleZijlstra self-requested a review January 2, 2023 03:40
@JelleZijlstra
Copy link
Member

Do you have a benchmark that shows this actually makes it faster? We lose a tuple allocation but gain a function call, so it's not completely obvious this should be faster.

…edHN.rst

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
@zmievsa
Copy link
Contributor Author

zmievsa commented Jan 2, 2023

I made this MR because my initial benchmarking with pathlib.Path.walk and os.walk was quite obvious. However, after timing it a few more times I have gotten inconsistent results.

image

Seems like the isinstance version is just a tiny bit (insignificantly) faster but I would still argue that it makes more sense than the original because it is slightly simpler.

But what do you think?

@JelleZijlstra
Copy link
Member

Thanks, seems like it's about a wash but since the new code is simpler and we never released the old version, seems fine to change this.

@JelleZijlstra JelleZijlstra merged commit 73097d9 into python:main Jan 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants