Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid isfile calls on find as much as possible #676

Merged
merged 2 commits into from
Jun 21, 2021

Conversation

skshetry
Copy link
Contributor

@skshetry skshetry commented Jun 18, 2021

In find, if walk also includes the path if it's a file, it should already be included in the out.
So, if out is not empty, we can safely assume that it is either a directory or a file. So, we don't need
to make an extra isfile call.

Though, if the path was an empty directory or if the walk does not return the file, we'll still be making an isfile call, which is unfortunate but better than before, where we were making an additional call for each find.

This was noticed in DVC when trying to migrate to WebdavFS, which was making remote cache querying noticeably slower when traversing through the caches.

Please let me know if I'm missing something here, or anything else that needs to be considered here. :)

If `walk` is also returning the same file, it should already have
been included in the `out`. If `out` is not empty, we can
safely assume that it is either a directory or a file.

If it's empty, it could be that it is either an empty directory
or, the path is a file and was not included from the `walk` results.
This time we will make a `isfile` call. This is unfortunate but
better than before, where we were making an additional for each find.

This was noticed in DVC, which was making remote cache querying slower
when traversing through the caches (512 calls instead of 256 :( ).
@martindurant
Copy link
Member

It looks good and passes tests; let me think about your argument, whether is covers all cases.

@martindurant
Copy link
Member

Looking at my own code, I'm not totally certain what that condition is supposed to achieve. Could you please make sure that there's a test that does find on a path that happens to be a file, with both detail=True|False?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants