Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some implementations don't have same function signatures as base class #1100

Open
leoleoasd opened this issue Nov 7, 2022 · 6 comments
Open

Comments

@leoleoasd
Copy link
Contributor

leoleoasd commented Nov 7, 2022

Hi, I noticed some implementations change the default value of some parameters, for example: AbstractFilesystem have detail=True in ls, however LocalFileSystem have detail=False. (https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L301, https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/local.py#L56)
This is inconsistent, for example, people may expect all filesystems to behave the same in ls operations. Also, as listdir just calls ls, if implementations changed the signature of ls, calling it with listdir will use the default value specified in AbstractFileSystem since it's defined there and implementations don't rewrite that. This behavior inconsistency already caused issues (for example, Lightning-AI/pytorch-lightning#3805).

So I wrote a test to find out if there is any more inconsistency, and I found a lot. (https://github.com/leoleoasd/filesystem_spec/blob/master/fsspec/implementations/tests/test_common.py#L65, https://github.com/leoleoasd/filesystem_spec/actions/runs/3407906000/jobs/5667983849)

Is this a bug, should we fix it? The fix is definitely a breaking change and may cause some code to stop working. Maybe we should, first, find what implementation has different signatures, and warn users that use its default value about the change?

@martindurant
Copy link
Member

I agree that the implementations should have the same defaults as each other, and that this default should be defined in the base class. There might be occasional cases where different defaults are appropriate for some given backend, but that should be rare and need justification.

I expect much of our test suite is explicit about the optional arguments like detail=, and this is why we don't see any problem. Likewise, users that access multiple backends are probably explicit.

So yes, we should fix it, but, as you say, this is a tricky thing to do without breaking downstream code. Listing the known instances in the description of this issue would be a great start.

@leoleoasd
Copy link
Contributor Author

I'll make a list tomorrow(it's 12 pm in China), but if I remember it correctly, ls in multiple implementation have this problem. (Maybe local filesystem changed the default value first and other implementations started with copying code from local fs?)

@leoleoasd
Copy link
Contributor Author

Methods that have a different default value:

test_signature[file-get_file] - AssertionError: (self, rpath, lpath, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, outfile=None, **kwargs) != (self, path1, path2, callback=None, **kwargs)
test_signature[file-put_file] - AssertionError: (self, lpath, rpath, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, **kwargs) != (self, path1, path2, callback=None, **kwargs)
test_signature[file-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, **kwargs)
test_signature[memory-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, **kwargs)
test_signature[zip-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, **kwargs)
test_signature[tar-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, **kwargs)
test_signature[webhdfs-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False)
test_signature[github-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, sha=None, _sha=None, **kwargs)
test_signature[dbfs-makedirs] - AssertionError: (self, path, exist_ok=False) != (self, path, exist_ok=True)
test_signature[hdfs-ls] - AssertionError: (self, path, detail=True, **kwargs) != (self, path, detail=False, **kwargs)

Method that missed some parameters in the base class:

test_signature[webhdfs-mkdir] - AssertionError: (self, path, create_parents=True, **kwargs) != (self, path, **kwargs)
test_signature[ftp-mv] - AssertionError: (self, path1, path2, recursive=False, maxdepth=None, **kwargs) != (self, path1, path2, **kwargs)
test_signature[hdfs-mv] - AssertionError: (self, path1, path2, recursive=False, maxdepth=None, **kwargs) != (self, path1, path2, **kwargs)
test_signature[reference-get_file] - AssertionError: (self, rpath, lpath, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, outfile=None, **kwargs) != (self, rpath, lpath, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, **kwargs)
test_signature[ftp-get_file] - AssertionError: (self, rpath, lpath, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, outfile=None, **kwargs) != (self, rpath, lpath, **kwargs)
test_signature[reference-get] - AssertionError: (self, rpath, lpath, recursive=False, callback=<fsspec.callbacks.NoOpCallback object at 0x7fd0a8033eb0>, **kwargs) != (self, rpath, lpath, recursive=False, **kwargs)
test_signature[reference-open] - AssertionError: (self, path, mode='rb', block_size=None, cache_options=None, compression=None, **kwargs) != (self, path, mode='rb', block_size=None, cache_options=None, **kwargs)
test_signature[webhdfs-rm] - AssertionError: (self, path, recursive=False, maxdepth=None) != (self, path, recursive=False, **kwargs)
test_signature[webhdfs-mv] - AssertionError: (self, path1, path2, recursive=False, maxdepth=None, **kwargs) != (self, path1, path2, **kwargs)

@leoleoasd
Copy link
Contributor Author

I think we should always keep the signature unchanged. Other implementations may add parameters, but they should not have fewer parameters or change their default value. If something isn't supported in some FS, the implementation should warn user that give value other than the default value.

@martindurant
Copy link
Member

I agree; but it is plausible that at least some of those instances have correct signatures and defaults, but obfuscated in the kwargs. You are right that it would be better explicitly consistent.

Your code looks like a useful addition and you might consider adding it in a test marked xfail (wouldn't mind seeing how it looks). See also #651

@ianthomas23
Copy link
Collaborator

Just noting that this is related to issue #899.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants