-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug when querying unnamed dataarray #5493
Open
TomNicholas
wants to merge
7
commits into
pydata:main
Choose a base branch
from
TomNicholas:query_unnamed
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 3 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
9fcfc3e
test querying unnamed da
TomNicholas a469bc7
use _from_temp_dataset if no name
TomNicholas ec59926
what's new
TomNicholas b89acff
removed two unneccessary intermediate variables
TomNicholas 851c405
removed shallow copy
TomNicholas 68a505c
reference values in unnamed dataarrays as 'self'
TomNicholas 3603ccd
replace self with name for named dataarrays
TomNicholas File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why we don't just use
_to_temp_dataset
/_from_temp_dataset
regardless of the name?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_to_dataset_whole
is just the same thing as_to_temp_dataset
except it accepts theshallow_copy
argument, which is was already being used here.Using
_to_dataset_whole
withname=None
will fail (so I can't just passself.name
because that will beNone
for an unnamed dataarray). Using_to_dataset_whole
without passing a name fails in the same way.Then trying to use
_from_temp_dataset
fails withso I think it will fail for named dataarrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the
shallow_copy=True
as this is a temporary dataset (and it's the usual pattern for delegating to theDataset
implementation). No need to use thename
parameter for_from_temp_dataset
, either, as it's being restored fromself
.Edit: actually, the failure for
_from_temp_dataset
fails because of thename
parameter to_to_dataset_whole
. In total, I'd suggest to use the same pattern as insel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't work for
query
becausequery
needs the name of the dataarray to be a variable on the temporary dataset, otherwise it can't evaluate queries involving the dataarray name (such asx="a > 5"
whereda.name = 'a'
). The tests will fail with this error if you try that:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically in order for query to work you have to provide a name to construct the temporary dataset with, which rules out using
_to_temp_dataset
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed. However, if I had to choose between
"self"
for unnamed and the name for namedDataArray
objects or always"self"
I'd choose the latter because it's easier to use (and the implementation would be easier)? Not sure if we should try to support both (i.e. allow referencing by name and using"self"
)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. What I just wrote supports both individually: You use
the name when it's named, and self when it isn't. But if it's named you
can't use self. I suppose you could support using self even when it's named
by examining the query and replacing parts of it before giving it to
ds.query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I added support for using
'self'
even when the dataarray has a name, by using.replace()
. It works, but makes the implementation more complicated again.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to slightly simplify this would be to always rename all dataarray names to
'self'
before querying.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break if there's a non-dimensional coordinate variable named
self
won't it? I'm not sure there's good solution though, perhaps...
:D