Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

computation between datasets loses geocoding information #909

Open
ghislainp opened this issue Sep 19, 2019 · 8 comments
Open

computation between datasets loses geocoding information #909

ghislainp opened this issue Sep 19, 2019 · 8 comments
Labels

Comments

@ghislainp
Copy link
Contributor

Describe the bug

I'm computing a ratio of two bands from the same image (OLCI sensor).
scene['new_band'] = scene['Oa05'] / scene['Oa01']
and then resample and save to geotif the result. The resulting image is correct, but not its geolocation.

Inspecting scene['new_band'] shows that the area attrs is not there. In fact, even the resolution is not copied, it is None. The Oa05 and Oa01 have both an area and it is the same (same reference)

To Reproduce

# Your code here
scene.load(['Oa05', 'Oa01'])
scene['new_band'] = scene['Oa05'] / scene['Oa01']

scene['new_band'].area

Expected behavior
I'm expecting scene['new_band'] to have the combined metadata of the two bands used to compute the ratio, including area and resolution.

Actual results

raise an AttributeError because scene['new_band'] has no area attribute.

Environment Info:

  • OS: linux
  • Satpy Version: 0.16.1 (had the same probleme with 0.15)
@djhoese
Copy link
Member

djhoese commented Sep 19, 2019

This is by design in xarray and is not controlled by us in Satpy. Xarray, by default, will drop attributes when doing any operation on a DataArray. If you print scene['new_band'].attrs it should be empty. You will have to manually copy the attributes that you want to the new DataArray object.

There is a satpy.dataset.combine_metadata function to assist with this. You should be able to do:

from satpy.dataset import combine_metadata
scene['new_band'] = scene['0a05'] / scene['0a01']
scene['new_band'].attrs = combine_metadata(scene['0a05'], scene['0a01'])
scene['new_band'].attrs['some_other_key'] = 'whatever_value_you_want'

Another thing is that you may want to create a temporary variable, assign the attributes, and then assign that variable to the Scene object to avoid the Scene adding its own attributes (thinking it knows best) before you have set your own:

new_band = scene['0a05'] / scene['0a01']
new_band.attrs = ...
scene['new_band'] = new_band

Note: You should access DataArray metadata through .attrs[key] instead of .key.

@ghislainp
Copy link
Contributor Author

Thank you for your very clear response.

I found this PR: pydata/xarray#2482 that may be relevant.
I tried: xr.set_options(keep_attrs=True) and it works in my case, without any manual copy of the attrs. Of course, you could not use this global option in satpy, but the PR proposes other options that seems more local.

There is certainly rough corners anyway in this automatic approach...

@djhoese
Copy link
Member

djhoese commented Sep 19, 2019

Exactly right, Satpy can't set that globally without making some major assumptions about who is using it and how. Glad it worked for you. I skimmed the PR and didn't see anything about more local options. It would be nice to have something that we could use in small chunks of Satpy.

@BENR0
Copy link
Collaborator

BENR0 commented Dec 3, 2019

I have come across this lately a lot too. I totally agree with @djhoese that there are no global decisions about which attributes should be kept can be made. But it would be nice if at least information like area would be kept since that is kind of vital for Satpy functionality like resampling and I guess a user would expect this kind of behavior. In any way @djhoese comment would probably make a nice addition to the documentation maybe under https://satpy.readthedocs.io/en/latest/quickstart.html#creating-new-datasets?

@djhoese
Copy link
Member

djhoese commented Dec 3, 2019

@BENR0 One thing that I want to work on is the geoxarray project (there is a repository with a pull request but no releases yet). This would be used by Satpy to standardize this logic for how and where geolocation/projection information is stored in an xarray object. If it were stored as a coordinate variable then it is less likely to get lost in various operations. But depending on how it is implemented (a CRS object stored as a coordinate, WKT stored as a coordinate, etc) you can annoyingly run in to cases where xarray says "yeah the coordinates are different so just drop them" and you have the same problem of the projection information disappearing.

@BENR0
Copy link
Collaborator

BENR0 commented Dec 4, 2019

Yes I have seen geoxarray I think that is a good idea an I know it is not that easy to find a good solution. What's your opinion to adding the information to the documentation so users can be aware of this and find a solution? I can add it and do a PR if your ok with it.

@djhoese
Copy link
Member

djhoese commented Dec 4, 2019

You mean add documentation to satpy about this? I am ok with that if you can think of a good place for it. If you are talking about geoxarray, then I'm not sure.

@BENR0
Copy link
Collaborator

BENR0 commented Dec 5, 2019

yes Satpy documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants