Rename dims independently from coords? #3026

Closed
nedclimaterisk opened this issue Jun 17, 2019 · 11 comments · Fixed by #3045

@nedclimaterisk
Contributor

I have a dataset that looks like this:

<xarray.Dataset>
Dimensions:  (lat: 226, lon: 261, time: 7300)
Coordinates:
  * lat      (lat) float32 -32.0 -31.9 -31.8 -31.7 ... -9.700001 -9.6 -9.5
  * lon      (lon) float32 132.0 132.1 132.2 132.3 ... 157.7 157.8 157.9 158.0
  * time     (time) object 1980-01-01 15:00:00 ... 1999-12-31 15:00:00
Data variables:
    rnd24    (time, lat, lon) float32 ...

Problem description

I would like to be able to rename the dataset dimensions, without renaming the coordinates.

Expected Output

<xarray.Dataset>
Dimensions:  (y: 226, x: 261, time: 7300)
Coordinates:
  * lat      (y) float32 -32.0 -31.9 -31.8 -31.7 ... -9.700001 -9.6 -9.5
  * lon      (x) float32 132.0 132.1 132.2 132.3 ... 157.7 157.8 157.9 158.0
  * time     (time) object 1980-01-01 15:00:00 ... 1999-12-31 15:00:00
Data variables:
    rnd24    (time, y, x) float32 ...

As far as I can tell, there is no way to do this. I can rename the existing dims/coords to x/y, and then manually create new coordinates that are copies of x and y, which gets me to:

<xarray.Dataset>
Dimensions:  (time: 7300, x: 261, y: 226)
Coordinates:
  * y        (y) float32 -32.0 -31.9 -31.8 -31.7 ... -9.700001 -9.6 -9.5
  * x        (x) float32 132.0 132.1 132.2 132.3 ... 157.7 157.8 157.9 158.0
  * time     (time) object 1980-01-01 15:00:00 ... 1999-12-31 15:00:00
    lat      (y) float32 -32.0 -31.9 -31.8 -31.7 ... -9.700001 -9.6 -9.5
    lon      (x) float32 132.0 132.1 132.2 132.3 ... 157.7 157.8 157.9 158.0
Data variables:
    rnd24    (time, y, x) float32 ...

But it doesn't seem to be possible to re-assign the new coordinates as the indexes for the existing dims.

In this case, it may seem a bit redundant, because the coordinates are equal to the grid. But I'm trying to get this output to work with code that also deals with other datasets that have non-rectilinear grids.
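
The workaround described above can be sketched roughly as follows. This is only an illustration on a small synthetic dataset (the sizes and values are made up and are not the data from the issue), assuming `Dataset.rename` and `Dataset.assign_coords` behave as they do in current xarray releases:

```python
import numpy as np
import xarray as xr

# Small synthetic stand-in for the dataset in the issue (sizes are arbitrary).
ds = xr.Dataset(
    {"rnd24": (("time", "lat", "lon"), np.zeros((2, 3, 4), dtype="float32"))},
    coords={
        "lat": np.linspace(-32.0, -31.8, 3),
        "lon": np.linspace(132.0, 132.3, 4),
        "time": [0, 1],
    },
)

# Step 1: rename both the dimensions and their index coordinates to y/x.
renamed = ds.rename({"lat": "y", "lon": "x"})

# Step 2: add lat/lon back as plain coordinates that live on the y/x dims.
workaround = renamed.assign_coords(
    lat=("y", renamed["y"].values),
    lon=("x", renamed["x"].values),
)

print(workaround)
```

The result carries both `y`/`x` and `lat`/`lon` as coordinates, which matches the third dataset shown above rather than the expected output.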

@shoyer
Member

shoyer commented Jun 17, 2019

Yes, this would be really nice. We should have a separate rename_dims() method that only renames dimensions.

@nedclimaterisk
Contributor Author

Is there even any work-around at the moment? I could try a pull request, if I could figure it out.

@jukent
Contributor

jukent commented Jun 20, 2019

I am looking into this, but first I am making sure I understand the relationship between coordinates and dimensions. Could you give me an example of when you would use this functionality?

@jukent
Contributor

jukent commented Jun 20, 2019

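import xarray as xr

ds = xr.tutorial.open_dataset('air_temperature')
# Add x/y coordinates that duplicate lon/lat along the existing dims.
ds.coords['y'] = ('lat', ds.lat)
ds.coords['x'] = ('lon', ds.lon)
# Make x/y the dimensions, then drop them so only lat/lon remain as coordinates.
ds_swappeddims = ds.swap_dims({'lat': 'y', 'lon': 'x'})
ds_swappeddims.drop(['x', 'y'])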

Did you do something like this? Or is this what you are after, but you would like a cleaner solution?
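
For reference, here is a minimal sketch of the same `swap_dims` approach on a small synthetic dataset (the values are made up), with the result kept in a variable so the final dimensions can be inspected. It uses `drop_vars`, which newer xarray releases provide in place of the `drop` call above; that substitution is an assumption about the installed version:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial dataset (sizes are arbitrary).
ds = xr.Dataset(
    {"air": (("time", "lat", "lon"), np.zeros((2, 3, 4)))},
    coords={
        "lat": [75.0, 72.5, 70.0],
        "lon": [200.0, 202.5, 205.0, 207.5],
        "time": [0, 1],
    },
)

# Add y/x coordinates that duplicate lat/lon along the existing dims.
with_xy = ds.assign_coords(
    y=("lat", ds["lat"].values),
    x=("lon", ds["lon"].values),
)

# Swap the dims over to y/x; lat/lon now live on the new dims.
swapped = with_xy.swap_dims({"lat": "y", "lon": "x"})

# Drop the helper coordinates so only lat/lon remain (drop_vars: newer xarray).
result = swapped.drop_vars(["x", "y"])

print(result)
```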

@nedclimaterisk
Contributor Author

For example, you might have a rectilinear grid in a non-geodetic coordinate system, such as a rotated pole:

Dimensions:       (period: 7, x: 3, y: 3)
Coordinates:
    height        float64 2.0
    lon           (y, x) float64 137.6 138.1 138.6 137.6 ... 137.7 138.1 138.6
    lat           (y, x) float64 -21.71 -21.73 -21.74 ... -20.83 -20.85 -20.86
  * y             (y) float64 7.92 8.36 8.8
  * x             (x) float64 176.5 176.9 177.4
  * period        (period) object '1961-1980' '1981-2000' ... '2081-2100'
Data variables:
    loc           (period, y, x) float64 311.2 311.4 311.2 ... 316.0 316.0 315.6
    scale         (period, y, x) float64 1.231 1.223 1.192 ... 0.8923 0.8934
    shape         (period, y, x) float64 0.4358 0.4261 ... -0.001062 0.009387
    rotated_pole  |S1 b''

Obviously this case is a little more complicated, because x/y and lon/lat don't have a one-to-one correspondence. But I want to be able to write code that works with both cases, ideally without special casing stuff. Being able to have separately named dim and coordinate variables would help a lot.

However, now that I look at this example, I guess in my case I have both x/y and lat/lon in the coordinates anyway, so maybe I should just duplicate them, with different names.
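
To illustrate the goal of handling both layouts without special-casing, here is a hedged sketch. The helper name `spatial_mean` and both toy datasets are hypothetical; the only point is that code keyed on the `y`/`x` dimension names does not need to know whether `lat`/`lon` are 1-D or 2-D:

```python
import numpy as np
import xarray as xr


def spatial_mean(ds, name):
    # Hypothetical helper: relies only on the dimension names "y" and "x",
    # never on the shape of the lat/lon coordinates.
    return ds[name].mean(dim=["y", "x"])


values = np.arange(6.0).reshape(2, 3)

# Rectilinear case: 1-D lat/lon attached to the y/x dimensions.
rect = xr.Dataset(
    {"v": (("y", "x"), values)},
    coords={"lat": ("y", [-32.0, -31.9]), "lon": ("x", [132.0, 132.1, 132.2])},
)

# Curvilinear case (e.g. rotated pole): 2-D lat/lon over (y, x).
lon2d, lat2d = np.meshgrid([137.6, 138.1, 138.6], [-21.7, -20.8])
curv = xr.Dataset(
    {"v": (("y", "x"), values)},
    coords={"lat": (("y", "x"), lat2d), "lon": (("y", "x"), lon2d)},
)

# The same call works for both layouts.
print(float(spatial_mean(rect, "v")), float(spatial_mean(curv, "v")))
```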

@jukent
Contributor

jukent commented Jun 24, 2019

Should we close this issue?

@shoyer
Member

shoyer commented Jun 24, 2019

I think swap_dims solves many of these use-cases, but it would be better to have a rename_dims method that worked more directly.

@jukent
Contributor

jukent commented Jun 24, 2019

I will submit a pull request to add separate rename dims and rename coords functions in a day or two.

@jukent
Contributor

jukent commented Jun 25, 2019

Created pull request #3042

@jukent
Contributor

jukent commented Jul 1, 2019

This issue can be closed; it is addressed in pull request #3045.
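
In xarray releases that provide `Dataset.rename_dims`, the request from the top of this issue can be sketched as below. The dataset is a small synthetic stand-in (sizes and values are made up), and whether `lat`/`lon` keep an index after the rename may depend on the xarray version, so the sketch only looks at names and dimensions:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the dataset in the issue (sizes are arbitrary).
ds = xr.Dataset(
    {"rnd24": (("time", "lat", "lon"), np.zeros((2, 3, 4), dtype="float32"))},
    coords={
        "lat": np.linspace(-32.0, -31.8, 3),
        "lon": np.linspace(132.0, 132.3, 4),
        "time": [0, 1],
    },
)

# Rename only the dimensions; the coordinate variables keep their names.
renamed = ds.rename_dims({"lat": "y", "lon": "x"})

print(renamed)
```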

@dcherian
Contributor

dcherian commented Jul 1, 2019

Hi @jukent, if you edit the first comment of your pull request, you should be able to change it to "Closes #3026", which will automatically close this issue when that PR is merged.
