plot.scatter(hue_style="discrete") does nothing #7907

Closed
4 tasks done
mgunyho opened this issue Jun 12, 2023 · 4 comments · Fixed by #7925

Comments

@mgunyho
Contributor

mgunyho commented Jun 12, 2023

What happened?

I was trying to do a scatterplot of my data with one dimension determining the color. The dimension has only a few values so I used hue_style="discrete" to have a different color for each value. However, the resulting scatterplot has a continuous colorbar, which is the same as when I pass hue_style="continuous":

image

What did you expect to happen?

The colorbar should have discrete colors. I was also expecting the colors to come from the default matplotlib color cycle (C0, C1, etc.) when there are fewer than 10 items, like this:

image

That said, the examples in the documentation also show the discrete case using viridis.

What I was really expecting is a plot like one would get by passing add_colorbar=False, add_legend=True:

image

But that may be a bit too automagical.

Minimal Complete Verifiable Example

import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

x = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[
        ("idx", np.linspace(0, 1, 10)),
        ("color", [1, 2, 3]),
    ]
)
y = x + np.random.default_rng().random(x.shape)

ds = xr.Dataset({
    "x": x,
    "y": y,
})

# the output is the same regardless of hue_style="discrete" or "continuous" or just leaving it out
ds.plot.scatter(x="x", y="y", hue="color", hue_style="discrete", ax=plt.figure().gca())

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

This is the code for the "expected" plot:

from matplotlib.colors import ListedColormap

ds.plot.scatter(
    x="x",
    y="y",
    hue="color",
    hue_style="discrete",
    ax=plt.figure().gca(),

    # these lines added in addition to the MVCE
    cmap=ListedColormap(["C0", "C1", "C2"]),
    vmin=0.5, vmax=3.5,
    cbar_kwargs=dict(ticks=ds.color.data),
)
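
The extra keywords in the workaround above can be derived from the hue values instead of being hard-coded. The following is a minimal sketch, not part of xarray: `discrete_hue_kwargs` is a hypothetical helper name, and it assumes the hue values are numeric, evenly spaced, and have between 2 and 10 distinct levels.

```python
import numpy as np
from matplotlib.colors import ListedColormap


def discrete_hue_kwargs(values):
    """Return plot kwargs giving each evenly spaced numeric level its own color.

    Hypothetical helper, not part of xarray. Assumes 2 to 10 distinct,
    evenly spaced values (for example the integers 1, 2, 3).
    """
    levels = np.unique(values)
    if not 2 <= levels.size <= 10:
        raise ValueError("expected between 2 and 10 distinct levels")
    steps = np.diff(levels)
    if not np.allclose(steps, steps[0]):
        raise ValueError("levels must be evenly spaced")
    half = steps[0] / 2
    return dict(
        # One default-cycle color (C0, C1, ...) per level
        cmap=ListedColormap([f"C{i}" for i in range(levels.size)]),
        # Pad the limits by half a step so each level sits mid-band
        vmin=float(levels[0] - half),
        vmax=float(levels[-1] + half),
        # Put one colorbar tick on each level
        cbar_kwargs=dict(ticks=levels),
    )


kwargs = discrete_hue_kwargs([1, 2, 3])
print(kwargs["vmin"], kwargs["vmax"], kwargs["cmap"].N)  # 0.5 3.5 3
```

With the dataset from the MVCE, the call would then read `ds.plot.scatter(x="x", y="y", hue="color", **discrete_hue_kwargs(ds.color.data))`, which passes the same `cmap`, `vmin`, `vmax` and `cbar_kwargs` as the hand-written version.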

Environment

INSTALLED VERSIONS

commit: None
python: 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-1059-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.1.0
pandas: 1.4.3
numpy: 1.23.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: None
mypy: None
IPython: 8.12.2
sphinx: None

I also tried this on main at 3459e6f, the behavior is the same.

mgunyho added the bug and needs triage labels on Jun 12, 2023
@mgunyho
Contributor Author

mgunyho commented Jun 12, 2023

Possibly related: #7268

dcherian added the topic-plotting label and removed the needs triage label on Jun 14, 2023
@Illviljan
Contributor

Illviljan commented Jun 14, 2023

dataset.plot.scatter doesn't use hue_style anymore.
dataarray.plot.scatter has never had hue_style.

I remember there was handling for hue_style at some point, but it seems I lost it somewhere in the refactors to handle both categoricals and numerics the same way (#4909, #6778).
I suppose we could emit a warning for the dataset version for a while, since it was used a while ago.
It's unfortunate that the dataarray version's docstring suggests it's possible to use hue_style; a copy/paste error, I guess.

The idea now is to just choose if you want a legend or a colorbar. You'll get something useful either way. You'll even get hue and sizes in 1 legend if you want that, something that was not previously possible.

Numbers are now treated as continuous, so if you want them to act as categorical, convert them to strings:

ds["color"] = ds["color"].astype(str)
ds.plot.scatter(x="x", y="y", hue="color", ax=plt.figure().gca())

image
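
If overwriting `ds["color"]` in place is undesirable, the same conversion can be done on a copy. This is a minimal sketch assuming the dataset from the MVCE; `ds_cat` is an illustrative name, and `assign_coords` returns a new dataset while leaving the original untouched.

```python
import numpy as np
import xarray as xr

# Rebuild the MVCE dataset (seeded here only to make the sketch repeatable)
rng = np.random.default_rng(0)
x = xr.DataArray(
    rng.random((10, 3)),
    coords=[("idx", np.linspace(0, 1, 10)), ("color", [1, 2, 3])],
)
ds = xr.Dataset({"x": x, "y": x + rng.random(x.shape)})

# Build string labels from the numeric coordinate values on a new dataset
ds_cat = ds.assign_coords(color=ds["color"].values.astype(str))

print(ds["color"].values.tolist())      # original stays numeric
print(ds_cat["color"].values.tolist())  # copy holds string labels
```

Plotting `ds_cat` with `hue="color"` should then take the categorical path described above, while `ds` keeps its numeric coordinate for other uses.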

Or, if you just want a legend, pass add_legend=True and turn off the colorbar:

ds.plot.scatter(
    x="x",
    y="y",
    hue="color",
    ax=plt.figure().gca(),
    add_legend=True,
    add_colorbar=False,
)

image

xarray switches between two plotting colormaps, sequential or divergent. If you want a different colormap, you can use xr.set_options(cmap_sequential="cool"). The relevant options are documented as:

cmap_divergent : str or matplotlib.colors.Colormap, default: "RdBu_r"
    Colormap to use for divergent data plots. If string, must be
    matplotlib built-in colormap. Can also be a Colormap object
    (e.g. mpl.cm.magma)
cmap_sequential : str or matplotlib.colors.Colormap, default: "viridis"
    Colormap to use for nondivergent data plots. If string, must be
    matplotlib built-in colormap. Can also be a Colormap object
    (e.g. mpl.cm.magma)

ds["color"] = ds["color"].astype(str)
xr.set_options(cmap_sequential="cool")
ds.plot.scatter(x="x", y="y", hue="color", ax=plt.figure().gca())

image
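
Calling `xr.set_options(...)` as a plain function changes the option for the rest of the session. To limit the change to a few plots, `set_options` can also be used as a context manager. A minimal sketch follows; the plotting call is left as a comment and assumes the dataset from the MVCE.

```python
import xarray as xr

previous = xr.get_options()["cmap_sequential"]

# Inside the block, plots of nondivergent data default to "cool"
with xr.set_options(cmap_sequential="cool"):
    inside = xr.get_options()["cmap_sequential"]
    # e.g. ds.plot.scatter(x="x", y="y", hue="color") would go here

# On exit the previous setting is restored
after = xr.get_options()["cmap_sequential"]
print(inside, after == previous)  # cool True
```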

@mgunyho
Contributor Author

mgunyho commented Jun 18, 2023

Thanks for the reply! So are you saying that the hue_style keyword should really be removed entirely, and this is just a mistake in the docs? That would make sense I guess, and my use case is really covered by add_colorbar=False, add_legend=True, even if that's quite a bit of typing. Maybe it feels a bit magical or inconsistent that string-valued coords create a discrete colorbar.

It was maybe a bit surprising to me that hue created a colorbar instead of legend because I was used to the line plot behavior. But it's understandable, because (dataset) scatter plots can have as many values for the hue as there are points.

I can try to submit a patch to remove (or deprecate for dataset) the hue_style option if you think it makes sense.

@mgunyho
Contributor Author

mgunyho commented Jun 18, 2023

Oh, I see you started something along these lines in #7925
