hue_order not affecting the order if hue data is numerical and continuous palette is chosen #2738

Open
wikiker opened this issue Feb 2, 2022 · 1 comment

wikiker commented Feb 2, 2022

When plotting with seaborn.histplot() with multiple="stack" (or multiple="fill"), if we specify both hue and hue_order and leave palette unset (or choose any continuous palette), then, provided that hue is numerical (but categorical nonetheless), hue_order does not affect the order of the hue levels.

I came across this bug when trying to order the boxes in a stacked histplot. It can be worked around easily (as long as the number of distinct hue levels is known) by passing palette=seaborn.color_palette(name, number), but I still think it should be fixed.

Example:

import seaborn
import matplotlib.pyplot as plt

data = seaborn.load_dataset("tips")
# Sum only the numeric columns; summing the categorical ones can raise an error on recent pandas.
summed = data.groupby(["sex", "day"]).sum(numeric_only=True)

# Distinct summed sizes in ascending and descending order.
order1 = sorted(set(summed["size"]))
order2 = sorted(set(summed["size"]), reverse=True)

# A: no hue_order; B: ascending hue_order; C: descending hue_order.
plt.figure()
A = seaborn.histplot(data=summed, x="sex", hue="size", multiple="stack")
plt.figure()
B = seaborn.histplot(data=summed, x="sex", hue="size", multiple="stack", hue_order=order1)
plt.figure()
C = seaborn.histplot(data=summed, x="sex", hue="size", multiple="stack", hue_order=order2)
# D: descending hue_order plus an explicit list of colors (the workaround).
plt.figure()
D = seaborn.histplot(data=summed, x="sex", hue="size", multiple="stack", hue_order=order2, palette=seaborn.color_palette("inferno", len(order2)))
plt.show()

My case:
screenshot

The screenshot shows that A, B, and C are identical while C and D differ. In my opinion, A, B, and C should be pairwise different, and C and D should be the same.

Seaborn version: 0.11.2, Matplotlib version 3.5.1.

Owner

mwaskom commented Feb 3, 2022

Is this documented as working anywhere?

provided that hue is numerical (but categorical nonetheless)

The problem with this logic is that seaborn can't know that you conceive of your data as categorical if you don't tell it the right way. The most straightforward way would be to encode the numbers with a string or categorical dtype.
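The string-encoding suggestion above could look roughly like the following. This is a minimal sketch, not code from the thread: the helper names (as_categorical_labels, descending_hue_order, plot_stacked) and the sample values are made up for illustration, and the data is passed as a plain dict rather than the tips dataset.

```python
def as_categorical_labels(values):
    """Turn numeric hue values into string labels so the hue variable reads as categorical."""
    return [str(v) for v in values]


def descending_hue_order(values):
    """Return the distinct values as string labels, ordered by descending numeric value."""
    return [str(v) for v in sorted(set(values), reverse=True)]


def plot_stacked(sex, size, hue_order):
    """Draw a stacked histogram with string-encoded hue levels (requires seaborn)."""
    # Imported here so the helpers above can be used without plotting libraries.
    import seaborn

    data = {"sex": sex, "size": as_categorical_labels(size)}
    return seaborn.histplot(data=data, x="sex", hue="size",
                            multiple="stack", hue_order=hue_order)


# Made-up stand-in values (hypothetical), not the tips dataset.
sex = ["Male", "Male", "Female", "Female"]
size = [30, 85, 18, 53]
order = descending_hue_order(size)
print(order)
```

Calling plot_stacked(sex, size, order) would then draw the plot with the requested ordering. With a DataFrame, the same encoding would be df["size"].astype(str); a pandas categorical dtype is the other option mentioned in the comment.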

Otherwise seaborn has to infer the type of mapping to use from the other arguments provided. The rules for doing so don't consider the value of hue_order, and I don't think they ever have. So this isn't a bug, in the sense that it is not working differently than intended.

That said, I think I agree that passing an explicit ordering is a reasonable heuristic for inferring that a categorical mapping should be used. And I think it seems reasonably straightforward since the logic is centralized in that function. But at the same time, changing this code could result in different behavior for a number of different functions. I'm not certain that it wouldn't cause any unexpected issues. One potential risk would be in relplot, where forcing the correct inference about the type of mapping to use can be a little tricky.
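To make the proposal concrete, here is a toy model of the inference described in this comment. It is a sketch under stated assumptions, not seaborn's actual code: the function name and the use_order_heuristic flag are hypothetical, the real rules take more inputs into account, and the assumption that an explicit list or dict palette implies a categorical mapping is drawn from the workaround in plot D of the report.

```python
def infer_hue_map_type_sketch(palette, var_type, hue_order=None,
                              use_order_heuristic=False):
    """Toy model of choosing a hue mapping type; NOT seaborn's actual implementation."""
    # Proposed heuristic from the thread: an explicit ordering implies categories.
    if use_order_heuristic and hue_order is not None:
        return "categorical"
    # Assumption: an explicit list or dict of colors gives one color per level.
    if isinstance(palette, (list, dict)):
        return "categorical"
    # Otherwise fall back on the inferred type of the hue variable.
    return var_type


order = [85, 53, 30, 18]
# Behavior as reported: hue_order is ignored, so numeric data stays numeric.
current = infer_hue_map_type_sketch(None, "numeric", hue_order=order)
# Behavior with the proposed heuristic switched on.
proposed = infer_hue_map_type_sketch(None, "numeric", hue_order=order,
                                     use_order_heuristic=True)
# The workaround from the report: pass an explicit list of colors.
workaround = infer_hue_map_type_sketch(["red", "green", "blue", "gray"],
                                       "numeric", hue_order=order)
print(current, proposed, workaround)
```

In this model the heuristic is a single extra branch, which matches the point that the change itself looks small. The risk raised in the comment lies in how many plotting functions share the same inference.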
