matplotlib does not show legend in scatter plot


Problem description

I am trying to work on a clustering problem for which I need to plot a scatter plot for my clusters.

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
df = pd.merge(dataframe,actual_cluster)
plt.scatter(df['x'], df['y'], c=df['cluster'])
plt.legend()
plt.show()

df['cluster'] is the actual cluster number. So I want that to be my color code.

It shows me a plot, but it does not show the legend. It does not raise an error either.

Am I doing something wrong?

Solution

EDIT:

Generating some random data:

from scipy.cluster.vq import kmeans2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

n_clusters = 10
df = pd.DataFrame({'x':np.random.randn(1000), 'y':np.random.randn(1000)})
_, df['cluster'] = kmeans2(df, n_clusters)

Update

  • Use seaborn.relplot with kind='scatter' or use seaborn.scatterplot
    • Specify hue='cluster'

# figure level plot
sns.relplot(data=df, x='x', y='y', hue='cluster', palette='tab10', kind='scatter')

# axes level plot
fig, axes = plt.subplots(figsize=(6, 6))
sns.scatterplot(data=df, x='x', y='y', hue='cluster', palette='tab10', ax=axes)
axes.legend(loc='center left', bbox_to_anchor=(1, 0.5))

Original Answer

Plotting (matplotlib v3.3.4):

fig, ax = plt.subplots(figsize=(8, 6))
cmap = plt.get_cmap('jet')  # plt.cm.get_cmap is deprecated since matplotlib 3.7
for i, cluster in df.groupby('cluster'):
    _ = ax.scatter(cluster['x'], cluster['y'], color=cmap(i/n_clusters), label=i, ec='k')
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))

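Result: [figure not preserved: scatter plot of the clusters, with the legend placed to the right of the axes]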

Explanation:

Without going into the details of matplotlib internals: plotting one cluster at a time solves the issue. More specifically, ax.scatter() returns a PathCollection object, which we explicitly discard here but which is also added to the Axes, where the legend code can find it. Plotting everything at once generates only one PathCollection (and, in the question's code, one without a label, so plt.legend() has nothing to display), while plotting one cluster at a time generates n_clusters PathCollection/label pairs. You can see those objects by calling ax.get_legend_handles_labels(), which returns something like:

([<matplotlib.collections.PathCollection at 0x7f60c2ff2ac8>,
  <matplotlib.collections.PathCollection at 0x7f60c2ff9d68>,
  <matplotlib.collections.PathCollection at 0x7f60c2ff9390>,
  <matplotlib.collections.PathCollection at 0x7f60c2f802e8>,
  <matplotlib.collections.PathCollection at 0x7f60c2f809b0>,
  <matplotlib.collections.PathCollection at 0x7f60c2ff9908>,
  <matplotlib.collections.PathCollection at 0x7f60c2f85668>,
  <matplotlib.collections.PathCollection at 0x7f60c2f8cc88>,
  <matplotlib.collections.PathCollection at 0x7f60c2f8c748>,
  <matplotlib.collections.PathCollection at 0x7f60c2f92d30>],
 ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])

So actually ax.legend() is equivalent to ax.legend(*ax.get_legend_handles_labels()).
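The difference between the two approaches can be checked directly. The following is a minimal sketch, not part of the original answer: the data, the three made-up cluster ids, and the variable names are assumptions for illustration, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(90), rng.standard_normal(90)
cluster = np.arange(90) % 3  # made-up cluster ids 0, 1, 2

# One scatter call without a label: the legend has nothing to pick up.
fig1, ax1 = plt.subplots()
ax1.scatter(x, y, c=cluster)
handles_once, labels_once = ax1.get_legend_handles_labels()

# One scatter call per cluster, each with its own label.
fig2, ax2 = plt.subplots()
for k in np.unique(cluster):
    mask = cluster == k
    ax2.scatter(x[mask], y[mask], label=str(k))
handles_each, labels_each = ax2.get_legend_handles_labels()

print(len(handles_once), labels_once)
print(len(handles_each), labels_each)
```

The first Axes reports no handles at all, which is why the legend in the question stays empty; the second reports one PathCollection per cluster.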

NOTES:

  1. If using Python 2, make sure i/n_clusters is a float

  2. Omitting fig, ax = plt.subplots() and using plt.<method> instead of ax.<method> works fine, but I always prefer to explicitly specify the Axes object I am using rather than implicitly use the "current axes" (plt.gca()).
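As an alternative to the per-cluster loop, matplotlib 3.1 and later provide PathCollection.legend_elements(), which builds legend handles from a single scatter call. This is not part of the original answer; the sketch below assumes integer cluster ids and uses made-up data, with num=None so that every unique value gets its own entry.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(90), rng.standard_normal(90)
cluster = np.arange(90) % 3  # made-up cluster ids 0, 1, 2

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=cluster, cmap="tab10")

# num=None asks for one legend entry per unique value in the color array.
handles, labels = sc.legend_elements(prop="colors", num=None)
legend = ax.legend(handles, labels, title="cluster",
                   loc="center left", bbox_to_anchor=(1, 0.5))
```

The labels returned by legend_elements() are formatted strings rather than the raw integers, so they may look slightly different from the label=i entries produced by the loop.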


OLD SIMPLE SOLUTION

In case you are ok with a colorbar (instead of discrete value labels), you can use Pandas built-in Matplotlib functionality:

df.plot.scatter('x', 'y', c='cluster', cmap='jet')
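The same colorbar approach can be written with matplotlib alone. This is a minimal sketch under assumed, made-up data, not the original author's code; it keeps the returned PathCollection and hands it to fig.colorbar().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(90), rng.standard_normal(90)
cluster = np.arange(90) % 3  # made-up cluster ids 0, 1, 2

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=cluster, cmap="jet")

# The colorbar is drawn in its own Axes next to the scatter plot.
cbar = fig.colorbar(sc, ax=ax, label="cluster")
```

A colorbar suggests a continuous scale, so for a small number of discrete clusters a legend is usually easier to read.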
