How to put multiple median values in a boxplot?

Problem description

I found code that writes the median value onto a boxplot, and I tried it. However, my plot has several boxes per category (grouped by hue), so the x-tick locations do not give me the position of each box. I tried looking for a minor tick locator for the boxes, but I still cannot get the locations of the individual boxes. How can I find the x position of each box? Any suggestions to improve this plot are also welcome.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame([['Apple', 10, 'A'],['Apple', 8, 'B'],['Apple', 10, 'C'],
              ['Apple', 5, 'A'],['Apple', 7, 'B'],['Apple', 9, 'C'],
              ['Apple', 3, 'A'],['Apple', 5, 'B'],['Apple', 4, 'C'],
              ['Orange', 3, 'A'],['Orange', 4, 'B'],['Orange', 6, 'C'],
              ['Orange', 2, 'A'],['Orange', 8, 'B'],['Orange', 4, 'C'],
              ['Orange', 8, 'A'],['Orange', 10, 'B'],['Orange', 1, 'C']])

df.columns = ['item', 'score', 'grade']


fig = plt.figure(figsize=(6, 3), dpi=150)

ax = sns.boxplot(x='item', y='score', data=df, hue='grade', palette=sns.color_palette('husl'))
ax.legend(loc='lower right', bbox_to_anchor=(1.11, 0), ncol=1, fontsize = 'x-small').set_title('')

medians = df.groupby(['item','grade'])['score'].median().values
median_labels = [str(np.round(s, 2)) for s in medians]

pos = range(len(medians))
for tick,label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], medians[tick], median_labels[tick], 
            horizontalalignment='center', size='xx-small', color='w', weight='semibold', bbox=dict(facecolor='#445A64'))

Recommended answer

Seaborn is notoriously difficult to work with. The code below works, but it might break if, for example, one of the categories is empty and no boxplot is drawn for it, so use it at your own risk:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame([['Apple', 10, 'A'],['Apple', 8, 'B'],['Apple', 10, 'C'],
              ['Apple', 5, 'A'],['Apple', 7, 'B'],['Apple', 9, 'C'],
              ['Apple', 3, 'A'],['Apple', 5, 'B'],['Apple', 4, 'C'],
              ['Orange', 3, 'A'],['Orange', 4, 'B'],['Orange', 6, 'C'],
              ['Orange', 2, 'A'],['Orange', 8, 'B'],['Orange', 4, 'C'],
              ['Orange', 8, 'A'],['Orange', 10, 'B'],['Orange', 1, 'C']])

df.columns = ['item', 'score', 'grade']


width = 0.8
hue_col = 'grade'

fig = plt.figure(figsize=(6, 3), dpi=150)
ax = sns.boxplot(x='item', y='score', data=df, hue=hue_col, palette=sns.color_palette('husl'), width=width)
ax.legend(loc='lower right', bbox_to_anchor=(1.11, 0), ncol=1, fontsize = 'x-small').set_title('')

# get the offsets used by boxplot when hue-nesting is used
# https://github.com/mwaskom/seaborn/blob/c73055b2a9d9830c6fbbace07127c370389d04dd/seaborn/categorical.py#L367
n_levels = len(df[hue_col].unique())
each_width = width / n_levels
offsets = np.linspace(0, width - each_width, n_levels)
offsets -= offsets.mean()

medians = df.groupby(['item','grade'])['score'].median()

for x0,(_,med0) in enumerate(medians.groupby(level=0)):
    for off,(_,med1) in zip(offsets,med0.groupby(level=1)):
        ax.text(x0+off, med1.item(), '{:.0f}'.format(med1.item()), 
            horizontalalignment='center', va='center', size='xx-small', color='w', weight='semibold', bbox=dict(facecolor='#445A64'))
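
To make the offset arithmetic in the code above concrete, here is a minimal sketch in plain Python. The function name `hue_offsets` is hypothetical and not part of seaborn; it only restates the `np.linspace(...)` and `offsets -= offsets.mean()` steps in closed form, assuming the boxes of one category are evenly spaced and share the total `width`.

```python
def hue_offsets(n_levels, width=0.8):
    """Return the x offset of each hue-level box relative to its category tick.

    Hypothetical helper: equivalent to
        offsets = np.linspace(0, width - each_width, n_levels)
        offsets -= offsets.mean()
    because the linspace step is exactly each_width and the mean sits at
    (n_levels - 1) / 2 steps.
    """
    each_width = width / n_levels
    return [(i - (n_levels - 1) / 2) * each_width for i in range(n_levels)]


# Three hue levels (A, B, C) sharing a total width of 0.8:
# the boxes sit one box-width to the left of the tick, on it, and to the right.
offsets = hue_offsets(3, width=0.8)
```

With category `Apple` at x = 0 and `Orange` at x = 1, the label for grade `C` of `Orange` would then be placed at `1 + offsets[2]`.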

In general, to avoid any surprises when you want to modify a seaborn plot, I would recommend specifying order and hue_order so that the plot is drawn in a predetermined order. Here is another version that can deal with a missing category:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame([['Apple', 8, 'B'],['Apple', 10, 'C'],
              ['Apple', 7, 'B'],['Apple', 9, 'C'],
              ['Apple', 5, 'B'],['Apple', 4, 'C'],
              ['Orange', 3, 'A'],['Orange', 6, 'C'],
              ['Orange', 2, 'A'],['Orange', 4, 'C'],
              ['Orange', 8, 'A'],['Orange', 1, 'C']])

df.columns = ['item', 'score', 'grade']


order = ['Apple', 'Orange']
hue_col = 'grade'
hue_order = ['A','B','C']
width = 0.8

fig = plt.figure(figsize=(6, 3), dpi=150)
ax = sns.boxplot(x='item', y='score', data=df, hue=hue_col, palette=sns.color_palette('husl'), width=width,
                order=order, hue_order=hue_order)
ax.legend(loc='lower right', bbox_to_anchor=(1.11, 0), ncol=1, fontsize = 'x-small').set_title('')

# get the offsets used by boxplot when hue-nesting is used
# https://github.com/mwaskom/seaborn/blob/c73055b2a9d9830c6fbbace07127c370389d04dd/seaborn/categorical.py#L367
n_levels = len(hue_order)  # count every hue level, even one that is absent from the data
each_width = width / n_levels
offsets = np.linspace(0, width - each_width, n_levels)
offsets -= offsets.mean()

medians = df.groupby(['item','grade'])['score'].median()
medians = medians.reindex(pd.MultiIndex.from_product([order,hue_order]))

for x0,(_,med0) in enumerate(medians.groupby(level=0)):
    for off,(_,med1) in zip(offsets,med0.groupby(level=1)):
        if not np.isnan(med1.item()):
            ax.text(x0+off, med1.item(), '{:.0f}'.format(med1.item()), 
                horizontalalignment='center', va='center', size='xx-small', color='w', weight='semibold', bbox=dict(facecolor='#445A64'))
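
One caveat with the loop above: `groupby(level=0)` sorts its groups by key by default, so the `enumerate` index only matches the plotted position when `order` happens to be alphabetical. The sketch below is one possible way around that, under the same assumptions about how the boxes are spaced. The helper name `median_label_positions` is hypothetical; it looks each median up by its `(category, hue)` key and skips combinations that have no data.

```python
import numpy as np
import pandas as pd


def median_label_positions(df, x_col, y_col, hue_col, order, hue_order, width=0.8):
    """Map each non-empty (category, hue) pair to the (x, y) of its median label.

    Hypothetical helper: reuses the offset arithmetic from the answer, but
    iterates over `order` and `hue_order` directly instead of over groupby
    results, so the positions follow the plotting order.
    """
    n_levels = len(hue_order)
    each_width = width / n_levels
    offsets = np.linspace(0, width - each_width, n_levels)
    offsets -= offsets.mean()

    medians = df.groupby([x_col, hue_col])[y_col].median()

    positions = {}
    for x0, cat in enumerate(order):
        for off, hue in zip(offsets, hue_order):
            key = (cat, hue)
            if key not in medians.index:
                continue  # no data for this box, so nothing to label
            med = float(medians.loc[key])
            if np.isnan(med):
                continue
            positions[key] = (x0 + off, med)
    return positions


df = pd.DataFrame([['Apple', 8, 'B'], ['Apple', 10, 'C'],
                   ['Apple', 7, 'B'], ['Apple', 9, 'C'],
                   ['Apple', 5, 'B'], ['Apple', 4, 'C'],
                   ['Orange', 3, 'A'], ['Orange', 6, 'C'],
                   ['Orange', 2, 'A'], ['Orange', 4, 'C'],
                   ['Orange', 8, 'A'], ['Orange', 1, 'C']],
                  columns=['item', 'score', 'grade'])

# Deliberately non-alphabetical order to show that positions follow `order`.
positions = median_label_positions(df, 'item', 'score', 'grade',
                                   order=['Orange', 'Apple'],
                                   hue_order=['A', 'B', 'C'])
```

Each `(x, y)` pair in `positions` can then be passed to `ax.text(x, y, ...)` exactly as in the loop above, provided the same `order`, `hue_order` and `width` are given to `sns.boxplot`.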
