Annotate stacked barplot matplotlib and pandas


Question

I have a simple Data Frame that stores the results of a survey. The columns are:

| Age | Income | Satisfaction |

All of them contain values between 1 and 5 (categorical). I managed to generate a stacked barplot that shows the distribution of Satisfaction values across people of different ages. The code is:

import random
import pandas as pd

# create a random df
data = []
for i in range(500):
    sample = {"age": random.randint(0, 5),
              "income": random.randint(1, 5),
              "satisfaction": random.randint(1, 5)}
    data.append(sample)  # append inside the loop so all 500 samples are kept
df = pd.DataFrame(data)
# group by age and count each satisfaction value
counter = df.groupby('age')['satisfaction'].value_counts().unstack()
# calculate the % for each age group
percentage_dist = 100 * counter.divide(counter.sum(axis=1), axis=0)
percentage_dist.plot.bar(stacked=True)

This generates the following, desired, plot:

However, it is difficult to tell whether the green subset (percentage) of Age-0 is higher than the one in Age-2. Is there a way to add the percentage on top of each sub-section of the barplot? Something like this, but for every single bar:

Recommended answer

One option is to iterate over the patches to obtain their width, height and bottom-left coordinates, and use these values to place the label at the center of the corresponding bar segment.

To do this, the axes returned by the pandas bar method must be stored.

ax = percentage_dist.plot.bar(stacked=True)
for p in ax.patches:
    width, height = p.get_width(), p.get_height()
    x, y = p.get_xy() 
    ax.text(x+width/2, 
            y+height/2, 
            '{:.0f} %'.format(height), 
            horizontalalignment='center', 
            verticalalignment='center')

Here, the annotated value is set to 0 decimals, but this can be easily modified.
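As a rough sketch of such a modification, the loop can be wrapped in a small helper that takes the number of decimals as a parameter and skips segments too thin to hold a label. The helper name `annotate_segments`, its `min_height` threshold and the small `demo` table are illustrative assumptions and are not part of the original answer; `demo` only stands in for `percentage_dist`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd


def annotate_segments(ax, decimals=1, min_height=5.0):
    """Label each stacked segment with its height (hypothetical helper).

    Segments shorter than min_height, or with a NaN height, are skipped.
    Returns the list of created Text objects.
    """
    labels = []
    for p in ax.patches:
        height = p.get_height()
        # "not >=" is also True for NaN, so missing categories stay unlabeled
        if not height >= min_height:
            continue
        x, y = p.get_xy()
        text = ax.text(x + p.get_width() / 2,
                       y + height / 2,
                       '{:.{d}f} %'.format(height, d=decimals),
                       horizontalalignment='center',
                       verticalalignment='center')
        labels.append(text)
    return labels


# Made-up percentages (each row sums to 100), standing in for percentage_dist
demo = pd.DataFrame({1: [50.0, 2.5], 2: [30.0, 47.5], 3: [20.0, 50.0]},
                    index=pd.Index([0, 1], name='age'))
ax = demo.plot.bar(stacked=True)
labels = annotate_segments(ax, decimals=1, min_height=5.0)
```

With the real data, the call would be `annotate_segments(percentage_dist.plot.bar(stacked=True))`. Matplotlib 3.4 and later also provide `Axes.bar_label`, which can center labels inside each segment via `label_type='center'` when applied to each container in `ax.containers`; the manual loop is kept here because it mirrors the answer above.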

The output plot generated with this code is the following:
