Plotting stacked bars with a total line and dates on xlabels

Question

I am using the pandas plot method to generate a stacked bar chart, which behaves differently from matplotlib's own bar plotting. The dates on the x-axis always come out badly formatted and I could not change that. I would also like to add a "total" line to the chart, but when I try to add it, the bars that were already drawn are erased. I want to make a chart like the one below (generated by Excel). The black line is the sum of the bars.

I've looked at some solutions online, but they only look good when there are few bars, so that there is some space between the labels.

Here is the best I could do; the code I used is below.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as plticker

# DATA (not the full series from the chart)
dates = ['2016-10-31', '2016-11-30', '2016-12-31', '2017-01-31', '2017-02-28', '2017-03-31',
         '2017-04-30', '2017-05-31', '2017-06-30', '2017-07-31', '2017-08-31', '2017-09-30',
         '2017-10-31', '2017-11-30', '2017-12-31', '2018-01-31', '2018-02-28', '2018-03-31',
         '2018-04-30', '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31', '2018-09-30',
         '2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31', '2019-02-28', '2019-03-31']

variables = {'quantum ex sa': [6.878011, 6.557054, 3.229360, 3.739318, 1.006442, -0.117945,
                               -1.854614, -2.882032, -1.305225, 0.280100, 0.524068, 1.847649,
                               5.315940, 4.746596, 6.650303, 6.809901, 8.135243, 8.127328,
                               9.202209, 8.146417, 6.600906, 6.231881, 5.265775, 3.971435,
                               2.896829, 4.307549, 4.695687, 4.696656, 3.747793, 3.366878],
             'price ex sa': [-11.618681, -9.062433, -6.228452, -2.944336, 0.513788, 4.068517,
                             6.973203, 8.667524, 10.091766, 10.927501, 11.124805, 11.368854,
                             11.582204, 10.818471, 10.132152, 8.638781, 6.984159, 5.161404,
                             3.944813, 3.723371, 3.808564, 4.576303, 5.170760, 5.237303,
                             5.121998, 5.502981, 5.159970, 4.772495, 4.140812, 3.568077]}

df = pd.DataFrame(index=pd.to_datetime(dates), data=variables)

# PLOTTING
ax = df.plot(kind='bar', stacked=True, width=1)
# df['Total'] = df.sum(axis=1)
# df['Total'].plot(ax=ax)
ax.axhline(0, linewidth=1)
ax.yaxis.set_major_formatter(plticker.PercentFormatter())

plt.tight_layout()
plt.show()

Edit

This is what worked best for me. It works better than the pandas df.plot(kind='bar', stacked=True) because it allows better formatting of the date labels on the x-axis and also handles any number of series for the bars.

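fig, ax = plt.subplots()

for count, col in enumerate(df.columns):
    # Sum of the series that have already been drawn
    old = df.iloc[:, :count].sum(axis=1)
    # Stack on the previous bars only where the signs match; otherwise start from zero
    bottom_series = ((old >= 0) == (df[col] >= 0)) * old

    ax.bar(df.index, df[col], label=col, bottom=bottom_series, width=31)  # width is in days

# Compute the total without adding a column to df, so re-running does not double count
total = df.sum(axis=1)
ax.plot(total.index, total, color='black', label='Total')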
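With more than two series of mixed sign, the running sum of all earlier columns is not always the right baseline. One possible alternative, offered only as a sketch, is to keep separate running totals for the positive and negative parts of each row. The helper name `stacked_bottoms` and the `demo` frame are hypothetical and not part of the original post.

```python
import pandas as pd


def stacked_bottoms(df: pd.DataFrame) -> pd.DataFrame:
    """Return the baseline each bar should start from (hypothetical helper).

    Positive values stack upward on the earlier positive values in the row;
    negative values stack downward on the earlier negative values.
    """
    pos = df.clip(lower=0)  # positive parts, zero elsewhere
    neg = df.clip(upper=0)  # negative parts, zero elsewhere
    # Exclusive cumulative sums: the total of the same-sign columns to the left
    pos_bottom = pos.cumsum(axis=1) - pos
    neg_bottom = neg.cumsum(axis=1) - neg
    # Pick the upward baseline for non-negative values, the downward one otherwise
    return pos_bottom.where(df >= 0, neg_bottom)


# Made-up illustration data with three series of mixed sign
demo = pd.DataFrame({'a': [1.0, -1.0], 'b': [-2.0, -2.0], 'c': [3.0, 4.0]})
bottoms = stacked_bottoms(demo)
```

In the plotting loop, the baseline for each column would then be passed as `bottom=bottoms[col]` in the `ax.bar` call, in place of `bottom_series`.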

Recommended answer

This should be what you want:

fig, ax = plt.subplots(1, 1, figsize=(16, 9))
# PLOTTING
ax.bar(df.index, df['price ex sa'], bottom=df['quantum ex sa'], width=31, label='price ex sa')
ax.bar(df.index, df['quantum ex sa'], width=31, label='quantum ex sa')

total = df.sum(axis=1)
ax.plot(total.index, total, color='r', linewidth=3, label='total')

ax.legend()
plt.show()
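Because ax.bar receives the DatetimeIndex directly, the x-axis is a real date axis, so the locators and formatters in matplotlib.dates can control the labels. The following is a minimal sketch under assumptions: the `demo` series is made-up stand-in data, the three-month interval and the "%b %Y" format are arbitrary choices, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in data on month-end dates
idx = pd.to_datetime(["2016-10-31", "2016-11-30", "2016-12-31",
                      "2017-01-31", "2017-02-28", "2017-03-31"])
demo = pd.Series([6.9, 6.6, 3.2, 3.7, 1.0, -0.1], index=idx)

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(demo.index, demo.values, width=25)  # on a date axis, width is measured in days

# Place a tick at the start of every third month and format it as e.g. month name plus year
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
fig.autofmt_xdate()  # rotate and right-align the date labels
```

With many bars, a larger `interval` keeps the labels from overlapping without changing the bars themselves.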

There seems to be a bug (or a feature) in pandas bar plots with a datetime index. I tried converting the index to strings and it works:

df.index=df.index.strftime('%Y-%m')

ax = df.plot(kind='bar', stacked=True, width=1)
df['Total'] = df.sum(axis=1)
df['Total'].plot(ax=ax, label='total')
ax.legend()

Edit 2: I think I know what's going on. The problem is that

ax = df.plot(kind='bar', stacked=True)

sets the x-axis of ax to range(len(df)), labeled with the corresponding values from df.index, rather than using df.index itself. That's why a second series plotted against the dates on the same ax doesn't show up: its x values are on a different scale. So I tried:

# PLOTTING
ax = df.plot(kind='bar', stacked=True, width=1, figsize=(10, 6))
# The bars sit at integer positions 0..len(df)-1, so the line uses the same x values
ax.plot(range(len(df)), df.sum(axis=1), label='Total')
ax.legend()
plt.show()

It works as expected.
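Since the bars sit at integer positions, the crowded date labels from the question can be thinned by setting ticks at a subset of those positions and formatting the matching index values by hand. This is a rough sketch rather than part of the original answer: the `demo` frame, the `step` of 2 and the "%Y-%m" format are illustrative assumptions, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in data on month-end dates
idx = pd.to_datetime(["2016-10-31", "2016-11-30", "2016-12-31",
                      "2017-01-31", "2017-02-28", "2017-03-31"])
demo = pd.DataFrame({"a": [6.9, 6.6, 3.2, 3.7, 1.0, -0.1],
                     "b": [-11.6, -9.1, -6.2, -2.9, 0.5, 4.1]}, index=idx)

ax = demo.plot(kind="bar", stacked=True, width=1)
# The total line is drawn against the same integer positions as the bars
ax.plot(range(len(demo)), demo.sum(axis=1), color="black", label="Total")

step = 2  # label every second bar; a larger step suits longer series
positions = list(range(0, len(demo), step))
ax.set_xticks(positions)
ax.set_xticklabels([demo.index[i].strftime("%Y-%m") for i in positions], rotation=45)
ax.legend()
```

This keeps the pandas stacking behaviour for mixed-sign values while giving control over which dates are labeled.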
