Grouped Bar-Chart with customized DateTime Index using pandas and Matplotlib

Problem description

I'd like to create a grouped bar chart that shows a customized Date-Time Index - just showing Month and year instead of the full dates. I want the bars to be grouped and not stacked.

I assumed pandas could handle this easily, using:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)
ax = testdata.plot.bar()

This creates the plot that I want. I'd just like to change the dates into something simpler, like March 2019, April 2019, May 2019.

I assumed a custom date formatter would work, so I tried:

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))

But then my labels disappear completely. Another question suggests that pandas and the DateFormatter do not work well together, so I tried plain Matplotlib instead:

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

Now the date representation is as expected (although the whitespace could be reduced), but the bars overlap, which doesn't help.

Defining a width and subtracting it from the x values (as is normally suggested) doesn't work because of the DatetimeIndex I use. I get an error saying that subtracting a float from a DatetimeIndex is unsupported.

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index-width, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index+width, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

So now I'm running out of ideas and would appreciate any input.

Solution

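The reason ax.xaxis.set_major_locator(mdates.MonthLocator()) fails is that, under the hood, pandas plots the bars against range(len(df)) and then renames the ticks accordingly. The x axis therefore holds the integers 0, 1, 2 rather than dates, so a date locator finds nothing to place.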
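To see this for yourself, a small check along these lines may help. It is only a sketch: it rebuilds the question's testdata, selects the non-interactive Agg backend so it runs without a display (an assumption, not part of the original post), and inspects the tick positions and label texts that pandas sets.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

ax = testdata.plot.bar()

# The bar groups sit at integer positions, not at date values.
tick_positions = [float(t) for t in ax.get_xticks()]
# The dates survive only as the text of the tick labels.
tick_labels = [label.get_text() for label in ax.get_xticklabels()]

print(tick_positions)
print(tick_labels)
```

Because the positions are plain integers, any fix has to work on the label text or avoid the pandas bar plot altogether.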

You can grab the xticklabels after you plot, and reformat them:

ax = testdata.plot.bar()

ticks = [tick.get_text() for tick in ax.get_xticklabels()]
ticks = pd.to_datetime(ticks).strftime('%b %Y')
ax.set_xticklabels(ticks)

which gives the same result as ImportanceOfBeingErnest's.
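A possible variation, not from the original answer, is to skip parsing the label text and format the index directly. This assumes pandas draws one tick per row in index order, which matches the behaviour described above. The Agg backend line and the rotation=0 argument are my additions for the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

ax = testdata.plot.bar()

# One label per row, built from the index itself rather than from the tick text.
labels = list(testdata.index.strftime("%b %Y"))
ax.set_xticklabels(labels, rotation=0)

new_labels = [t.get_text() for t in ax.get_xticklabels()]
print(new_labels)
```

Month abbreviations produced by %b depend on the active locale, so the exact strings may differ from system to system.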

Another, probably better way is to shift the bars of each column by a time offset. A Timedelta can be added to a DatetimeIndex, whereas a float cannot. This works better when you have many columns and want to reduce the number of xticks.

fig, ax = plt.subplots()

# define the shift
shift = pd.to_timedelta('1D')

# shift the x positions of each column; this could also be done in a for loop
ax.bar(testdata.index + shift, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index - shift, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
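The for loop mentioned in the comment could look roughly like the sketch below. The bar width of 5 days, the centring formula, and the names bar_width_days and offsets are illustrative assumptions rather than part of the original answer, as is the Agg backend line.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

fig, ax = plt.subplots()

bar_width_days = 5  # assumed width; a date axis is measured in days
n_cols = len(testdata.columns)

offsets = []
for i, col in enumerate(testdata.columns):
    # Centre each group on its index date: -5, 0 and +5 days for three columns.
    offset = pd.Timedelta(days=(i - (n_cols - 1) / 2) * bar_width_days)
    offsets.append(offset)
    ax.bar(testdata.index + offset, testdata[col], width=bar_width_days, label=col)

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
ax.legend()
```

Setting the shift equal to the bar width makes the bars in a group touch without overlapping. Call plt.show() or fig.savefig(...) afterwards to view the result.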
