Grouped Bar-Chart with customized DateTime Index using pandas and Matplotlib
Problem description
I'd like to create a grouped bar chart with a customized DatetimeIndex on the x-axis, showing just the month and year instead of the full dates. I want the bars to be grouped, not stacked.
I assumed pandas could handle this easily, using:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)
ax = testdata.plot.bar()
This creates the plot that I want. I'd just like to change the date labels into something simpler, like March 2019, April 2019, May 2019.
I assumed using a Custom Date Formatter would work, therefore I tried
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
But then my labels are gone completely, and a related question suggests that pandas and the DateFormatter have a bit of a difficult relationship. Therefore I tried to do it with Matplotlib basics:
fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index, testdata["A"])
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
Now the date representation is as expected (although the whitespace could be reduced), but the data overlap, which doesn't help.
Defining a width and subtracting it from the x values (as is normally suggested) doesn't help because of the DatetimeIndex I use. I get an error saying that subtracting a float from a DatetimeIndex is unsupported.
fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index-width, testdata["A"])
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index+width, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
So now I'm running out of ideas and would appreciate any input.
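ax.xaxis.set_major_locator(mdates.MonthLocator()) fails because, under the hood, pandas plots the bars against range(len(df)) and then renames the ticks accordingly. The axis therefore holds the positions 0, 1, 2 rather than dates, so a date locator and formatter have nothing meaningful to work with.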
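A small check can make this visible. The sketch below rebuilds the question's testdata and inspects the tick positions and labels that pandas sets; the Agg backend line is an assumption added only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

ax = testdata.plot.bar()

# The tick positions are plain integers, one per row of the DataFrame.
positions = list(ax.get_xticks())
# The dates only exist as text labels attached to those positions.
labels = [tick.get_text() for tick in ax.get_xticklabels()]

print(positions)
print(labels)
```

Because the dates survive only as label text, a date locator or formatter applied to this axis operates on the integers, not on the original timestamps.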
You can grab the xticklabels after you plot and reformat them:
ax = testdata.plot.bar()
ticks = [tick.get_text() for tick in ax.get_xticklabels()]
ticks = pd.to_datetime(ticks).strftime('%b %Y')
ax.set_xticklabels(ticks)
This gives the same result as ImportanceOfBeingErnest's answer.
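A variant of the same relabeling idea, offered only as a sketch: instead of parsing the label text back into dates, the labels can be built directly from the DataFrame index with strftime. The rotation=0 argument and the Agg backend line are assumptions added for readability and headless execution.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

# Format the index once; the bar positions 0, 1, 2 follow the row order,
# so the formatted labels line up with the existing ticks.
month_labels = list(testdata.index.strftime("%b %Y"))

ax = testdata.plot.bar()
ax.set_xticklabels(month_labels, rotation=0)

shown = [tick.get_text() for tick in ax.get_xticklabels()]
print(shown)
```

This avoids a round trip through string parsing, at the cost of relying on the rows being plotted in index order, which is what DataFrame.plot.bar does.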
Another, probably better way is to shift the bars of each column on a real date axis. This works better when you have many rows and want to reduce the number of xticks.
fig, ax = plt.subplots()
# define the shift
shift = pd.to_timedelta('1D')
# shift the x positions of each column; this can also be done in a for loop
ax.bar(testdata.index + shift, testdata["A"])
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index - shift, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
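The for loop mentioned in the comment could look roughly like the sketch below. It centers the group of bars on each date and passes the bar width in days, which is the unit Matplotlib uses for widths on a date axis. The name bar_days and its value of 6 are hypothetical choices for illustration, and the Agg backend line is an assumption for headless execution.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

bar_days = 6  # hypothetical bar width in days; tune to the spacing of the data
n_cols = len(testdata.columns)

fig, ax = plt.subplots()
for i, col in enumerate(testdata.columns):
    # Offsets are symmetric around zero (-1, 0, +1 bar widths for three
    # columns), so each group stays centered on its date.
    offset = pd.Timedelta(days=bar_days * (i - (n_cols - 1) / 2))
    ax.bar(testdata.index + offset, testdata[col], width=bar_days, label=col)

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
ax.legend()
fig.savefig("grouped_bars.png")
```

The shift has to be a Timedelta rather than a float, since adding a float to a DatetimeIndex is exactly the unsupported operation reported in the question.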