Time series date tick problems when using the pandas.DataFrame.plot method
Question
I just discovered something really strange when using the plot method of pandas.DataFrame. I am using pandas 0.19.1. Here is my MWE:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
t = pd.date_range('1990-01-01', '1990-01-08', freq='1H')
x = pd.DataFrame(np.random.rand(len(t)), index=t)
fig, axe = plt.subplots()
x.plot(ax=axe)
plt.show()
xt = axe.get_xticks()
When I try to format my xticklabels I get strange behaviour, so I inspected the objects to understand what was going on and found the following:
- t[-1] - t[0] = Timedelta('7 days 00:00:00'), confirming that the DatetimeIndex is what I expect;
- xt = [175320, 175488]: the xticks are integers, but they are not equal to a number of days since the epoch (I have no idea what they are);
- xt[-1] - xt[0] = 168: they look more like an index, since this span matches len(x) = 169.
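Note: one possible reading of these integers, which is an assumption and not something stated in the question, is that they count whole hours since 1970-01-01, i.e. the axis is measured in units of the index frequency (hourly here). The sketch below uses only the standard library to check that the arithmetic fits; the helper name hours_since_epoch is made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the tick values count whole hours since the Unix epoch.
UNIX_EPOCH = datetime(1970, 1, 1)


def hours_since_epoch(moment):
    """Return the number of whole hours between the Unix epoch and `moment`."""
    return (moment - UNIX_EPOCH) // timedelta(hours=1)


first_tick = hours_since_epoch(datetime(1990, 1, 1))
last_tick = hours_since_epoch(datetime(1990, 1, 8))

# Compare with the reported xt = [175320, 175488] and their difference of 168.
print(first_tick, last_tick, last_tick - first_tick)
```

Both endpoints and their difference agree with the values reported above, which is consistent with hourly units but does not by itself prove how pandas builds the axis.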
This explains why I cannot succeed in formatting my axis using:
axe.xaxis.set_major_locator(mdates.HourLocator(byhour=(0,6,12,18)))
axe.xaxis.set_major_formatter(mdates.DateFormatter("%a %H:%M"))
The first line raises an error saying that there are too many ticks to generate. The second shows that my first tick is Fri 00:00, but it should be Mon 00:00 (in fact, matplotlib assumes the first tick to be 0481-01-03 00:00; oops, this is where my bug is).
It looks like there is some incompatibility between the pandas and matplotlib integer-to-date conversions, but I cannot find out how to fix this issue.
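Note: the mismatch can be reproduced with the standard library alone, under the assumption (not stated in the question) that matplotlib of that era read date numbers as days, with 0001-01-01 as day 1. Newer matplotlib releases use a different default epoch, so the raw numbers may differ there. Reading the pandas tick value 175320 as such a day count gives the odd date mentioned above, while 726468 maps to the expected Monday.

```python
from datetime import date

DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Assumption: date numbers are proleptic Gregorian ordinals (0001-01-01 is day 1).
pandas_tick = 175320      # first tick reported after DataFrame.plot
matplotlib_tick = 726468  # first tick reported after axe.plot

misread = date.fromordinal(pandas_tick)
expected = date.fromordinal(matplotlib_tick)

print(misread.isoformat(), DAY_NAMES[misread.weekday()])
print(expected.isoformat(), DAY_NAMES[expected.weekday()])
```

The first line printed corresponds to the Fri 00:00 label on 0481-01-03, and the second to Monday 1990-01-01, which matches the ticks obtained with axe.plot.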
If I run this instead:
fig, axe = plt.subplots()
axe.plot(x)
axe.xaxis.set_major_formatter(mdates.DateFormatter("%a %H:%M"))
plt.show()
xt = axe.get_xticks()
Everything works as expected, but I miss all the cool features of the pandas.DataFrame.plot method, such as curve labeling. And here xt = [726468. 726475.].
How can I properly format my ticks using the pandas.DataFrame.plot method instead of axe.plot, while avoiding this issue?
Update
The problem seems to be about the origin and scale (units) of the underlying numbers used for date representation. In any case, I cannot control it, even by forcing it to the correct type:
t = pd.date_range('1990-01-01', '1990-01-08', freq='1H', origin='unix', units='D')
There is a discrepancy between the matplotlib and pandas representations, and I could not find any documentation about this problem.
Recommended answer
Is this what you are going for? Note that I shortened the date_range to make the labels easier to see.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
t = pd.date_range('1990-01-01', '1990-01-04', freq='1H')
x = pd.DataFrame(np.random.rand(len(t)), index=t)
# resample the df to get the index at 6-hour intervals
l = x.resample('6H').first().index
# set the ticks when you plot. this appears to position them, but not set the label
ax = x.plot(xticks=l)
# set the display value of the tick labels
ax.set_xticklabels(l.strftime("%a %H:%M"))
# hide the labels from the initial pandas plot
ax.set_xticklabels([], minor=True)
# make pretty
ax.get_figure().autofmt_xdate()
plt.show()
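An alternative worth trying, which is not part of the answer above: DataFrame.plot accepts an x_compat=True option that asks pandas to skip its own time series tick handling, so the matplotlib.dates locators and formatters from the question can be applied directly. This is only a sketch; the function name plot_with_matplotlib_dates is made up for illustration, and the behaviour on pandas 0.19.1 specifically has not been checked here.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def plot_with_matplotlib_dates():
    """Plot hourly data via DataFrame.plot while keeping matplotlib date ticks."""
    # A Timedelta frequency avoids relying on version-specific string aliases.
    t = pd.date_range('1990-01-01', '1990-01-08', freq=pd.Timedelta(hours=1))
    x = pd.DataFrame(np.random.rand(len(t)), index=t)

    fig, axe = plt.subplots()
    # x_compat=True: let matplotlib, not pandas, decide how dates map to ticks.
    x.plot(ax=axe, x_compat=True)

    # The same locator and formatter that failed in the question.
    axe.xaxis.set_major_locator(mdates.HourLocator(byhour=(0, 6, 12, 18)))
    axe.xaxis.set_major_formatter(mdates.DateFormatter("%a %H:%M"))
    fig.autofmt_xdate()
    return axe
```

Call plot_with_matplotlib_dates() followed by plt.show() to display the figure. The legend and other DataFrame.plot conveniences should still be available, since the plot is still drawn by pandas.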