Storing pure python datetime.datetime in pandas DataFrame
Problem description
Since matplotlib doesn't support either pandas.Timestamp or numpy.datetime64, and there are no simple workarounds, I decided to convert a native pandas date column into pure python datetime.datetime objects so that scatter plots are easier to make.
However:
import pandas as pd
t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31')]})
t.dtypes # date datetime64[ns], as expected
pure_python_datetime_array = t.date.dt.to_pydatetime() # works fine
t['date'] = pure_python_datetime_array # doesn't do what I hoped
t.dtypes # date datetime64[ns] as before, no luck changing it
I'm guessing pandas auto-converts the pure python datetime produced by to_pydatetime into its native format. I guess it's convenient behavior in general, but is there a way to override it?
The use of to_pydatetime() is correct.
In [87]: t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31'), pd.to_datetime('2013-12-31')]})
In [88]: t.date.dt.to_pydatetime()
Out[88]:
array([datetime.datetime(2012, 12, 31, 0, 0),
       datetime.datetime(2013, 12, 31, 0, 0)], dtype=object)
When you assign it back to t.date, pandas automatically converts it back to datetime64. pandas.Timestamp is a datetime subclass anyway :)
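If you really do want plain datetime.datetime objects stored in the frame, one possible approach is to wrap them in a Series with an explicit object dtype before assigning. This is only a sketch: the column name date_py is made up for illustration, and it assumes that assigning an object-dtype Series skips the datetime inference pandas applies to a bare object array, which may vary between pandas versions.

```python
import datetime

import pandas as pd

t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31'), pd.to_datetime('2013-12-31')]})

# Convert each pandas.Timestamp into a plain datetime.datetime.
py_datetimes = [ts.to_pydatetime() for ts in t['date']]

# Assumption: an object-dtype Series is stored as-is on assignment,
# whereas a bare object ndarray is inferred back to datetime64.
# 'date_py' is a hypothetical column name used only for this example.
t['date_py'] = pd.Series(py_datetimes, index=t.index, dtype=object)

print(t.dtypes)
print(type(t['date_py'].iloc[0]))
```

The original date column is left untouched, so the usual .dt accessors keep working on it while date_py holds the plain Python objects.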
One way to do the plot is to convert the datetime to int64:
In [117]: t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31'), pd.to_datetime('2013-12-31')], 'sample_data': [1, 2]})
In [118]: t['date_int'] = t.date.astype(np.int64)
In [119]: t
Out[119]:
        date  sample_data             date_int
0 2012-12-31            1  1356912000000000000
1 2013-12-31            2  1388448000000000000
In [120]: t.plot(kind='scatter', x='date_int', y='sample_data')
Out[120]: <matplotlib.axes._subplots.AxesSubplot at 0x7f3c852662d0>
In [121]: plt.show()
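The integers in date_int are nanoseconds since the Unix epoch, so the x axis of that plot shows raw numbers rather than dates. A small sketch of the conversion and its inverse follows; the explicit cast to datetime64[ns] is an added precaution (not part of the answer above) so the integers are nanoseconds regardless of the column's stored resolution, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31'), pd.to_datetime('2013-12-31')],
                  'sample_data': [1, 2]})

# Force nanosecond resolution first (a precaution added here), then
# reinterpret the timestamps as integer nanoseconds since 1970-01-01.
date_ns = t['date'].astype('datetime64[ns]').astype(np.int64)

# The inverse: turn integer nanoseconds back into timestamps.
round_trip = pd.to_datetime(date_ns, unit='ns')

print(date_ns.tolist())
print(round_trip.tolist())
```

Converting back with pd.to_datetime is handy if you need to label ticks or inspect points after plotting against the integer column.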
Another workaround is to skip scatter and use a regular plot with a point marker style instead:
In [126]: t.plot(x='date', y='sample_data', style='.')
Out[126]: <matplotlib.axes._subplots.AxesSubplot at 0x7f3c850f5750>
And the last workaround is to call plt.scatter directly with the converted datetimes:
In [141]: import matplotlib.pyplot as plt
In [142]: t = pd.DataFrame({'date': [pd.to_datetime('2012-12-31'), pd.to_datetime('2013-12-31')], 'sample_data': [100, 20000]})
In [143]: t
Out[143]:
        date  sample_data
0 2012-12-31          100
1 2013-12-31        20000
In [144]: plt.scatter(t.date.dt.to_pydatetime() , t.sample_data)
Out[144]: <matplotlib.collections.PathCollection at 0x7f3c84a10510>
In [145]: plt.show()
There is an issue for this on GitHub, which is still open at the time of writing.