timedelta64 and datetime conversion
Question
I have two datetime (Timestamp) formatted columns in my dataframe, df['start'] and df['end']. I'd like to get the duration between the two dates, so I create a duration column:
df['duration'] = df['start'] - df['end']
However, the duration column now holds numpy.timedelta64 values instead of the datetime.timedelta I would expect:
>>> df['duration'][0]
numpy.timedelta64(0,'ns')
whereas
>>> df['start'][0] - df['end'][0]
datetime.timedelta(0)
Can someone explain why the array subtraction changes the timedelta type? Is there a way to keep datetime.timedelta, since it is easier to work with?
Recommended answer
This was one of the motivations for implementing a Timedelta scalar in pandas 0.15.0. See the pandas documentation on time deltas for the full details.
In pandas >= 0.15.0, a timedelta64[ns] Series is still stored as np.timedelta64[ns] under the hood, but this is completely hidden from the user: scalars taken from the Series come back as Timedelta, a subclass of datetime.timedelta (basically a useful superset of datetime.timedelta and the numpy version).
In [1]: df = pd.DataFrame([[pd.Timestamp('20130102'), pd.Timestamp('20130101')]], columns=list('AB'))
In [2]: df['diff'] = df['A'] - df['B']
In [3]: df.dtypes
Out[3]:
A        datetime64[ns]
B        datetime64[ns]
diff    timedelta64[ns]
dtype: object
# scalar subtraction still returns datetime.timedelta here; it will return a Timedelta in 0.15.2
In [4]: df['A'][0] - df['B'][0]
Out[4]: datetime.timedelta(1)
# indexing into the timedelta64[ns] Series already returns a Timedelta
In [5]: (df['A'] - df['B'])[0]
Out[5]: Timedelta('1 days 00:00:00')
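If plain datetime.timedelta objects are still needed, for example to pass to code that knows nothing about pandas, the following is a minimal sketch of one way to get them, assuming a reasonably recent pandas. The column names and sample timestamps are made up for illustration, and the duration is computed as end minus start so that it comes out positive.

```python
import datetime

import pandas as pd

# Hypothetical sample data; the column names mirror the question.
df = pd.DataFrame({
    "start": pd.to_datetime(["2013-01-01 00:00", "2013-01-01 12:00"]),
    "end": pd.to_datetime(["2013-01-02 00:00", "2013-01-01 18:30"]),
})
df["duration"] = df["end"] - df["start"]  # dtype: timedelta64[ns]

# Scalars pulled out of the column are pd.Timedelta objects,
# and pd.Timedelta is a subclass of datetime.timedelta.
first = df["duration"].iloc[0]
is_timedelta = isinstance(first, datetime.timedelta)

# Convert every element to a plain datetime.timedelta.
py_all = [td.to_pytimedelta() for td in df["duration"]]

# Often a numeric duration is all that is needed.
seconds = df["duration"].dt.total_seconds()
```

Because a Timedelta already is a datetime.timedelta, the explicit conversion is often unnecessary; attributes such as days and seconds and methods such as total_seconds() work on it directly.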