How can I make Pandas convert a column which contains NaT from timedelta to datetime?
Problem description
I have a pandas dataframe with a column of type timedelta64[ns], which I would like to convert to datetime64[ns].
The pd.to_datetime() function purports to do just that, and has worked in the past, but appears to fail now. I assume this is related to an API quirk that has slipped under my radar. Currently it fails with:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 724, in to_datetime
    cache_array = _maybe_cache(arg, format, cache, convert_listlike)
  File "/usr/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 152, in _maybe_cache
    cache_dates = convert_listlike(unique_dates, format)
  File "/usr/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 363, in _convert_listlike_datetimes
    arg, _ = maybe_convert_dtype(arg, copy=False)
  File "/usr/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py", line 1916, in maybe_convert_dtype
    raise TypeError(f"dtype {data.dtype} cannot be converted to datetime64[ns]")
TypeError: dtype timedelta64[ns] cannot be converted to datetime64[ns]
To reproduce, use the MWE below:
wget https://chymera.eu/ppb/61ebad.csv
python
import pandas as pd
df = pd.read_csv('61ebad.csv')
df['Animal_death_date'] = pd.to_timedelta(df['Animal_death_date'], errors='coerce')
df['Animal_death_date'] = pd.to_datetime(df['Animal_death_date'], errors='coerce')
The error also occurs if I use errors='ignore'. For reference, I am using Pandas 1.0.1.
Recommended answer
If you need to convert timedeltas to datetimes, add a start datetime to them:
import pandas as pd
df = pd.read_csv('https://chymera.eu/ppb/61ebad.csv')
start = pd.to_datetime('2000-01-01')
df['Animal_death_date'] = pd.to_timedelta(df['Animal_death_date'], errors='coerce') + start
print(df['Animal_death_date'])
0 NaT
1 NaT
2 NaT
3 NaT
4 NaT
843 NaT
844 NaT
845 2000-05-12 19:00:00
846 2000-05-12 19:00:00
847 2000-05-12 19:00:00
Name: Animal_death_date, Length: 848, dtype: datetime64[ns]
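The same idea can be tried without downloading the CSV. The sketch below is only an illustration: the Series `raw` and its values are made up to stand in for the `Animal_death_date` column, and the reference date is the arbitrary one used above. A timedelta is a duration with no reference point of its own, which is presumably why pd.to_datetime() refuses the direct conversion; adding a Timestamp supplies that reference point.

```python
import pandas as pd

# Hypothetical stand-in for the CSV column: strings that may or may not
# parse as durations, plus a missing value.
raw = pd.Series(["1 days 02:00:00", "not a duration", None, "132 days 19:00:00"])

# Entries that cannot be parsed become NaT instead of raising.
deltas = pd.to_timedelta(raw, errors="coerce")

# Assumed reference point; each timedelta is treated as an offset from it.
start = pd.Timestamp("2000-01-01")

# timedelta + Timestamp yields a datetime column; NaT rows stay NaT.
dates = deltas + start
print(dates)
```

Rows that could not be parsed remain NaT after the addition, so no separate handling of missing values is needed.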
Or add a column of datetimes:
import pandas as pd
df = pd.read_csv('https://chymera.eu/ppb/61ebad.csv')
start = pd.to_datetime(df['FMRIMeasurement_date'])
df['Animal_death_date'] = pd.to_timedelta(df['Animal_death_date'], errors='coerce') + start
print(df['Animal_death_date'])
0 NaT
1 NaT
2 NaT
3 NaT
4 NaT
843 NaT
844 NaT
845 2018-10-04 19:20:54
846 2018-10-04 19:20:54
847 2018-10-04 19:20:54
Name: Animal_death_date, Length: 848, dtype: datetime64[ns]
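A self-contained sketch of the per-row variant follows. The frame and its column names (`measurement_date`, `time_to_death`, `death_date`) are hypothetical and chosen only for illustration; they are not the columns of the CSV above.

```python
import pandas as pd

# Hypothetical frame: one base date per row and one duration per row.
df = pd.DataFrame({
    "measurement_date": ["2018-05-25 00:20:54", "2018-05-25 00:20:54", None],
    "time_to_death": ["132 days 19:00:00", "unknown", "1 days"],
})

# Parse both columns; anything unparseable or missing becomes NaT.
base = pd.to_datetime(df["measurement_date"], errors="coerce")
offset = pd.to_timedelta(df["time_to_death"], errors="coerce")

# Row-wise addition: a row is NaT if either its base date or its offset is NaT.
df["death_date"] = base + offset
print(df["death_date"])
```

Unlike the scalar start date, this gives each row its own reference point, and a missing value on either side leaves that row as NaT.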