Replacing NaT with Epoch in Pandas
Problem description
NaT missing values are appearing at the end of my dataframe as demonstrated below. This understandably raises the ValueError:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pytz/tzinfo.py", line 314, in localize
    loc_dt = tzinfo.normalize(dt.replace(tzinfo=tzinfo))
ValueError: month must be in 1..12
I've tried to use both dropna:
data[col_name].dropna(0, inplace=True)
and fillna, as encouraged by the Working with Missing Data section:
data[col_name].fillna(0, inplace=True)
Before either of these lines, I tried to clean the data by replacing non-datetimes with the epoch time:
data[col_name] = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) else epoch)
Because NaT is technically a datetime this condition wasn't covered by that function. Since isnull will handle this, I wrote this function to apply to data[col_name]:
def replace_time(x):
    if pd.isnull(x):
        return epoch
    elif isinstance(x, datetime.datetime):
        return x
    else:
        return epoch
Despite the fact that it enters the pd.isnull section, the value isn't changed. However, when I try that function on this series (where the second value is NaT) it works:
s = pd.Series([pd.Timestamp('20130101'),np.nan,pd.Timestamp('20130102 9:30')],dtype='M8[ns]')
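The two observations above (NaT passes the isinstance check, and replace_time works on a small datetime series) can be reproduced with a minimal sketch. It assumes `epoch` is `pd.Timestamp(0)`, since the question never shows its definition, and it builds the series with `pd.NaT` directly rather than `np.nan`:

```python
import datetime

import pandas as pd

# Assumption: the question's `epoch` is the Unix epoch, 1970-01-01 00:00:00.
epoch = pd.Timestamp(0)


def replace_time(x):
    # Check for missing values first, because NaT also passes the
    # isinstance(x, datetime.datetime) test below.
    if pd.isnull(x):
        return epoch
    elif isinstance(x, datetime.datetime):
        return x
    else:
        return epoch


# Same shape as the series in the question: the second value is NaT.
s = pd.Series([pd.Timestamp('2013-01-01'), pd.NaT, pd.Timestamp('2013-01-02 09:30')])

# NaT is an instance of datetime.datetime, which is why the original lambda let it through.
nat_is_datetime = isinstance(pd.NaT, datetime.datetime)

# apply returns a new Series; the original `s` is left unchanged.
cleaned = s.apply(replace_time)
```

Note that `apply` does not modify the series in place, so the result has to be assigned back (for example `data[col_name] = data[col_name].apply(replace_time)`) for the change to show up in the dataframe.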
Data:
2003-04-29 00:00:00
NaT
NaT
NaT
Solution
Try:
data[col_name] = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) and not isinstance(x, pd.tslib.NaTType) else epoch)
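The line above relies on `pd.tslib.NaTType`, which may not be available in newer pandas releases. As a hedged alternative, the same condition can be written with `pd.isnull`, which the question already uses. In this sketch the column name `when`, the sample values, and `epoch = pd.Timestamp(0)` are illustrative assumptions, not taken from the original post:

```python
import datetime

import pandas as pd

# Assumption: `epoch` is the Unix epoch; the question does not show its definition.
epoch = pd.Timestamp(0)

# Hypothetical data: one valid timestamp, NaT values, and one non-datetime entry.
col_name = 'when'
data = pd.DataFrame({
    col_name: pd.Series(
        [pd.Timestamp('2003-04-29'), pd.NaT, 'n/a', pd.NaT],
        dtype=object,
    )
})

# Keep real datetimes; replace NaT and anything that is not a datetime with epoch.
# The isinstance check runs first, so pd.isnull only ever sees datetime-like scalars.
data[col_name] = data[col_name].apply(
    lambda x: x if isinstance(x, datetime.datetime) and not pd.isnull(x) else epoch
)

result = list(data[col_name])
```

If the column already has a datetime dtype and only NaT needs replacing, `data[col_name] = data[col_name].fillna(epoch)` should be sufficient; the `apply` form is only needed when the column may also hold non-datetime objects.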