Replacing NaT with Epoch in Pandas


Problem description

NaT missing values are appearing at the end of my dataframe as demonstrated below. This understandably raises the ValueError:

File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pytz/tzinfo.py", line 314, in localize
    loc_dt = tzinfo.normalize(dt.replace(tzinfo=tzinfo))
ValueError: month must be in 1..12

I've tried to use both dropna:

data[col_name].dropna(0, inplace=True)

and fillna, as encouraged by the Working with Missing Data section of the pandas documentation:

data[col_name].fillna(0, inplace=True)
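For comparison, here is a minimal sketch of dropping the rows that have NaT in one column and assigning the result back, instead of calling an inplace method on a column selection. The frame contents and the column name `when` are made up for illustration, and this is not necessarily the fix for the traceback above.

```python
import pandas as pd

# Hypothetical frame: one datetime column with trailing NaT values.
data = pd.DataFrame({
    "when": pd.to_datetime(["2003-04-29", None, None, None]),
    "value": [1, 2, 3, 4],
})
col_name = "when"  # assumed column name

# Drop the rows whose datetime is missing and keep the result
# in a new variable rather than relying on inplace=True.
cleaned = data.dropna(subset=[col_name])
print(len(data), len(cleaned))
```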

Before either of these lines, I tried to clean the data by replacing non-datetimes with the epoch time:

data[col_name] = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) else epoch)

Because NaT is technically a datetime, it passes the isinstance check and the lambda returns it unchanged. Since isnull does detect NaT, I wrote this function to apply to data[col_name]:

def replace_time(x):
    if pd.isnull(x):
        return epoch
    elif isinstance(x, datetime.datetime):
        return x
    else:
        return epoch
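The claim that NaT counts as a datetime, while still being detected by isnull, can be checked directly. In this sketch `epoch` is assumed to be `pd.Timestamp(0)`, since the question does not show how it was defined.

```python
import datetime
import pandas as pd

epoch = pd.Timestamp(0)  # assumption: the Unix epoch, 1970-01-01 00:00:00

# NaT passes an isinstance check against datetime.datetime ...
nat_is_datetime = isinstance(pd.NaT, datetime.datetime)
# ... but pd.isnull still recognises it as missing.
nat_is_null = pd.isnull(pd.NaT)

def replace_time(x):
    # Check for missing values first, because NaT would pass the isinstance test.
    if pd.isnull(x):
        return epoch
    elif isinstance(x, datetime.datetime):
        return x
    else:
        return epoch

print(nat_is_datetime, nat_is_null, replace_time(pd.NaT))
```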

Despite the fact that it enters the pd.isnull section, the value isn't changed. However, when I try that function on this series (where the second value is NaT) it works:

s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')

Data:

2003-04-29 00:00:00

NaT

NaT

NaT

Solution

Try:

data[col_name] = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) 
                                       and not isinstance(x, pd.tslib.NaTType) else epoch)
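As far as I know, `pd.tslib` is not exposed in recent pandas releases, so the isinstance check above may fail there. Below is a sketch of an alternative that avoids naming the NaT type, assuming the column already has a datetime64 dtype and that `epoch` means `pd.Timestamp(0)`. It is not part of the original answer.

```python
import pandas as pd

epoch = pd.Timestamp(0)  # assumption: 1970-01-01 00:00:00

# Small series in the spirit of the one in the question; the second value is NaT.
s = pd.Series([pd.Timestamp("20130101"), pd.NaT, pd.Timestamp("20130102 9:30")])

# fillna replaces only the missing entries and leaves real timestamps alone.
filled = s.fillna(epoch)
print(filled)
```

On a DataFrame the same idea would be written as data[col_name] = data[col_name].fillna(epoch), assigning the result back to the column.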
