Date parse error in Python pandas while reading file

Question

This is a follow-up to the question: Python pandas to read in file with date

I am not able to parse the dates when reading the file below into a DataFrame. The code is as follows:

df = pandas.read_csv(file_name, skiprows = 2, index_col='datetime', 
                 parse_dates={'datetime': [0,1,2]}, delim_whitespace=True,
                 date_parser=lambda x: pandas.datetime.strptime(x, '%Y %m %d'))


         OTH-000.opc
              XKN1=    0.500000E-01
    Y   M   D     PRCP     VWC1    
 2006   1   1      0.0  0.17608E+00
 2006   1   2      6.0  0.21377E+00
 2006   1   3      0.1  0.22291E+00
 2006   1   4      3.0  0.23460E+00
 2006   1   5      6.7  0.26076E+00

I get an error saying: lambda () takes exactly 1 argument (3 given)

Based on @EdChum's comment below, if I use this code:

df = pandas.read_csv(file_name, skiprows = 2, index_col='datetime', parse_dates={'datetime': [0,1,2]}, delim_whitespace=True)

df.index comes back with dtype object rather than as a datetime index:

df.index
Index([u'2006 1 1',u'2006 1 2'....,u'nan nan nan'],dtype='object')

Finally the file is available here:

https://www.dropbox.com/s/0xgk2w4ed9mi4lx/test.txt?dl=0

Recommended answer

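OK, I see the problem: your file has extraneous blank lines at the end. Unfortunately this confuses the parser, because it is splitting on whitespace, and it leaves an extra row of NaN values at the bottom. Since that row's combined date string is 'nan nan nan', the dates cannot be parsed and the index stays as plain strings. The resulting df looks like the following: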

Out[25]:
             PRCP     VWC1
datetime                  
2006 1 1      0.0  0.17608
2006 1 2      6.0  0.21377
2006 1 3      0.1  0.22291
2006 1 4      3.0  0.23460
2006 1 5      6.7  0.26076
nan nan nan   NaN      NaN

When I remove the blank lines it imports and parses the dates fine:

Out[26]:
            PRCP     VWC1
datetime                 
2006-01-01   0.0  0.17608
2006-01-02   6.0  0.21377
2006-01-03   0.1  0.22291
2006-01-04   3.0  0.23460
2006-01-05   6.7  0.26076

The index is now a DatetimeIndex, as desired:

In [27]:

df.index
Out[27]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2006-01-01, ..., 2006-01-05]
Length: 5, Freq: None, Timezone: None
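
If editing the file by hand is not convenient, the same cleanup can be done in code. The following is a minimal sketch, not the method used above: it assumes the file content matches the sample in the question (inlined here as the hypothetical string `raw`), reads the Y, M and D columns as ordinary columns, drops any all-NaN rows left by trailing blank lines, and then builds the index with `pandas.to_datetime`. It uses `sep=r"\s+"` in place of `delim_whitespace=True` and avoids `date_parser`, since newer pandas releases deprecate both.

```python
import io

import pandas as pd

# Hypothetical inline copy of the file from the question,
# including trailing blank lines like the ones that caused the problem.
raw = (
    "         OTH-000.opc\n"
    "              XKN1=    0.500000E-01\n"
    "    Y   M   D     PRCP     VWC1\n"
    " 2006   1   1      0.0  0.17608E+00\n"
    " 2006   1   2      6.0  0.21377E+00\n"
    " 2006   1   3      0.1  0.22291E+00\n"
    " 2006   1   4      3.0  0.23460E+00\n"
    " 2006   1   5      6.7  0.26076E+00\n"
    "\n"
    "   \n"
)

# Read the whitespace-delimited table, skipping the two preamble lines.
df = pd.read_csv(io.StringIO(raw), skiprows=2, sep=r"\s+")

# Drop rows that are entirely NaN (what trailing blank lines may turn into).
df = df.dropna(how="all")

# to_datetime can assemble dates from columns named year, month and day.
parts = df[["Y", "M", "D"]].astype(int).rename(
    columns={"Y": "year", "M": "month", "D": "day"}
)
df.index = pd.DatetimeIndex(pd.to_datetime(parts), name="datetime")

# Keep only the data columns.
df = df.drop(columns=["Y", "M", "D"])

print(df)
print(type(df.index))
```

Dropping the empty rows before building the dates means the result does not depend on whether a given pandas version skips whitespace-only lines on its own.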
