to_datetime ValueError: at least that [year, month, day] must be specified (pandas)


Question

I am reading from two different CSVs, each with date values in its columns. After read_csv, I want to convert the data to datetime with the to_datetime method. The date formats in the two CSVs are slightly different, and although I account for the difference in the format argument of to_datetime, one converts fine while the other raises the following ValueError.

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing

The first, dte.head():

0  10/14/2016  10/17/2016  10/19/2016    8/9/2016  10/17/2016   7/20/2016
1   7/15/2016   7/18/2016   7/20/2016    6/7/2016   7/18/2016   4/19/2016
2   4/15/2016   4/14/2016   4/18/2016   3/15/2016   4/18/2016   1/14/2016
3   1/15/2016   1/19/2016   1/19/2016  10/19/2015   1/19/2016  10/13/2015
4  10/15/2015  10/14/2015  10/19/2015   7/23/2015  10/14/2015   7/15/2015

This DataFrame converts fine using the following code:

dte = pd.to_datetime(dte, infer_datetime_format=True)

dte = pd.to_datetime(dte[x], format='%m/%d/%Y')

The second, dtd.head():

0   2004-01-02 2004-01-02  2004-01-09 2004-01-16  2004-01-23  2004-01-30
1   2004-01-05 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
2   2004-01-06 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
3   2004-01-07 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
4   2004-01-08 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06

This CSV doesn't convert using either of the following:

dtd = pd.to_datetime(dtd, infer_datetime_format=True)

dtd = pd.to_datetime(dtd, format='%Y-%m-%d')

Both return the ValueError above. Interestingly, however, passing parse_dates and infer_datetime_format as arguments to read_csv works fine. What is going on here?

Recommended answer

You can stack / pd.to_datetime / unstack:

pd.to_datetime(dte.stack()).unstack()

Explanation
pd.to_datetime converts a string, list, or pd.Series element by element. dte is a pd.DataFrame, and when pd.to_datetime is given a DataFrame it instead tries to assemble dates from columns named year, month, and day, which is why you are getting this error. dte.stack() produces a pd.Series in which all rows are stacked on top of each other. Because the stacked form is a pd.Series, I can run a vectorized pd.to_datetime on it. The subsequent unstack simply reverses the initial stack to restore the original form of dte.
