to_datetime ValueError: at least that [year, month, day] must be specified (pandas)
Problem description
I am reading from two different CSVs, each with date values in its columns. After read_csv, I want to convert the data to datetime with the to_datetime method. The date formats in the two CSVs differ slightly, and although I account for the difference in the format argument of to_datetime, one converts fine while the other raises the following ValueError.
ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
The first, dte.head():
0 10/14/2016 10/17/2016 10/19/2016 8/9/2016 10/17/2016 7/20/2016
1 7/15/2016 7/18/2016 7/20/2016 6/7/2016 7/18/2016 4/19/2016
2 4/15/2016 4/14/2016 4/18/2016 3/15/2016 4/18/2016 1/14/2016
3 1/15/2016 1/19/2016 1/19/2016 10/19/2015 1/19/2016 10/13/2015
4 10/15/2015 10/14/2015 10/19/2015 7/23/2015 10/14/2015 7/15/2015
This dataframe converts fine using the following code:
dte = pd.to_datetime(dte, infer_datetime_format=True)
or
dte = pd.to_datetime(dte[x], format='%m/%d/%Y')
The second, dtd.head():
0 2004-01-02 2004-01-02 2004-01-09 2004-01-16 2004-01-23 2004-01-30
1 2004-01-05 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
2 2004-01-06 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
3 2004-01-07 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
4 2004-01-08 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
This CSV doesn't convert using either:
dtd = pd.to_datetime(dtd, infer_datetime_format=True)
or
dtd = pd.to_datetime(dtd, format='%Y-%m-%d')
It returns the ValueError above. Interestingly, however, passing parse_dates and infer_datetime_format as arguments to read_csv works fine. What is going on here?
Recommended answer
You can stack / pd.to_datetime / unstack:
pd.to_datetime(dte.stack()).unstack()
Explanation: pd.to_datetime works on a string, list, or pd.Series. dte is a pd.DataFrame, which is why you are having issues: given a DataFrame, pd.to_datetime tries to assemble dates from columns named year, month, and day, and raises this ValueError when they are missing. dte.stack() produces a pd.Series in which all rows are stacked on top of each other. Because the stacked form is a pd.Series, a vectorized pd.to_datetime can work on it. The subsequent unstack simply reverses the initial stack to restore the original form of dte.
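A minimal sketch of the behaviour described above, using a small made-up DataFrame in place of the CSV data (the column names "a" and "b" and the sample values are assumptions for illustration only). It reproduces the error, applies the stack / unstack fix, shows a column-by-column alternative via DataFrame.apply, and contrasts both with the year/month/day column layout that pd.to_datetime expects when it is handed a DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the second CSV: every column holds date strings.
dtd = pd.DataFrame({
    "a": ["2004-01-02", "2004-01-05"],
    "b": ["2004-01-09", "2004-01-16"],
})

# Passing the whole DataFrame makes to_datetime look for columns named
# year, month and day, so it raises a ValueError here.
try:
    pd.to_datetime(dtd, format="%Y-%m-%d")
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Fix 1 (the answer's approach): stack into one Series, convert, unstack.
converted = pd.to_datetime(dtd.stack(), format="%Y-%m-%d").unstack()

# Fix 2 (an alternative, not from the answer): convert column by column.
converted_by_column = dtd.apply(pd.to_datetime, format="%Y-%m-%d")

# By contrast, a DataFrame with year/month/day columns is assembled
# into one datetime per row.
parts = pd.DataFrame({"year": [2004, 2004], "month": [1, 1], "day": [2, 5]})
assembled = pd.to_datetime(parts)
```

Both fixes should give a DataFrame of the same shape as dtd with datetime columns. The stack version makes a single vectorized call, while the apply version makes one call per column, which may be easier to adapt if the columns use different formats.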