Error: code on pandas numerical column breaks with string formatting error
Question
I am reading in a table with pandas, and one of the columns has dates in the format YYYYMMDD. It is read in as a numerical column in all my attempts so far.
I could first digest it correctly (though slowly) with clunky code, but the current version hiccups in a way I don't understand.
So, this works:
treatments['month'] = treatments['INDATUMA'] % 10000
treatments['day'] = treatments['INDATUMA'] % 100
treatments['month'] = (treatments['month']-treatments['day'])/100
(The code above last ran on smaller data frames, whereas the current version runs on the concatenation of all of them. On smaller test data the new code still runs fine; it breaks only on the entire data.)
This breaks:
all_treatments['month'] = all_treatments.INDATUMA % 10000 // 100
This is the error message:
  File "treatments2_noiopro.py", line 92, in <module>
    all_treatments['month'] = all_treatments.INDATUMA % 10000 // 100
  File "/home/seidav/anaconda/lib/python2.7/site-packages/pandas/core/ops.py", line 532, in wrapper
    return left._constructor(wrap_results(na_op(lvalues, rvalues)),
  File "/home/seidav/anaconda/lib/python2.7/site-packages/pandas/core/ops.py", line 479, in na_op
    result[mask] = op(x[mask], y)
TypeError: not all arguments converted during string formatting
I am using pandas 0.16.2 (np19py26_0) and Python 2.7.10 (0) under Linux.
Recommended answer
I think the easiest way to do this is to use pandas' native datetime functionality on the final concatenated dataframe, e.g.
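import pandas

# Bare integers would be read as epoch offsets, so cast to str and state the layout explicitly
treatments['date'] = pandas.to_datetime(treatments['INDATUMA'].astype(str), format='%Y%m%d')
# Now you can split up the date easy as pie
treatments['year'] = treatments['date'].dt.year
treatments['month'] = treatments['date'].dt.month
treatments['day'] = treatments['date'].dt.day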
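As for the error itself: "not all arguments converted during string formatting" is what Python raises when `%` is applied to a str, which suggests that at least one value in the concatenated INDATUMA column is a string rather than a number. That is a guess from the traceback, not something confirmed by the data. The sketch below reproduces the situation with a small made-up frame (the sample values are hypothetical) and coerces the column with `pd.to_numeric`, which assumes a more recent pandas than 0.16.2, before doing the integer arithmetic.

```python
import pandas as pd

# Hypothetical stand-in for the concatenated frame: most values are
# integers, but one arrived as a string, so the column holds mixed objects.
all_treatments = pd.DataFrame({"INDATUMA": [20150131, 20141205, "20130317"]})

# On a str, % is the formatting operator; this is where the message comes from.
try:
    "20130317" % 10000
except TypeError as exc:
    message = str(exc)

# Coerce everything to numbers. Entries that cannot be parsed become NaN,
# so the offending rows can be inspected instead of crashing the script.
as_number = pd.to_numeric(all_treatments["INDATUMA"], errors="coerce")
bad_rows = all_treatments[as_number.isna()]

# With a clean integer column, the original arithmetic works again.
dates = as_number.dropna().astype("int64")
all_treatments["year"] = dates // 10000
all_treatments["month"] = dates % 10000 // 100
all_treatments["day"] = dates % 100

print(all_treatments)
```

If `bad_rows` is not empty on the real data, those rows are the ones to look at first: they show which input file produced non-numeric dates.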