Pandas for loop over dataframe gives too many values to unpack
Question
I don't see why this code isn't working. I am trying to iterate over a DataFrame in a for loop; in this case it has only one row. There are only two columns, and I have two loop variables to receive them. What am I missing?
print("process_list = ",process_list)
for row in process_list.itertuples():
    print("row = ", row)
df_to_date = pd.DataFrame()
try:
    print("process_list = {} and it's type {} process_list.itertuples() {} ".format(process_list, type(process_list),process_list.itertuples() ) )
    for file_date , file_name in process_list.itertuples(): # a whole batch of days
        file_to_process = dev_env + file_name
        print("PROCESSING BATCH: ",file_to_process)
        df = pd.read_csv(file_to_process, header=None,skiprows=22, sep=',', comment='*', converters = {"Days" : just_number,"Percentile" : just_number,"Date" : just_number} ,names = column_names )
        df.insert(0,'File_date',file_date)
        df_to_date = df_to_date.append(df)
except Exception as e:
    print ("nothing to process exception = ",e)
    sys.exit(0)
When I run it, I get:
process_list = File_date File_name
94 20180507 mcmhv20180507.csv
row = Pandas(Index=94, File_date=20180507, File_name='mcmhv20180507.csv')
process_list = File_date File_name
94 20180507 mcmhv20180507.csv and it's type <class 'pandas.core.frame.DataFrame'> process_list.itertuples() <map object at 0x7f6339371e48>
nothing to process exception = too many values to unpack (expected 2)
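Recommended answer
By default, itertuples() yields the index as the first element of each tuple, so each row here carries three values (Index, File_date, File_name), not two. There are two options to account for this.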
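To see the failure in isolation, here is a minimal sketch. It assumes a hypothetical one-row frame built by hand to resemble the process_list printed in the question; the real frame comes from elsewhere in the asker's program.

```python
import pandas as pd

# Hypothetical one-row frame shaped like process_list in the question.
process_list = pd.DataFrame(
    {"File_date": [20180507], "File_name": ["mcmhv20180507.csv"]},
    index=[94],
)

# Each tuple has three fields because the index comes first.
first_row = next(process_list.itertuples())
print(len(first_row))  # 3: Index, File_date, File_name

# Unpacking three fields into two names raises ValueError.
try:
    for file_date, file_name in process_list.itertuples():
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Catching ValueError specifically, rather than a bare Exception as in the question, makes it easier to tell an unpacking mistake from a genuinely empty batch.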
Option 1
Unpack 3 items instead of 2, the first of which you do not use.
Here is a minimal example:
import pandas as pd

df = pd.DataFrame([[10, 20], [30, 40], [50, 60]],
                  columns=['A', 'B'])
for idx, a, b in df.itertuples():
    print(idx, a, b)
0 10 20
1 30 40
2 50 60
In your case, a good convention is to mark the unused variable with _:
for _, file_date, file_name in process_list[['File_date', 'File_name']].itertuples():
    pass  # do something
Option 2
Use the index=False argument and unpack 2 elements:
for file_date, file_name in process_list[['File_date', 'File_name']].itertuples(index=False):
    pass  # do something
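As a runnable sketch of Option 2, the same small three-row frame from Option 1 can be iterated with index=False, so that each tuple holds only the two column values. The list named pairs is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame([[10, 20], [30, 40], [50, 60]], columns=["A", "B"])

pairs = []
# index=False drops the index, so two names are enough.
for a, b in df.itertuples(index=False):
    pairs.append((a, b))
    print(a, b)
```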
This behavior is noted in the documentation:
DataFrame.itertuples(index=True, name='Pandas')
Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
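Because the rows are namedtuples, a third possibility (not one of the two options above) is to skip unpacking and read the fields by name. A minimal sketch, again assuming a hypothetical one-row frame shaped like the one in the question:

```python
import pandas as pd

# Hypothetical one-row frame shaped like process_list in the question.
process_list = pd.DataFrame(
    {"File_date": [20180507], "File_name": ["mcmhv20180507.csv"]},
    index=[94],
)

seen = []
for row in process_list.itertuples():
    # Fields are available as attributes; the index is exposed as row.Index.
    seen.append((row.Index, row.File_date, row.File_name))
    print(row.Index, row.File_date, row.File_name)
```

This works when the column names are valid Python identifiers, as File_date and File_name are.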