Pandas 1.1.0 apply function is altering the row in place
Question
I have a small DataFrame (2 rows x 4 columns) and a function that adds an extra column, depending on some logic, once the apply is performed. With Pandas 0.24.2 I've been doing this as df.apply(func, axis=1) and I would get my extra column. So far, so good.
Now with Pandas 1.1.0 something weird happens: when I call apply, the first row is processed twice, and the second row is not even considered.
I will show the original DF, the expected one, and the function. I added a print(row) so you can see how the first row of the DF is repeated in the process.
In [82]: df_attr_list
Out[82]:
      name attrName string_value dict_value
0  FW12611  HW type         None       ALU1
1  FW12612  HW type         None       ALU1
Now, the function and its output:
def setFinalValue(row):
    rtrName = row['name']
    attrName = row['attrName'].replace(" ","")
    dict_value = row['dict_value']
    string_value = row['string_value']
    finalValue = 'N/A'
    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"
    row['finalValue'] = finalValue
    print(row)
    return row
Now, the output after the apply:
In [83]: df_attr_list.apply(setFinalValue, axis=1)
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 0, dtype: object
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 1, dtype: object
Out[83]:
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1       ALU1
1  FW12611  HW type         None       ALU1       ALU1
As you can see, the extra column is added, but the first row of the original DF is processed twice, as if the second row didn't exist.
Why is this happening?
I'm trying this out with pandas 1.1.0:
In [86]: print(pd.__version__)
1.1.0
Thanks!
Recommended answer
- As per the Pandas 1.1.0 What's New doc (apply and applymap on DataFrame evaluates first row/column only once), .apply does not evaluate the first row twice.
- The issue is that the dataframe is replaced when row is returned.
- This seems to be a result of BUG: DataFrame.apply with func altering row in-place #35633
- Also see Backport PR #35633 on branch 1.1.x (BUG: DataFrame.apply with func altering row in-place) #35666
import pandas as pd

data = {'name': ['FW12611', 'FW12612', 'FW12613'],
        'attrName': ['HW type', 'HW type', 'HW type'],
        'string_value': ['None', 'None', 'None'],
        'dict_value': ['ALU1', 'ALU1', 'ALU1']}

df = pd.DataFrame(data)

def setFinalValue(row):
    print(row)
    rtrName = row['name']
    attrName = row['attrName'].replace(" ","")
    dict_value = row['dict_value']
    string_value = row['string_value']
    finalValue = 'N/A'
    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"
    print('\n')
    return finalValue

# apply the function
df['finalValue'] = df.apply(setFinalValue, axis=1)

[out]:
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
Name: 0, dtype: object

name            FW12612
attrName        HW type
string_value       None
dict_value         ALU1
Name: 1, dtype: object

name            FW12613
attrName        HW type
string_value       None
dict_value         ALU1
Name: 2, dtype: object

# display(df)
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1        N/A
1  FW12612  HW type         None       ALU1        N/A
2  FW12613  HW type         None       ALU1        N/A
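If the function has to keep returning the whole row, as in the question, one possible alternative is to copy the row before modifying it, so the Series that pandas passes in is never altered in place. This is only a sketch: the name set_final_value_copy is hypothetical, and the assumption that a copy sidesteps the 1.1.0 behaviour comes from the in-place mutation described in the linked bug, not from a test against that exact release.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["FW12611", "FW12612"],
    "attrName": ["HW type", "HW type"],
    "string_value": ["None", "None"],
    "dict_value": ["ALU1", "ALU1"],
})

def set_final_value_copy(row):
    # Hypothetical variant of setFinalValue: work on a copy so the
    # Series handed over by apply is left untouched.
    row = row.copy()
    attr_name = row["attrName"].replace(" ", "")
    if attr_name in ["Val1", "Val2", "Val3"]:
        row["finalValue"] = row["dict_value"]
    elif attr_name in ["Val4", "Val5"]:
        row["finalValue"] = row["string_value"]
    else:
        row["finalValue"] = "N/A"
    return row

# Each returned Series becomes one row of the resulting DataFrame.
result = df.apply(set_final_value_copy, axis=1)
print(result)
```

The original df is left without the finalValue column here; assign the result back (df = result) if the new column should replace the old frame.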