Pandas 1.1.0 apply function is altering the row in place
Problem description
I have a small DF (2 rows x 4 cols) and a function that adds an extra column, depending on some logic, once the apply is performed. With Pandas 0.24.2 I've been doing this as df.apply(func, axis=1) and I would get my extra column. So far, so good.
Now with Pandas 1.1.0 something weird happens: when I apply, the first row is processed twice, and the second row is not even considered.
I will show the original DF, the expected one, and the function. I added a print(row) so you can see how the first row of the DF is repeated in the process.
In [82]: df_attr_list
Out[82]:
      name attrName string_value dict_value
0  FW12611  HW type         None       ALU1
1  FW12612  HW type         None       ALU1
Now, the function, and its output ...
def setFinalValue(row):
    rtrName = row['name']
    attrName = row['attrName'].replace(" ","")
    dict_value = row['dict_value']
    string_value = row['string_value']
    finalValue = 'N/A'
    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"
    row['finalValue'] = finalValue
    print(row)
    return row
Now, the apply ...
In [83]: df_attr_list.apply(setFinalValue, axis=1)
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 0, dtype: object
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 1, dtype: object
Out[83]:
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1       ALU1
1  FW12611  HW type         None       ALU1       ALU1
As you can see, the extra column is added, but the first row of the original DF is processed twice, as if the second row didn't exist ...
Why is this happening?
I'm already trying this out with pandas 1.1.0...
In [86]: print(pd.__version__)
1.1.0
Thanks!
Recommended answer
- As per the Pandas 1.1.0 What's New doc (apply and applymap on DataFrame evaluates first row/column only once), .apply does not evaluate the first row twice.
- The issue is that the function modifies row in place and returns that same object, so the rows of the resulting dataframe are replaced.
  - This seems to be a result of BUG: DataFrame.apply with func altering row in-place #35633
  - Also see Backport PR #35633 on branch 1.1.x (BUG: DataFrame.apply with func altering row in-place) #35666
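If the function has to return the whole row, one way to avoid altering the row that apply passes in is to work on a copy of it. The following is only a minimal sketch using the data from the question; set_final_value is a trimmed, renamed version of setFinalValue, and the copy-first approach is an assumption about how to sidestep the in-place modification, not something taken from the linked issues.

```python
import pandas as pd

# Data from the question (two rows, four columns).
df = pd.DataFrame({
    "name": ["FW12611", "FW12612"],
    "attrName": ["HW type", "HW type"],
    "string_value": [None, None],
    "dict_value": ["ALU1", "ALU1"],
})

def set_final_value(row):
    # Work on a copy so the object handed in by apply is never mutated.
    row = row.copy()
    attr_name = row["attrName"].replace(" ", "")
    if attr_name in ["Val1", "Val2", "Val3"]:
        row["finalValue"] = row["dict_value"]
    elif attr_name in ["Val4", "Val5"]:
        row["finalValue"] = row["string_value"]
    else:
        row["finalValue"] = "N/A"
    return row

result = df.apply(set_final_value, axis=1)
print(result)
```

Each returned Series is then an independent object, so both original rows should show up in the result.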
import pandas as pd

data = {'name': ['FW12611', 'FW12612', 'FW12613'],
        'attrName': ['HW type', 'HW type', 'HW type'],
        'string_value': ['None', 'None', 'None'],
        'dict_value': ['ALU1', 'ALU1', 'ALU1']}
df = pd.DataFrame(data)

def setFinalValue(row):
    print(row)
    rtrName = row['name']
    attrName = row['attrName'].replace(" ","")
    dict_value = row['dict_value']
    string_value = row['string_value']
    finalValue = 'N/A'
    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"
    print('\n')
    return finalValue

# apply the function
df['finalValue'] = df.apply(setFinalValue, axis=1)

[out]:
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
Name: 0, dtype: object

name            FW12612
attrName        HW type
string_value       None
dict_value         ALU1
Name: 1, dtype: object

name            FW12613
attrName        HW type
string_value       None
dict_value         ALU1
Name: 2, dtype: object

# display(df)
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1        N/A
1  FW12612  HW type         None       ALU1        N/A
2  FW12613  HW type         None       ALU1        N/A
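Since the logic only looks at attrName, the same column could also be built without a row-wise apply at all. This is a rough sketch, not part of the original answer: it uses Series.isin and Series.mask, and the fourth row ('FW12614', 'Val 1', 'ALU2') is hypothetical data added only so that one branch other than "N/A" is exercised.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["FW12611", "FW12612", "FW12613", "FW12614"],
    "attrName": ["HW type", "HW type", "HW type", "Val 1"],  # last row is hypothetical
    "string_value": ["None", "None", "None", "None"],
    "dict_value": ["ALU1", "ALU1", "ALU1", "ALU2"],
})

# Same normalisation as in the function: drop spaces from attrName.
attr = df["attrName"].str.replace(" ", "", regex=False)

# Start from the default and overwrite where a condition holds.
final = pd.Series("N/A", index=df.index)
final = final.mask(attr.isin(["Val1", "Val2", "Val3"]), df["dict_value"])
final = final.mask(attr.isin(["Val4", "Val5"]), df["string_value"])

df["finalValue"] = final
print(df)
```

Because no row object is passed to a user function here, there is nothing to alter in place.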