Pandas 1.1.0 apply function is altering the row in place

Views: 96 Published: 2020/9/6 5:49:27 Tags: python pandas dataframe apply

Question

I have a small DataFrame (2 rows x 4 columns) and a function that adds an extra column, depending on some logic, once the apply is performed. With Pandas 0.24.2 I've been doing this as df.apply(func, axis=1) and I would get my extra column. So far, so good.

Now with Pandas 1.1.0 something weird happens: when I apply, the first row is processed twice, and the second row is not even considered.

I will show the original DF, the expected one, and the function. I added a print(row) so you can see how the first row of the DF is repeated in the process.

In [82]: df_attr_list
Out[82]: 
      name attrName string_value dict_value
0  FW12611  HW type         None       ALU1
1  FW12612  HW type         None       ALU1

Now, the function, and its output ...

def setFinalValue(row):
    rtrName      = row['name']
    attrName     = row['attrName'].replace(" ","")
    dict_value   = row['dict_value']
    string_value = row['string_value']
    finalValue   = 'N/A'

    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"
    row['finalValue'] = finalValue

    print(row)
    
    return row

Now, the apply ...

In [83]: df_attr_list.apply(setFinalValue, axis=1)
name                       FW12611
attrName                   HW type
string_value                  None
dict_value                    ALU1
finalValue                    ALU1
Name: 0, dtype: object
name                       FW12611
attrName                   HW type
string_value                  None
dict_value                    ALU1
finalValue                    ALU1
Name: 1, dtype: object
Out[83]: 
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1       ALU1
1  FW12611  HW type         None       ALU1       ALU1

As you can see, the extra column is added, but the first row of the original DF is processed twice, as if the second didn't exist ...

Why is this happening?

I'm already trying this out with pandas 1.1.0...

In [86]: print(pd.__version__)
1.1.0

Thanks!

Recommended answer

As per the Pandas 1.1.0 What's New doc ("apply and applymap on DataFrame evaluates first row/column only once"), .apply no longer evaluates the first row twice.

The issue is, the dataframe is replaced when row is returned.

This seems to be a result of BUG: DataFrame.apply with func altering row in-place #35633

Also see Backport PR #35633 on branch 1.1.x (BUG: DataFrame.apply with func altering row in-place) #35666
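If the function needs to keep returning whole rows, one possible workaround is to copy the row before modifying it, so the object that apply passes in is never changed in place. The sketch below reuses the question's data; the name set_final_value and the row.copy() step are illustrative assumptions, not part of the original post or of the linked fix.

```python
import pandas as pd

# Data taken from the question (two rows, four columns).
df = pd.DataFrame({
    "name": ["FW12611", "FW12612"],
    "attrName": ["HW type", "HW type"],
    "string_value": [None, None],
    "dict_value": ["ALU1", "ALU1"],
})


def set_final_value(row):
    # Assumption: working on a private copy keeps the row object that
    # apply hands in untouched, so in-place changes cannot leak back.
    row = row.copy()
    attr_name = row["attrName"].replace(" ", "")

    if attr_name in ["Val1", "Val2", "Val3"]:
        row["finalValue"] = row["dict_value"]
    elif attr_name in ["Val4", "Val5"]:
        row["finalValue"] = row["string_value"]
    else:
        row["finalValue"] = "N/A"
    return row


result = df.apply(set_final_value, axis=1)
print(result)
```

The original df is left without the finalValue column; the new column only exists in result.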

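Returning only the computed value from the function and assigning it to a new column avoids modifying the row at all:

import pandas as pd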

data = {'name': ['FW12611', 'FW12612', 'FW12613'],
 'attrName': ['HW type', 'HW type', 'HW type'],
 'string_value': ['None', 'None', 'None'],
 'dict_value': ['ALU1', 'ALU1', 'ALU1']}

df = pd.DataFrame(data)


def setFinalValue(row):
    print(row)
    rtrName      = row['name']
    attrName     = row['attrName'].replace(" ","")
    dict_value   = row['dict_value']
    string_value = row['string_value']
    finalValue   = 'N/A'

    if attrName in ['Val1','Val2','Val3']:
        finalValue = dict_value
    elif attrName in ['Val4','Val5',]:
        finalValue = string_value
    else:
        finalValue = "N/A"

    print('\n')
    return finalValue


# apply the function
df['finalValue'] = df.apply(setFinalValue, axis=1)

[out]:
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
Name: 0, dtype: object


name            FW12612
attrName        HW type
string_value       None
dict_value         ALU1
Name: 1, dtype: object


name            FW12613
attrName        HW type
string_value       None
dict_value         ALU1
Name: 2, dtype: object

# display(df)
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1        N/A
1  FW12612  HW type         None       ALU1        N/A
2  FW12613  HW type         None       ALU1        N/A
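The same column can also be built without apply, which sidesteps row-wise evaluation entirely. The following is a minimal sketch using Series.isin and Series.mask; the sample attrName and value strings are hypothetical, chosen so that every branch of the logic is exercised.

```python
import pandas as pd

# Hypothetical sample data: one row per branch of the if/elif/else logic.
df = pd.DataFrame({
    "name": ["FW12611", "FW12612", "FW12613"],
    "attrName": ["Val 1", "Val 4", "HW type"],
    "string_value": ["s1", "s2", "s3"],
    "dict_value": ["d1", "d2", "d3"],
})

# Strip spaces from attrName, as the row-wise function does.
attr = df["attrName"].str.replace(" ", "", regex=False)

# Start from the default and overwrite where each condition holds.
final = pd.Series("N/A", index=df.index, dtype=object)
final = final.mask(attr.isin(["Val1", "Val2", "Val3"]), df["dict_value"])
final = final.mask(attr.isin(["Val4", "Val5"]), df["string_value"])

df["finalValue"] = final
print(df)
```

Because the two value lists do not overlap, the order of the two mask calls does not matter here.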
