Apply a function on a column of a dataframe depending on the value of another column and then groupby

Views: 79  Published: 2020/10/17 2:18:16  Tags: python-3.x pandas dataframe lambda


Question

Assume a dataframe with columns 'A' and 'condition', as reproduced by the code below.

example = pd.DataFrame({'A': range(10), 'condition': [0,1,0,1,2,0,1,2,2,1]})

I want to multiply the values in column 'A' by 2 wherever the value in column 'condition' (labelled B in the tables below) is 0 or 2. So I tried these:

example['A']=example['A'].apply(lambda x: x*2 \
             if example['condition']==0 or example['condition']==2)

example['A']=np.where(example.condition==0 or example.condition==2, \
             lambda x: x*2, example.A)

but neither of these produces the desired output, shown below:

    output:                 desired output:
    example                 example
       A  B                          A  B
    0  0  0                      0   0  0
    1  1  1                      1   1  1
    2  2  0                      2   4  0
    3  3  1                      3   3  1
    4  4  2                      4   8  2
    5  5  0                      5  10  0
    6  6  1                      6   6  1
    7  7  2                      7  14  2 
    8  8  2                      8  16  2 
    9  9  1                      9   9  1

Once I get the desired output, I want to groupby 'condition' and calculate the absolute sum of the 'A' values that are bigger than 2.5. I have this in mind, but if I do not get the desired output from above I am not sure whether it works.

group1=example.groupby([example[condition')['A'].\
       agg([ ('A sum' , lambda x : x[x>=2.5].abs(sum()) ])

Any suggestions?

Recommended answer

First we get all the rows where condition is 0 or 2. Then we multiply the A values of these rows by two, use query to keep only the rows where A >= 2.5, and sum A per condition with GroupBy.sum.

m = example['condition'].isin([0,2])
example['A'] = np.where(m, example['A'].mul(2), example['A'])
grpd = example.query('A.ge(2.5)').groupby('condition', as_index=False)['A'].sum()

Output

   condition   A
0          0  14
1          1  18
2          2  38
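
For reference, here is a self-contained sketch of the same steps, starting from the original example frame. It also shows why the attempts in the question fail: a comparison such as `example.condition == 0` is element-wise and returns a boolean Series, so two such masks have to be combined with `|` (Python's `or` raises a ValueError on a Series), and np.where expects values rather than a lambda for its second argument. The names `mask` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({"A": range(10),
                        "condition": [0, 1, 0, 1, 2, 0, 1, 2, 2, 1]})

# Element-wise masks are combined with `|`, not `or`.
mask = (example["condition"] == 0) | (example["condition"] == 2)
assert mask.equals(example["condition"].isin([0, 2]))  # same mask as isin

# np.where picks from two arrays of values; it does not call a function.
example["A"] = np.where(mask, example["A"] * 2, example["A"])

# Keep rows with A >= 2.5, then sum A within each condition.
result = (
    example[example["A"] >= 2.5]
    .groupby("condition", as_index=False)["A"]
    .sum()
)
print(result)
```

Run once on a fresh frame, the sums for conditions 0, 1 and 2 come out as 14, 18 and 38. Running the np.where line a second time on the same frame doubles the rows again, so the frame should be rebuilt before re-running.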

Details of GroupBy.sum:

First we use query to get all the rows where A >= 2.5:

example.query('A.ge(2.5)')

    A  condition
2   4          0
3   3          1
4   8          2
5  10          0
6   6          1
7  14          2
8  16          2
9   9          1
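
The same filter can be written in a few interchangeable ways. The sketch below rebuilds the doubled frame by hand (the name `df` is illustrative) and checks that boolean indexing, a comparison operator inside query, and Series.ge all select the same rows.

```python
import pandas as pd

# The frame after doubling, as in the desired output of the question.
df = pd.DataFrame({"A": [0, 1, 4, 3, 8, 10, 6, 14, 16, 9],
                   "condition": [0, 1, 0, 1, 2, 0, 1, 2, 2, 1]})

by_mask = df[df["A"] >= 2.5]       # boolean indexing
by_query = df.query("A >= 2.5")    # comparison operator inside query
by_loc = df.loc[df["A"].ge(2.5)]   # Series.ge is the method form of >=

# All three select the same rows with the same index labels.
assert by_mask.equals(by_query) and by_mask.equals(by_loc)
print(by_mask.index.tolist())
```

Which form to use is mostly a matter of taste; boolean indexing avoids parsing a string expression.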

Then we group by condition to get one group per unique value, in this case the rows with condition 0, 1 and 2:

for _, d in example.query('A.ge(2.5)').groupby('condition', as_index=False):
    print(d, '\n')

    A  condition
2   4          0
5  10          0 

   A  condition
3  3          1
6  6          1
9  9          1 

    A  condition
4   8          2
7  14          2
8  16          2

So once we have the separate groups, we can use the .sum method to sum the A column of each one:

for _, d in example.query('A.ge(2.5)').groupby('condition', as_index=False):
    print(d['A'].sum(), '\n')

14 

18 

38
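
The question asked for an absolute sum computed inside agg. A possible runnable version of that idea is sketched below; the helper name `abs_sum_over_threshold` and the column label 'A sum' are illustrative, and the frame is the doubled one from the desired output. Because the filter happens inside each group, a condition with no values at or above the threshold would still appear in the result with a sum of 0, whereas filtering before groupby drops it.

```python
import pandas as pd

# The frame after doubling, as in the desired output of the question.
df = pd.DataFrame({"A": [0, 1, 4, 3, 8, 10, 6, 14, 16, 9],
                   "condition": [0, 1, 0, 1, 2, 0, 1, 2, 2, 1]})

def abs_sum_over_threshold(s, threshold=2.5):
    """Sum of the absolute values of the entries that are >= threshold."""
    return s[s >= threshold].abs().sum()

group1 = (
    df.groupby("condition")["A"]
    .agg(abs_sum_over_threshold)  # called once per group with that group's A values
    .rename("A sum")
    .reset_index()
)
print(group1)
```

Since every A value here is non-negative, .abs() does not change the result; it only matters if the column can contain negative numbers.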
