Pandas mathematical operation, conditional on column value


Problem description

I need to perform a mathematical operation that is conditional on the value in a second column. Here is the setup.

Given a simple dataframe (df):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
    'col3': [0, 1, 9, 4, 2, 3],
    })

In [11]: df
Out[11]: 
  col1  col2  col3
0    A     2     0
1    A     1     1
2    B     9     9
3  NaN     8     4
4    D     7     2
5    C     4     3

I can add a new column (math) and fill it with a mathematical expression, here the sum of 10 and col3.

df['math'] = 10 + df['col3']

In [14]: df
Out[14]: 
  col1  col2  col3  math
0    A     2     0    10
1    A     1     1    11
2    B     9     9    19
3  NaN     8     4    14
4    D     7     2    12
5    C     4     3    13

What I can't figure out is how to make the expression conditional on the value in another column (e.g., only if col1 == B). The desired output would be:

In [14]: df
Out[14]: 
  col1  col2  col3  math
0    A     2     0   NaN
1    A     1     1   NaN
2    B     9     9    19
3  NaN     8     4   NaN
4    D     7     2   NaN
5    C     4     3   NaN

For added clarification, I will be using a variable for the col1 value in a for loop. As a result, I couldn't get .groupby() to work as described here or here. I think I'm looking for something like this:

df['math'] = 10 + df.loc[[df['col1'] == my_var], 'col3']

I got this from a comment on the second example above, but I can't get it to work. It throws a ValueError for too many values; that is, I'm passing both the filter and the column of operation together, but it only expects the filter. This SO post also uses .loc in a way similar to my expression above, but with a static col1 value.

Recommended answer

Using loc

df['math'] = df.loc[df.col1.eq('B'), 'col3'].add(10)

  col1  col2  col3  math
0    A     2     0   NaN
1    A     1     1   NaN
2    B     9     9  19.0
3  NaN     8     4   NaN
4    D     7     2   NaN
5    C     4     3   NaN
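
Since the question asks for a variable in place of the literal 'B', here is a minimal sketch of how the same idea might look with one. The name my_var comes from the question; the column names math_where and math_<value> are made up for illustration. The extra pair of square brackets in the question's attempt wraps the boolean Series in a list, which is probably what triggers the error; passing the Series itself as the row indexer avoids that.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
    'col3': [0, 1, 9, 4, 2, 3],
})

my_var = 'B'  # stands in for the loop variable mentioned in the question

# Pass the boolean Series directly as the row indexer (no extra list brackets).
mask = df['col1'] == my_var
# Only the matching rows are returned; index alignment on assignment
# leaves NaN in every other row.
df['math'] = 10 + df.loc[mask, 'col3']

# Alternative: compute for every row, then blank out rows where the mask is False.
df['math_where'] = (10 + df['col3']).where(mask)

# Inside a loop, one result column per value (illustrative column names).
for value in ['A', 'B']:
    df[f'math_{value}'] = (10 + df['col3']).where(df['col1'] == value)
```

Both forms yield a float column, because NaN forces the integer values to be upcast. Rows where col1 is NaN compare as False, so they stay NaN in the result.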
