Creating a new column based on if-elif-else condition
Question
I have a dataframe df:
   A  B
a  2  2
b  3  1
c  1  3
I want to create a new column based on the following criteria:
if row A == B: 0
if row A > B: 1
if row A < B: -1
So given the above table, it should be:
   A  B  C
a  2  2  0
b  3  1  1
c  1  3 -1
For typical if-else cases I use np.where(df.A > df.B, 1, -1). Does pandas provide a special syntax for solving my problem in one step (without having to create 3 new columns and then combine the results)?
Recommended answer
To formalize some of the approaches laid out above:
Create a function that operates on the rows of your dataframe like so:
def f(row):
    if row['A'] == row['B']:
        val = 0
    elif row['A'] > row['B']:
        val = 1
    else:
        val = -1
    return val
Then apply it to your dataframe, passing in the axis=1 option:
In [1]: df['C'] = df.apply(f, axis=1)
In [2]: df
Out[2]:
   A  B  C
a  2  2  0
b  3  1  1
c  1  3 -1
Of course, this is not vectorized, so performance may not be as good when scaled to a large number of records. Still, I think it is much more readable, especially coming from a SAS background.
Edit
Here is the vectorized version:
import numpy as np

# Nested np.where: 0 where A == B, otherwise 1 where A > B, otherwise -1
df['C'] = np.where(
    df['A'] == df['B'], 0,
    np.where(df['A'] > df['B'], 1, -1))
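As the number of branches grows, nested np.where calls can become hard to read. One possible alternative, not part of the original answer, is np.select, which takes a list of conditions and a matching list of values. The sketch below is self-contained; the DataFrame is rebuilt from the table in the question (an assumption about how the data was created), and the column name C_sign is hypothetical, used only for comparison.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table (assumed construction).
df = pd.DataFrame({"A": [2, 3, 1], "B": [2, 1, 3]}, index=["a", "b", "c"])

# Conditions are checked in order; the first one that is True picks the value.
conditions = [df["A"] == df["B"], df["A"] > df["B"]]
choices = [0, 1]

# Rows matching no condition (here, A < B) receive the default value.
df["C"] = np.select(conditions, choices, default=-1)

# For purely numeric columns, the sign of the difference encodes the same
# three cases. "C_sign" is a hypothetical column name for illustration.
df["C_sign"] = np.sign(df["A"] - df["B"])

print(df)
```

The two columns agree on this data. They can differ when A or B contains NaN: comparisons against NaN are False, so np.select falls through to the default of -1, whereas np.sign returns NaN for that row.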