Compare two columns using pandas 2

Problem description

I'm comparing two columns in a dataframe (A & B). I have a method that works (C5). It came from this question: Compare two columns using pandas

I wondered why I couldn't get the other methods (C1 - C4) to give the correct answer:

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

#df['C1'] = 1 [df['A'] == df['B']]

df['C2'] = df['A'].equals(df['B'])

df['C3'] = np.where((df['A'] == df['B']),0,1)

def fun(row):
    if ['A'] == ['B']:
        return 1
    else:
        return 0
df['C4'] = df.apply(fun, axis=1)

df['C5'] = df.apply(lambda x : 1 if x['A'] == x['B'] else 0, axis=1)

Solution

Use:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

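For C1 and C2, compare the columns element-wise with == or eq to get a boolean mask, then convert it to integers so that True and False become 1 and 0. Series.equals does not work here because it returns a single bool for the whole Series rather than one value per row: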

df['C1'] = (df['A'] == df['B']).astype(int)
df['C2'] = df['A'].eq(df['B']).astype(int)
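
To see why the original C2 attempt gave the same value in every row, a small sketch can contrast Series.equals with Series.eq on the same data. The variable names whole and elementwise are illustrative only and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

# equals() asks whether the two Series are identical as a whole: one bool
whole = df['A'].equals(df['B'])

# eq() compares element by element and returns a boolean Series
elementwise = df['A'].eq(df['B'])

print(whole)                              # False
print(elementwise.astype(int).tolist())   # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

Assigning the single bool from equals to a column broadcasts it to every row, which is why C2 did not reflect the per-row comparison.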

For C3, the order of 1 and 0 has to be swapped. np.where returns its second argument where the condition is True, so a match needs 1 in that position:

df['C3'] = np.where((df['A'] == df['B']),1,0)
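
As a quick illustration of the argument order, the sketch below runs np.where both ways on the same mask. The names match, flag_match and flag_mismatch are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

match = df['A'] == df['B']

# np.where(condition, x, y) picks x where condition is True, y elsewhere
flag_match = np.where(match, 1, 0)      # 1 marks rows where A equals B
flag_mismatch = np.where(match, 0, 1)   # order from the question: 1 marks a mismatch

print(flag_match.tolist())      # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(flag_mismatch.tolist())   # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```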

For C4, the function never selects values from the row: ['A'] == ['B'] compares two list literals and is always False. Index row instead:

def fun(row):
    if row['A'] == row['B']:
        return 1
    else:
        return 0
df['C4'] = df.apply(fun, axis=1)
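
The difference between the two versions of fun can be checked in isolation. In this sketch, literal_result and c4 are illustrative names; the point is that the list comparison never looks at the data.

```python
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

# Two one-element lists holding different strings: unequal regardless of df
literal_result = (['A'] == ['B'])

def fun(row):
    # row is a Series for one DataFrame row; index it to get the cell values
    return 1 if row['A'] == row['B'] else 0

c4 = df.apply(fun, axis=1)

print(literal_result)   # False
print(c4.tolist())      # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```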

The C5 solution is already correct:

df['C5'] = df.apply(lambda x : 1 if x['A'] == x['B'] else 0, axis=1)
print(df)
   A  B  C1  C2  C3  C4  C5
0  1  1   1   1   1   1   1
1  1  1   1   1   1   1   1
2  1  1   1   1   1   1   1
3  1  1   1   1   1   1   1
4  1  1   1   1   1   1   1
5  2  1   0   0   0   0   0
6  2  0   0   0   0   0   0
7  2  0   0   0   0   0   0
8  2  0   0   0   0   0   0
9  2  0   0   0   0   0   0
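
One way to confirm that the corrected approaches agree is to build each result separately and compare them against a reference. This is only a sketch: the dictionary candidates and the name expected are assumptions for illustration, and the row-wise apply variant stands in for both C4 and C5.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

# Reference result: vectorized comparison cast to int
expected = (df['A'] == df['B']).astype(int)

candidates = {
    'eq': df['A'].eq(df['B']).astype(int),
    'where': pd.Series(np.where(df['A'] == df['B'], 1, 0), index=df.index),
    'apply': df.apply(lambda x: 1 if x['A'] == x['B'] else 0, axis=1),
}

# Every candidate should match the reference in every row
all_same = all((col == expected).all() for col in candidates.values())
print(all_same)   # True
```

The vectorized forms (==, eq, np.where) avoid calling a Python function once per row, so they are usually the better choice on larger frames.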
