Finding rows with same column values in pandas dataframe
Problem description
I have two dataframes with a different number of columns, where four columns can have the same values in both dataframes. I want to make a new column in df1 that takes the value 1 if there is a row in df2 that has the same values for columns 'A', 'B', 'C', and 'D' as a row in df1. If there isn't such a row, I want the value to be 0. Columns 'E' and 'F' are not important for checking the values.
Is there a pandas function that can do this, or do I have to do this in a loop?
For example:
df1 =
A B C D E F
1 1 20 20 3 2
1 1 12 14 1 3
2 1 13 43 4 3
2 2 12 34 1 4
df2 =
A B C D E
1 3 12 14 2
1 1 20 20 4
2 2 21 31 5
2 2 12 34 8
Expected output:
df1 =
A B C D E F Target
1 1 20 20 3 2 1
1 1 12 14 1 3 0
2 1 13 43 4 3 0
2 2 12 34 1 4 1
Recommended answer
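This is fairly simple. If you check whether two DataFrames are equal with ==, pandas compares each element with the element at the same position in the other DataFrame. Note that this pairs rows by index: a row of df1 is flagged only when the row of df2 with the same index holds the same values, and both frames must share the same index. It does not search all of df2, so for the example data above it yields 0, 0, 0, 1 rather than the expected 1, 0, 0, 1.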
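col_list = ['A', 'B', 'C', 'D']
# Element-wise comparison: row i of df1 is compared only with row i of df2
idx = (df1.loc[:, col_list] == df2.loc[:, col_list]).all(axis=1)
# 1 where all four columns match at the same position, 0 otherwise
df1['new_row'] = idx.astype(int)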
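If the goal is to flag a row of df1 whenever a matching row exists anywhere in df2, as the expected output in the question suggests, one possible alternative is a left merge with indicator=True. The following is a minimal sketch, not part of the original answer: it rebuilds the example frames from the question, and the helper names keys and merged are chosen here for illustration only.

```python
import pandas as pd

# Example data copied from the question
df1 = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1, 1, 1, 2],
    "C": [20, 12, 13, 12],
    "D": [20, 14, 43, 34],
    "E": [3, 1, 4, 1],
    "F": [2, 3, 3, 4],
})
df2 = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [3, 1, 2, 2],
    "C": [12, 20, 21, 12],
    "D": [14, 20, 31, 34],
    "E": [2, 4, 5, 8],
})

col_list = ["A", "B", "C", "D"]

# Keep only the key columns of df2 and drop duplicate key combinations,
# so the left merge cannot produce more rows than df1 has
keys = df2[col_list].drop_duplicates()

# indicator=True adds a "_merge" column: "both" means the key was found in df2
merged = df1.merge(keys, on=col_list, how="left", indicator=True)

# to_numpy() assigns by position and sidesteps index alignment
df1["Target"] = (merged["_merge"] == "both").astype(int).to_numpy()

print(df1)
```

Dropping duplicates from the df2 keys before merging keeps the result the same length as df1, which is what makes the positional assignment safe.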