Find indices of duplicate rows in pandas DataFrame
Question
What is the pandas way of finding the indices of identical rows within a given DataFrame without iterating over individual rows?

While it is possible to find all duplicated rows with unique = df[df.duplicated()], then iterate over those entries with unique.iterrows() and extract the indices of equal rows with the help of pd.where(), what is the idiomatic pandas way of doing it?
Example: Given a DataFrame of the following structure:
  | param_a | param_b | param_c
1 |       0 |       0 |       0
2 |       0 |       2 |       1
3 |       2 |       1 |       1
4 |       0 |       2 |       1
5 |       2 |       1 |       1
6 |       0 |       0 |       0
Output:
[(1, 6), (2, 4), (3, 5)]
Answer
Use duplicated with keep=False to mark all duplicate rows, then groupby on all columns and convert the index values of each group to a tuple; finally convert the resulting Series to a list:
df = df[df.duplicated(keep=False)]
df = df.groupby(list(df)).apply(lambda x: tuple(x.index)).tolist()
print (df)
[(1, 6), (2, 4), (3, 5)]
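Putting the pieces together, a minimal self-contained sketch (the DataFrame construction is an assumption matching the example table above):

```python
import pandas as pd

# Rebuild the example DataFrame from the question, including its 1-based index
df = pd.DataFrame({'param_a': [0, 0, 2, 0, 2, 0],
                   'param_b': [0, 2, 1, 2, 1, 0],
                   'param_c': [0, 1, 1, 1, 1, 0]},
                  index=[1, 2, 3, 4, 5, 6])

# keep=False marks every member of a duplicate set, not just the later ones
dupes = df[df.duplicated(keep=False)]

# Group by all columns; each group's index becomes one tuple of row labels
result = dupes.groupby(list(dupes)).apply(lambda x: tuple(x.index)).tolist()
print(result)  # [(1, 6), (2, 4), (3, 5)]
```

Note that list(dupes) is simply the list of column names, so the groupby runs over all columns at once.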
If you also want to see the duplicated values:
# df here is the filtered DataFrame of duplicate rows, before the .tolist() step
df1 = (df.groupby(df.columns.tolist())
         .apply(lambda x: tuple(x.index))
         .reset_index(name='idx'))
print (df1)
param_a param_b param_c idx
0 0 0 0 (1, 6)
1 0 2 1 (2, 4)
2 2 1 1 (3, 5)
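An equivalent sketch (my own variant, not from the original answer) avoids the apply step entirely by reading the groupby's groups mapping, which already associates each unique value combination with its row labels:

```python
import pandas as pd

# Same example DataFrame as above (an assumption matching the question's table)
df = pd.DataFrame({'param_a': [0, 0, 2, 0, 2, 0],
                   'param_b': [0, 2, 1, 2, 1, 0],
                   'param_c': [0, 1, 1, 1, 1, 0]},
                  index=[1, 2, 3, 4, 5, 6])

dupes = df[df.duplicated(keep=False)]

# .groups maps each (param_a, param_b, param_c) tuple to an Index of row labels
groups = dupes.groupby(list(dupes)).groups
result = [tuple(idx) for idx in groups.values()]
print(result)  # [(1, 6), (2, 4), (3, 5)]
```

Because no Python-level function is applied per group, this variant can be slightly faster on large frames, at the cost of losing the tidy DataFrame output of the df1 version.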