Pandas find Duplicates in cross values

Views: 42  Published: 2020/5/24 1:01:23  Tags: python, pandas, duplicates


Question

I have a DataFrame and want to eliminate duplicate rows that contain the same values, but in different columns:

import pandas as pd

# Build an empty 3x4 frame, then fill it row by row
df = pd.DataFrame(columns=['a','b','c','d'], index=['1','2','3'])
df.loc['1'] = pd.Series({'a':'x','b':'y','c':'e','d':'f'})
df.loc['2'] = pd.Series({'a':'e','b':'f','c':'x','d':'y'})
df.loc['3'] = pd.Series({'a':'w','b':'v','c':'s','d':'t'})

df
Out[8]: 
   a  b  c  d
1  x  y  e  f
2  e  f  x  y
3  w  v  s  t

Rows 1 and 2 both contain the values {x, y, e, f}, but arranged crosswise: if you exchanged columns c, d with a, b in row 2, it would be a duplicate of row 1. I want to drop such rows and keep only one of them, to get the final output:

df_new
Out[20]: 
   a  b  c  d
1  x  y  e  f
3  w  v  s  t

How can I achieve that efficiently?

Recommended answer

I think you need to filter by boolean indexing, with a mask created by sorting the values of each row with numpy.sort and then calling duplicated; to invert the mask, use ~:

import numpy as np

df = df[~pd.DataFrame(np.sort(df, axis=1), index=df.index).duplicated()]
print(df)
   a  b  c  d
1  x  y  e  f
3  w  v  s  t
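
For reuse, the one-liner above can be wrapped in a small helper. This is only a minimal sketch: the function name drop_cross_duplicates is hypothetical (it is not part of pandas), and it assumes that all cells in a row are mutually comparable (for example, all strings) and that there are no missing values.

```python
import numpy as np
import pandas as pd


def drop_cross_duplicates(frame):
    """Keep the first row of each group of rows holding the same values in any column order.

    Hypothetical helper for illustration, not part of pandas.
    """
    # Sort the values inside each row so that column order no longer matters.
    sorted_rows = pd.DataFrame(np.sort(frame.to_numpy(), axis=1), index=frame.index)
    # duplicated() flags every repeat after the first occurrence; ~ inverts the mask.
    return frame[~sorted_rows.duplicated()]


df = pd.DataFrame(
    [['x', 'y', 'e', 'f'], ['e', 'f', 'x', 'y'], ['w', 'v', 's', 't']],
    columns=['a', 'b', 'c', 'd'],
    index=['1', '2', '3'],
)
result = drop_cross_duplicates(df)
print(result)
```

The helper returns a filtered copy and leaves the input frame untouched, so the original three rows stay available for inspection.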

Details (each step evaluated on the original three-row df, before the filter is applied):

print(np.sort(df, axis=1))
[['e' 'f' 'x' 'y']
 ['e' 'f' 'x' 'y']
 ['s' 't' 'v' 'w']]

print(pd.DataFrame(np.sort(df, axis=1), index=df.index))
   0  1  2  3
1  e  f  x  y
2  e  f  x  y
3  s  t  v  w

print(pd.DataFrame(np.sort(df, axis=1), index=df.index).duplicated())
1    False
2     True
3    False
dtype: bool

print(~pd.DataFrame(np.sort(df, axis=1), index=df.index).duplicated())
1     True
2    False
3     True
dtype: bool
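
Note that sorting each row treats any reordering of the same values as a duplicate, not only the exchange of columns a, b with c, d. If only that pair swap should count, one possible sketch is to compare the two column pairs as an unordered set. The helper name drop_pair_swapped, the default column pairing, and the extra row '4' are assumptions made for illustration.

```python
import pandas as pd


def drop_pair_swapped(frame, left=('a', 'b'), right=('c', 'd')):
    """Drop rows that repeat an earlier row, either as-is or with the two column pairs exchanged.

    Hypothetical helper; the column pairing is an assumption taken from the question.
    """
    left_rows = frame[list(left)].itertuples(index=False, name=None)
    right_rows = frame[list(right)].itertuples(index=False, name=None)
    seen = set()
    keep = []
    for pair_l, pair_r in zip(left_rows, right_rows):
        # An unordered set of the two pairs is identical for (ab, cd) and (cd, ab).
        key = frozenset((pair_l, pair_r))
        keep.append(key not in seen)
        seen.add(key)
    return frame.loc[keep]


df = pd.DataFrame(
    [
        ['x', 'y', 'e', 'f'],
        ['e', 'f', 'x', 'y'],  # row 1 with the pairs exchanged
        ['w', 'v', 's', 't'],
        ['y', 'x', 'f', 'e'],  # same values as row 1, but not a pair swap
    ],
    columns=['a', 'b', 'c', 'd'],
    index=['1', '2', '3', '4'],
)
strict = drop_pair_swapped(df)
print(strict)
```

With this stricter key, row '4' is kept, whereas the sort-based mask would drop it. The loop runs in Python, so for large frames the vectorized numpy.sort approach is likely to be faster.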
