Remove duplicate values from entire dataframe

Views: 71 Published: 2020/5/24 2:23:08 Tags: python pandas


Problem description

I have a Pandas DataFrame as follows:

import pandas as pd
data = pd.DataFrame({'A':[1,2,3,1,23,3,76,2,45,76],'B':[12,56,22,45,1,3,98,79,77,67]})

To remove duplicate values from the dataframe, I have done this:

set(data['A'].unique()).union(set(data['B'].unique()))

Result:

set([1, 2, 3, 12, 76, 77, 79, 67, 22, 23, 98, 45, 56])

Is there a better way of doing this? Is there a way of achieving this by using drop_duplicates?

Also, what if I had two more columns, 'C' and 'D', but needed to drop duplicates only from 'A' and 'B'?

Recommended answer

If you intend to collapse the whole DataFrame into its unique values:

In [10]: np.unique(data.values.ravel())
Out[10]: array([ 1,  2,  3, 12, 22, 23, 45, 56, 67, 76, 77, 79, 98])
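The transcript above can be reproduced as a standalone script. This is a minimal sketch, assuming pandas 0.24 or later (for `to_numpy()`); the variable names `unique_sorted`, `unique_in_order`, and `expected` are illustrative and not part of the original answer. It also shows `pd.unique` as an alternative for cases where first-appearance order matters more than sorted order.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, 3, 1, 23, 3, 76, 2, 45, 76],
    'B': [12, 56, 22, 45, 1, 3, 98, 79, 77, 67],
})

# Flatten all columns into one 1-D array (row by row),
# then keep the sorted unique values.
flat = data.to_numpy().ravel()
unique_sorted = np.unique(flat)

# pd.unique does not sort; it keeps values in order of first appearance.
unique_in_order = pd.unique(flat)

# Both results contain the same values as the set-union approach in the question.
expected = set(data['A'].unique()).union(set(data['B'].unique()))
assert set(unique_sorted) == expected
assert set(unique_in_order) == expected

print(unique_sorted)
```

Note that `np.unique` returns a plain NumPy array, so the row and column labels of the original DataFrame are lost.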

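This also works. unstack() stacks the columns into a single Series indexed by (column, row), and drop_duplicates() keeps the first occurrence of each value: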

In [12]: data.unstack().drop_duplicates()
Out[12]: 
A  0     1
   1     2
   2     3
   4    23
   6    76
   8    45
B  0    12
   1    56
   2    22
   6    98
   7    79
   8    77
   9    67
dtype: int64
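For the follow-up in the question (extra columns 'C' and 'D', with duplicates dropped only across 'A' and 'B'), one option is to select the two columns before applying the same technique. The sketch below is an illustration rather than part of the original answer: the contents of 'C' and 'D' are made up, and the names `subset`, `deduped`, and `unique_values` are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, 3, 1, 23, 3, 76, 2, 45, 76],
    'B': [12, 56, 22, 45, 1, 3, 98, 79, 77, 67],
    'C': list('abcdefghij'),  # hypothetical extra column
    'D': [0.5] * 10,          # hypothetical extra column
})

# Restrict to the columns of interest; 'C' and 'D' are left out entirely.
subset = data[['A', 'B']]

# Stack 'A' and 'B' into one Series indexed by (column, row),
# then keep the first occurrence of each value.
deduped = subset.unstack().drop_duplicates()

# Plain sorted list of the surviving values.
unique_values = sorted(deduped.tolist())

print(deduped)
print(unique_values)
```

Because the result keeps its (column, row) index, it is possible to see where each surviving value first appeared, which the np.unique approach does not preserve.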
