Pandas drop duplicates; values in reverse order

Views: 123 Published: 2017/7/21 1:54:08 Tags: python-2.7 pandas duplicates


Question

I'm trying to find a way to use pandas drop_duplicates() so that rows are recognized as duplicates when they hold the same values in reverse order.

For example, suppose I am trying to find transactions where customers purchase both apples and bananas, but the data collection order may have reversed the items. In other words, when treated as a complete order, the transaction counts as a duplicate because it is made up of the same items.

I want the following to be recognized as duplicates:

Item1   Item2
Apple   Banana
Banana  Apple
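
To make the problem reproducible, here is a minimal sketch that builds the two rows above and shows that a plain drop_duplicates() keeps both of them, since it compares values column by column. The variable names `df` and `plain` are my own choices for illustration.

```python
import pandas as pd

# The two transactions from the question, with the items in reversed order.
df = pd.DataFrame({"Item1": ["Apple", "Banana"],
                   "Item2": ["Banana", "Apple"]})

# drop_duplicates() compares Item1 with Item1 and Item2 with Item2,
# so these rows are not treated as duplicates of each other.
plain = df.drop_duplicates()
print(len(plain))
```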

Recommended answer

First sort the values within each row using apply with sorted, then call drop_duplicates:

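import pandas as pd

df = pd.DataFrame({'Item1': ['Apple', 'Banana'], 'Item2': ['Banana', 'Apple']})

# Sort the values within each row, then drop the rows that are now identical.
# result_type='broadcast' (pandas 0.23+) keeps the original columns; without it,
# newer pandas returns a Series of lists, which drop_duplicates cannot hash.
df = df.apply(sorted, axis=1, result_type='broadcast').drop_duplicates()
print(df)
   Item1   Item2
0  Apple  Banana

# If only specific columns should be compared
cols = ['Item1', 'Item2']
df[cols] = df[cols].apply(sorted, axis=1, result_type='broadcast')
df = df.drop_duplicates(subset=cols)
print(df)
   Item1   Item2
0  Apple  Banana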

Another solution uses numpy.sort and the DataFrame constructor:

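import numpy as np
import pandas as pd

# Sort each row's values with NumPy, rebuild the frame, then drop duplicates.
# The outer parentheses let the chained call continue on the next line.
df = (pd.DataFrame(np.sort(df.values, axis=1), index=df.index, columns=df.columns)
        .drop_duplicates())
print(df)
   Item1   Item2
0  Apple  Banana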
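
Both solutions above overwrite each row with its sorted values. If the original orientation of the surviving rows matters, one possible variation (my own sketch, not part of the original answer) is to use the row-sorted copy only as a comparison key and filter the untouched frame with it. The sample data and the names `key` and `result` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data extending the question's example.
df = pd.DataFrame({
    "Item1": ["Apple", "Banana", "Cherry", "Apple"],
    "Item2": ["Banana", "Apple", "Apple", "Cherry"],
})

# Row-wise sorted copy, used only as a comparison key.
key = pd.DataFrame(np.sort(df.to_numpy(dtype=object), axis=1), index=df.index)

# Keep the first occurrence of each unordered pair, values left as recorded.
result = df[~key.duplicated()]
print(result)
```

Here `key.duplicated()` flags every row whose sorted values were already seen, so negating it selects the first row of each unordered pair from the original frame.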
