Set difference for pandas

Views: 59 | Published: 2020/5/23 21:25:11 | Tags: python pandas dataframe

Problem description

A simple pandas question:

Is there a drop_duplicates() functionality to drop every row involved in the duplication?

An equivalent question is the following: Does pandas have a set difference for dataframes?

For example:

In [5]: df1 = pd.DataFrame({'col1':[1,2,3], 'col2':[2,3,4]})

In [6]: df2 = pd.DataFrame({'col1':[4,2,5], 'col2':[6,3,5]})

In [7]: df1
Out[7]: 
   col1  col2
0     1     2
1     2     3
2     3     4

In [8]: df2
Out[8]: 
   col1  col2
0     4     6
1     2     3
2     5     5

So maybe something like df2.set_diff(df1) would produce this:

   col1  col2
0     4     6
2     5     5
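To make the desired behavior concrete, here is a minimal sketch of one way it could be emulated with a left merge and the `indicator` flag. Note that `set_diff` is a hypothetical helper written for illustration, not an existing pandas method, and the sketch assumes both frames share the same column names.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 4]})
df2 = pd.DataFrame({'col1': [4, 2, 5], 'col2': [6, 3, 5]})


def set_diff(left, right):
    """Hypothetical helper: rows of `left` whose values do not occur in `right`.

    Rows are compared by column values only, so the indexes of the two
    frames play no role in the comparison.
    """
    cols = list(left.columns)
    # De-duplicate `right` so each row of `left` matches at most once,
    # which keeps the merged frame the same length and order as `left`.
    merged = left.merge(
        right[cols].drop_duplicates(), on=cols, how='left', indicator=True
    )
    # '_merge' is 'left_only' for rows that found no partner in `right`.
    mask = (merged['_merge'] == 'left_only').to_numpy()
    # Positional boolean mask, so the original index of `left` is preserved.
    return left[mask]


result = set_diff(df2, df1)
print(result)
```

Because the mask is applied positionally, the surviving rows keep the labels they had in `df2`, and giving `df1` a completely different index does not change the outcome.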

However, I don't want to rely on indexes because in my case, I have to deal with dataframes that have distinct indexes.

By the way, I initially thought about an extension of the current drop_duplicates() method, but now I realize that the second approach using properties of set theory would be far more useful in general. Both approaches solve my current problem, though.
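For the first approach, a rough sketch using `drop_duplicates(keep=False)`, which drops every row that has a duplicate rather than keeping one copy. The variable names are illustrative, and the sketch assumes neither frame contains repeated rows of its own, since those would be dropped as well.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 4]})
df2 = pd.DataFrame({'col1': [4, 2, 5], 'col2': [6, 3, 5]})

# Stack both frames and drop every row involved in a duplication.
# This yields the symmetric difference: rows found in exactly one frame.
combined = pd.concat([df1, df2], ignore_index=True)
either_only = combined.drop_duplicates(keep=False)

# Listing df1 twice guarantees each of its rows is duplicated and therefore
# dropped, leaving only the rows of df2 that do not occur in df1.
only_in_df2 = pd.concat([df2, df1, df1]).drop_duplicates(keep=False)

print(either_only)
print(only_in_df2)
```

Both variants compare rows by value only, so differing indexes on the two frames do not affect which rows are kept.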

Thanks!
