Remove duplicate rows in pandas dataframe based on condition

Published: 2020/5/24 2:14:14 | Tags: python, pandas


Problem description

            is_avail   valu data_source
2015-08-07     False  0.282    source_a
2015-08-07     False  0.582    source_b
2015-08-23     False  0.296    source_a
2015-09-08     False  0.433    source_a
2015-10-01      True  0.169    source_b

In the dataframe above, I want to remove the duplicate rows (i.e. rows where the index is repeated), retaining the row with the higher value in the valu column.

I can remove rows with duplicate indexes like this:

df = df[~df.index.duplicated()]

But how do I remove them based on the condition specified above?
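To make the gap concrete, here is a minimal sketch of what the index-only approach does. It assumes a cut-down frame with only the valu column and plain string dates as the index, which is a simplification of the question's data. By default, `Index.duplicated()` marks every occurrence after the first, so the row that survives is whichever one happens to come first, not the one with the higher valu.

```python
import pandas as pd

# Simplified stand-in for the question's frame (assumption: string index, one column).
df = pd.DataFrame(
    {"valu": [0.282, 0.582, 0.296]},
    index=["2015-08-07", "2015-08-07", "2015-08-23"],
)

# duplicated() uses keep="first" by default: later repeats are flagged True.
mask = df.index.duplicated()

# Dropping the flagged rows keeps the first 2015-08-07 row (valu 0.282),
# even though the second one has the higher valu.
kept = df[~mask]
print(kept)
```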

Recommended answer

You can use groupby on the index after sorting the df by valu in descending order, so that the first row in each group is the one with the highest valu.

df.sort_values(by='valu', ascending=False).groupby(level=0).first()
Out[1277]: 
           is_avail   valu data_source
2015-08-07    False  0.582    source_b
2015-08-23    False  0.296    source_a
2015-09-08    False  0.433    source_a
2015-10-01     True  0.169    source_b
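For anyone who wants to run this end to end, the following is a self-contained sketch. The frame is rebuilt by hand from the table in the question, with the dates kept as plain strings (an assumption; the original index may well be datetime). The second variant, which reuses the `duplicated()` mask from the question after sorting, is not part of the answer and is offered only as an alternative.

```python
import pandas as pd

# Rebuild the example frame from the question (assumption: string dates as index).
df = pd.DataFrame(
    {
        "is_avail": [False, False, False, False, True],
        "valu": [0.282, 0.582, 0.296, 0.433, 0.169],
        "data_source": ["source_a", "source_b", "source_a", "source_a", "source_b"],
    },
    index=["2015-08-07", "2015-08-07", "2015-08-23", "2015-09-08", "2015-10-01"],
)

# Approach from the answer: sort so the highest valu comes first,
# then take the first entry per index label.
by_group = df.sort_values(by="valu", ascending=False).groupby(level=0).first()

# Alternative sketch: sort the same way, keep only the first occurrence
# of each index label, then restore the original index order.
ordered = df.sort_values(by="valu", ascending=False)
by_mask = ordered[~ordered.index.duplicated(keep="first")].sort_index()

print(by_group)
print(by_mask)
```

One difference worth knowing: `GroupBy.first()` returns the first non-null value of each column within a group, so if the data contains missing values the result can combine values from different rows. The mask-based variant always keeps whole rows, which may be the safer choice when NaNs are possible.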
