How do I drop duplicates and keep the first value in pandas?
Question
I want to drop duplicates and keep the first value. The duplicates to be dropped are the consecutive rows where A == 'df'. Here's my data:
A B C D E
qw 1 3 1 1
er 2 4 2 6
ew 4 8 44 4
df 34 34 34 34
df 2 5 2 2
df 3 3 7 3
df 4 4 7 4
we 2 5 5 2
we 4 4 4 4
df 34 9 34 34
df 3 3 9 3
we 4 7 4 4
qw 2 2 7 2
The expected result is:
A B C D E
qw 1 3 1 1
er 2 4 2 6
ew 4 8 44 4
**df** 34 34 34 34
we 2 5 5 2
we 4 4 4 4
**df** 34 9 34 34
we 4 7 4 4
qw 2 2 7 2
Recommended answer
Create a helper Series that distinguishes runs of consecutive values in column A. Then filter with boolean indexing, using an inverted (~) boolean mask that combines duplicated on the helper Series with a second mask comparing column A to the value 'df':
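# Give each run of consecutive equal values in column A its own id
s = df['A'].ne(df['A'].shift()).cumsum()
# Drop rows where A is 'df' and the row is not the first of its run
df = df[~((df['A'] == 'df') & s.duplicated())]
print(df)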
A B C D E
0 qw 1 3 1 1
1 er 2 4 2 6
2 ew 4 8 44 4
3 df 34 34 34 34
7 we 2 5 5 2
8 we 4 4 4 4
9 df 34 9 34 34
11 we 4 7 4 4
12 qw 2 2 7 2
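The answer's snippet assumes df already exists. As a minimal, self-contained sketch, the version below rebuilds the sample data from the question and wraps the same technique in a function. The function name keep_first_of_each_run and its parameters are hypothetical, chosen only for illustration. The final comparison against the previous row is an alternative added here as a cross-check, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "A": ["qw", "er", "ew", "df", "df", "df", "df", "we", "we", "df", "df", "we", "qw"],
        "B": [1, 2, 4, 34, 2, 3, 4, 2, 4, 34, 3, 4, 2],
        "C": [3, 4, 8, 34, 5, 3, 4, 5, 4, 9, 3, 7, 2],
        "D": [1, 2, 44, 34, 2, 7, 7, 5, 4, 34, 9, 4, 7],
        "E": [1, 6, 4, 34, 2, 3, 4, 2, 4, 34, 3, 4, 2],
    }
)


def keep_first_of_each_run(frame, column, value):
    """Hypothetical helper: within each run of consecutive rows where
    frame[column] == value, keep only the first row of the run."""
    # The run id increases by one whenever the value differs from the previous row.
    run_id = frame[column].ne(frame[column].shift()).cumsum()
    # duplicated() is True for every row of a run except the first one.
    repeated_in_run = run_id.duplicated()
    # Drop only the repeated rows whose value matches; other runs stay untouched.
    return frame[~((frame[column] == value) & repeated_in_run)]


result = keep_first_of_each_run(df, "A", "df")
print(result)

# Cross-check (an assumption of this sketch, not from the answer): a 'df' row
# is the first of its run exactly when the previous row holds a different value.
alternative = df[(df["A"] != "df") | (df["A"] != df["A"].shift())]
print(alternative.equals(result))
```

The rows that remain keep their original index labels (0, 1, 2, 3, 7, 8, 9, 11, 12). If a fresh 0..n-1 index is wanted, calling reset_index(drop=True) on the result would be one way to get it.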