Delete duplicate values in a row of a pandas DataFrame

Published: 2020/5/24 1:06:59. Tags: python, pandas, numpy

Question

I have a pandas DataFrame:

>>import numpy as np
>>import pandas as pd
>>df_freq = pd.DataFrame([["Z11", "Z11", "X11"], ["Y11","",""], ["Z11","Z11",""]], columns=list('ABC'))

>>df_freq
    A   B   C
0   Z11 Z11 X11
1   Y11     
2   Z11 Z11

I want to make sure each row contains only unique values. Removed values can be replaced with zero or an empty string, so the result should look like this:

    A   B   C
0   Z11 0   X11
1   Y11     
2   Z11 0

My data frame is big, with hundreds of columns and thousands of rows. The goal is to count the unique values in that data frame. I do that by converting the data frame to a matrix and applying:

>>np.unique(mat.astype(str), return_counts=True)

But in some rows the same value occurs more than once, and I want to remove those repeats before applying np.unique(), so that each row keeps only its unique values.
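
To make the problem concrete, here is a minimal sketch of the counting step on the sample frame. It assumes that `mat` is simply `df_freq.to_numpy()`, which the question does not spell out, and `raw_counts` is only an illustrative name.

```python
import numpy as np
import pandas as pd

df_freq = pd.DataFrame(
    [["Z11", "Z11", "X11"], ["Y11", "", ""], ["Z11", "Z11", ""]],
    columns=list("ABC"),
)

# Assumption: `mat` in the question is the frame's underlying array.
mat = df_freq.to_numpy()

# Count every cell value across the whole matrix.
values, counts = np.unique(mat.astype(str), return_counts=True)
raw_counts = dict(zip(values.tolist(), counts.tolist()))
print(raw_counts)
```

On this sample, Z11 is counted four times even though it appears in only two rows, which is the inflation the question wants to avoid.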

Recommended answer

Combine astype(bool) with duplicated:

# Flag values that repeat earlier in the same row, but leave empty strings alone
mask = df_freq.apply(pd.Series.duplicated, axis=1) & df_freq.astype(bool)

df_freq.mask(mask, 0)

     A  B    C
0  Z11  0  X11
1  Y11        
2  Z11  0
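
Putting the answer together with the counting step from the question, the whole pipeline might look like the sketch below. The names `dup`, `non_empty`, `deduped`, and `row_counts` are illustrative, `dtype=object` is set explicitly so the columns can hold both strings and the integer placeholder, and dropping the "" and "0" entries from the final counts is an assumption about what the asker wants rather than something the answer states.

```python
import numpy as np
import pandas as pd

df_freq = pd.DataFrame(
    [["Z11", "Z11", "X11"], ["Y11", "", ""], ["Z11", "Z11", ""]],
    columns=list("ABC"),
    dtype=object,  # columns will hold both str values and the int placeholder
)

# True where a value repeats an earlier value in the same row.
dup = df_freq.apply(pd.Series.duplicated, axis=1)
# True where the cell is non-empty (the empty string is falsy).
non_empty = df_freq.astype(bool)

# Replace repeated, non-empty values with 0, as in the answer.
deduped = df_freq.mask(dup & non_empty, 0)

# Count the remaining values, ignoring the empty and placeholder cells.
values, counts = np.unique(deduped.to_numpy().astype(str), return_counts=True)
row_counts = {
    v: c
    for v, c in zip(values.tolist(), counts.tolist())
    if v not in ("", "0")
}
print(row_counts)
```

Each count is now the number of rows in which the value appears at least once. If "0" can also occur as a real value in the data, a different placeholder (or an empty string) would be a safer choice.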
