Pandas: How do I loop through and remove rows where a column has a single entry


Question

I have a pandas dataframe with a number of columns (below is the code for a simple dataframe, but the real dataframe has over 100 columns):

import pandas as pd

X = pd.DataFrame([["A", "Z"], ["A", "Z"], ["B", "Z"]], columns=["COL1", "COL2"])

What I want to do is go through every column and remove the rows whose value appears only once in that column. E.g., in COL1 there is only one instance of 'B', so I'd like to delete that row on that basis.

However, I would like to go through every single column of the dataframe and keep removing rows wherever such instances occur.

The following code works when I specify the column name:

X = X[X.groupby('COL1').COL1.transform(len) > 1]

However, I'm not sure how to loop over the columns with this code. I could manually enter every column name, but I want to do this properly. The following doesn't work:

for column in X:
    X[X.groupby(column).column.transform(len)>1]

  COL1 COL2
0    A    Z
1    A    Z
2    B    Z
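
A note on why the loop above fails, as far as I can tell: X.groupby(column).column looks up an attribute literally named "column" rather than using the loop variable, and the filtered result is never assigned back to X. A minimal sketch of the same groupby approach with bracket indexing and reassignment (the name group_sizes is only illustrative):

```python
import pandas as pd

X = pd.DataFrame([["A", "Z"], ["A", "Z"], ["B", "Z"]], columns=["COL1", "COL2"])

for column in X:
    # Bracket indexing selects the column named by the loop variable;
    # attribute access (.column) would look for a column called "column".
    group_sizes = X.groupby(column)[column].transform(len)
    # Reassign X so the rows dropped for this column stay dropped
    # when the next column is checked.
    X = X[group_sizes > 1]

print(X)
```

This keeps the one-liner from the question unchanged apart from the indexing and the assignment.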

Sorry if this has been asked before. I found a lot of similar questions, but none where the column isn't specified manually.

Thank you in advance! Please let me know if you need additional information.

Recommended answer

You can use duplicated like so:

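import pandas as pd

X = pd.DataFrame([["A", "Z"], ["A", "Z"], ["B", "Z"], ["A", "Y"]], columns=["COL1", "COL2"])

# For each column, keep only the rows whose value appears more than once in that column
for column in X:
    X = X[X[column].duplicated(keep=False)]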

Output:

  COL1 COL2
0    A    Z
1    A    Z
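
One caveat with a single pass: rows dropped because of a later column can leave a value in an earlier column with only one remaining occurrence. If "keep removing" is meant to continue until nothing is left to remove, a sketch along these lines may help. The helper name drop_singletons and the sample data are hypothetical and only for illustration:

```python
import pandas as pd


def drop_singletons(df):
    """Hypothetical helper: repeat the per-column filter until no rows are dropped."""
    while True:
        rows_before = len(df)
        for column in df:
            # Keep only rows whose value occurs more than once in this column.
            df = df[df[column].duplicated(keep=False)]
        if len(df) == rows_before:
            # A full pass removed nothing, so every remaining value is repeated.
            return df


X = pd.DataFrame(
    [["A", "Z"], ["A", "Y"], ["B", "Y"], ["B", "X"], ["C", "W"], ["C", "W"]],
    columns=["COL1", "COL2"],
)

# A single pass, as in the loop above, for comparison.
single_pass = X
for column in single_pass:
    single_pass = single_pass[single_pass[column].duplicated(keep=False)]

result = drop_singletons(X)
print(single_pass)  # still contains 'A' and 'B' once each in COL1
print(result)       # only the two 'C', 'W' rows remain
```

On the answer's own example a single pass is already enough, so the repeated version only matters when removals in one column create new single entries in another.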
