Get rows that have the same value across its columns in pandas


Problem description


In pandas, given a DataFrame D:

+-----+--------+--------+--------+   
|     |    1   |    2   |    3   |
+-----+--------+--------+--------+
|  0  | apple  | banana | banana |
|  1  | orange | orange | orange |
|  2  | banana | apple  | orange |
|  3  | NaN    | NaN    | NaN    |
|  4  | apple  | apple  | apple  |
+-----+--------+--------+--------+

How do I return only the rows whose contents are identical across all columns, when there are three or more columns, so that the result is this:

+-----+--------+--------+--------+   
|     |    1   |    2   |    3   |
+-----+--------+--------+--------+
|  1  | orange | orange | orange |
|  4  | apple  | apple  | apple  |
+-----+--------+--------+--------+

Note that rows in which all values are NaN are skipped.

If there were only two columns, I would usually do D[D[1]==D[2]], but I don't know how to generalize this to DataFrames with more than two columns.
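For reference, the two-column filter mentioned above can be reproduced as follows. This is a minimal sketch that rebuilds the example table by hand; the integer column labels 1, 2, 3 are taken from the table, and the use of `np.nan` for the missing values is an assumption.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame D from the question.
D = pd.DataFrame(
    {
        1: ["apple", "orange", "banana", np.nan, "apple"],
        2: ["banana", "orange", "apple", np.nan, "apple"],
        3: ["banana", "orange", "orange", np.nan, "apple"],
    }
)

# Two-column case: keep rows where column 1 equals column 2.
# NaN never compares equal to NaN, so the all-NaN row is dropped.
two_col = D[D[1] == D[2]]
print(two_col)
```

This comparison only looks at columns 1 and 2 and ignores column 3 entirely, which is why it does not generalize as written.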

Solution

Similar to Andy Hayden's answer, check whether the minimum of each row equals its maximum (if so, all elements in the row are identical):

df[df.apply(lambda row: min(row) == max(row), axis=1)]
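As a possible alternative that avoids a Python-level function call per row, every column can be compared against the first one. The sketch below is not part of the original answer; it rebuilds the example data by hand (using `np.nan` for the missing values is an assumption) and relies on `DataFrame.eq` with `axis=0` to align the first column on the row index.

```python
import numpy as np
import pandas as pd

# Example data from the question, rebuilt by hand.
df = pd.DataFrame(
    {
        1: ["apple", "orange", "banana", np.nan, "apple"],
        2: ["banana", "orange", "apple", np.nan, "apple"],
        3: ["banana", "orange", "orange", np.nan, "apple"],
    }
)

# Compare each column with the first column, row by row.
# NaN == NaN evaluates to False, so the all-NaN row is excluded.
same = df.eq(df.iloc[:, 0], axis=0).all(axis=1)
result = df[same]
print(result)
```

Unlike the min/max check, this version does not need to order the values, so it should also cope with rows that mix strings and NaN, where `min` would raise a `TypeError`.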
