pandas GroupBy columns with NaN (missing) values

Views: 61 · Published: 2021/12/3 8:47:01 · Tags: python pandas group-by pandas-groupby nan

Question

I have a DataFrame with many missing values in the columns I want to group by:

import pandas as pd
import numpy as np
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', np.nan, '6']})

In [4]: df.groupby('b').groups
Out[4]: {'4': [0], '6': [2]}

Notice that pandas has dropped the row whose value in the grouping column is NaN. (I want to include these rows!)

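Since I need many such operations (many columns have missing values) and use functions more complicated than a median (typically random forests), I want to avoid writing overly complicated code.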

Any suggestions? Should I write a function for this, or is there a simple solution?

Recommended answer

pandas >= 1.1

From pandas 1.1 you have better control over this behavior: NA values are now allowed in the grouper when you pass dropna=False:

pd.__version__
# '1.1.0.dev0+2004.g8d10bfb6f'

# Example from the docs
df

   a    b  c
0  1  2.0  3
1  1  NaN  4
2  2  1.0  3
3  1  2.0  2

# without NA (the default)
df.groupby('b').sum()

     a  c
b        
1.0  2  3
2.0  2  5

# with NA
df.groupby('b', dropna=False).sum()

     a  c
b        
1.0  2  3
2.0  2  5
NaN  1  4
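
For reference, here is a minimal self-contained sketch that applies the same dropna=False option to the DataFrame from the question. It assumes pandas 1.1 or later; the variable names (default_sizes, with_nan_sizes, joined) and the string-joining aggregation are illustrative stand-ins, not part of the original answer.

import numpy as np
import pandas as pd

# The frame from the question: column 'b' has one missing value.
df = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', np.nan, '6']})

# Default behaviour: the row whose key is NaN is excluded from every group.
default_sizes = df.groupby('b').size()

# pandas >= 1.1: dropna=False keeps NaN as a group key of its own.
with_nan_sizes = df.groupby('b', dropna=False).size()

print(default_sizes)
print(with_nan_sizes)

# Any per-group function now also receives the NaN group.
# Joining strings here is only a placeholder for a more complex function.
joined = df.groupby('b', dropna=False)['a'].agg(lambda s: ', '.join(s))
print(joined)

With dropna=False the grouped result has three groups instead of two, so downstream per-group code does not need a separate branch for the missing-key rows.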
