How to drop NaN elements in a groupby on a pandas dataframe?

Problem description

Suppose I have this dataframe:

import numpy as np
import pandas as pd

my_df = pd.DataFrame({'A': [np.nan, np.nan, 'gate', 'ball'], 'B': ['car', np.nan, np.nan, np.nan], 'C': [np.nan, 'edge', np.nan, np.nan], 'D': ['id1', 'id1', 'id1', 'id2']})

In [176]: my_df
Out[176]:
      A    B     C    D
0   NaN  car   NaN  id1
1   NaN  NaN  edge  id1
2  gate  NaN   NaN  id1
3  ball  NaN   NaN  id2

I want to group by column "D" and ignore the NaN values. Expected output:

        A    B     C
D
id1  gate  car  edge
id2  ball  NaN   NaN

My solution would be to fill the NaN values with an empty string and take the max:

In [177]: my_df.fillna("").groupby("D").max()
Out[177]:
        A    B     C
D
id1  gate  car  edge
id2  ball

Is there another solution that does not use fillna("")?

Recommended answer

Use a custom function with dropna, returning NaN for groups in which every value is missing:

print(my_df.groupby("D").agg(lambda x: np.nan if x.isnull().all() else x.dropna()))
        A    B     C
D                   
id1  gate  car  edge
id2  ball  NaN   NaN
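
In the sample data every group has at most one non-null value per column, so dropna() leaves a single element. If a group could hold several non-null values, returning an explicit scalar may be safer. The following is a minimal sketch under that assumption; the helper name first_valid is hypothetical and not part of the original answer:

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame({
    'A': [np.nan, np.nan, 'gate', 'ball'],
    'B': ['car', np.nan, np.nan, np.nan],
    'C': [np.nan, 'edge', np.nan, np.nan],
    'D': ['id1', 'id1', 'id1', 'id2'],
})

def first_valid(col):
    """Hypothetical helper: first non-null value of a group's column, or NaN if none."""
    valid = col.dropna()
    # Always return a scalar so the aggregation yields one value per group
    return valid.iloc[0] if not valid.empty else np.nan

result = my_df.groupby("D").agg(first_valid)
print(result)
```

Taking the first non-null value is only one possible choice; picking the last value or joining all of them would follow the same pattern.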

A similar solution with a named function:

def f(x):
    y = x.dropna()
    return np.nan if y.empty else y

print(my_df.groupby("D").agg(f))
        A    B     C
D                   
id1  gate  car  edge
id2  ball  NaN   NaN
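
As an alternative that is not part of the original answer, the built-in GroupBy.first() skips missing values by default and returns the first non-null entry of each column per group, so no custom function is needed. A minimal sketch on the same data, assuming "first non-null value" is the desired result for each group:

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame({
    'A': [np.nan, np.nan, 'gate', 'ball'],
    'B': ['car', np.nan, np.nan, np.nan],
    'C': [np.nan, 'edge', np.nan, np.nan],
    'D': ['id1', 'id1', 'id1', 'id2'],
})

# first() ignores NaN within each group; a column with no valid
# value in a group stays missing in the result
first_per_group = my_df.groupby("D").first()
print(first_per_group)
```

Columns with no valid value in a group remain missing in the result; checking them with isna() avoids depending on how a given pandas version represents the missing entry.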
