How to drop NaN elements in a groupby on a pandas DataFrame?
Question
Suppose I have this DataFrame:
import numpy as np
import pandas as pd
my_df = pd.DataFrame({'A':[np.nan,np.nan,'gate','ball'],'B':['car',np.nan,np.nan,np.nan],'C':[np.nan,'edge',np.nan,np.nan],'D':['id1','id1','id1','id2']})
In [176]: my_df
Out[176]:
      A    B     C    D
0   NaN  car   NaN  id1
1   NaN  NaN  edge  id1
2  gate  NaN   NaN  id1
3  ball  NaN   NaN  id2
I want to group by column "D" and ignore the NaN values. Expected output:
        A    B     C
D
id1  gate  car  edge
id2  ball  NaN   NaN
My solution is to fill the NaN values with an empty string and take the max:
In [177]: my_df.fillna("").groupby("D").max()
Out[177]:
        A    B     C
D
id1  gate  car  edge
id2  ball
Is there another solution that does not use fillna("")?
Recommended answer
Use a custom function with dropna, but return NaN for groups whose values are all missing:
print (my_df.groupby("D").agg(lambda x: np.nan if x.isnull().all() else x.dropna()))
        A    B     C
D
id1  gate  car  edge
id2  ball  NaN   NaN
A similar solution with a named custom function:
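def f(x):
    # Drop the missing values of one column within one group
    y = x.dropna()
    # If nothing is left, return NaN so the cell is not empty
    return np.nan if y.empty else y
print(my_df.groupby("D").agg(f))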
        A    B     C
D
id1  gate  car  edge
id2  ball  NaN   NaN
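A possible alternative, not part of the answer above: pandas' built-in GroupBy.first() is documented to return the first non-null entry of each column within a group, so it may produce the same table without a custom function. The sketch below rebuilds the example frame from the question and assumes, as in that example, that each group holds at most one non-missing value per column; where a group has several, only the first is kept. Depending on the pandas version, the cells for an all-missing group may print as NaN or None.

```python
import numpy as np
import pandas as pd

# Example frame from the question
my_df = pd.DataFrame({
    'A': [np.nan, np.nan, 'gate', 'ball'],
    'B': ['car', np.nan, np.nan, np.nan],
    'C': [np.nan, 'edge', np.nan, np.nan],
    'D': ['id1', 'id1', 'id1', 'id2'],
})

# first() takes the first non-null value per column in each group;
# columns that are entirely missing in a group stay missing
result = my_df.groupby("D").first()
print(result)
```

Unlike the dropna-based functions, this always reduces each group to a single value per column, so it does not depend on how agg handles a function that returns a Series.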