What's the difference between groupby.first() and groupby.head(1)?
Question
Both return a DataFrame containing the first row of each group. The API reference says that first() "computes first group of values", but when I compare the two outputs side by side I don't see a major difference.
Am I missing something?
import pandas as pd

df = pd.DataFrame({'id': [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
                   'value': ["first","second","second","first",
                             "second","first","third","fourth",
                             "fifth","second","fifth","first",
                             "first","second","third","fourth","fifth"]})
Recommended answer
The main difference is that first() skips ahead to the first non-null value in each group (column by column), while head(1) simply returns the first row of each group, nulls included.
If I drop an np.nan into your example:
import numpy as np

# Same data as in the question, but the first value of group 1 is now NaN.
df = pd.DataFrame({'id': [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
                   'value': [np.nan,"second","second","first",
                             "second","first","third","fourth",
                             "fifth","second","fifth","first",
                             "first","second","third","fourth","fifth"]})
Then we have:
>>> df.groupby('id').head(1)
    id   value
0    1     NaN    # NaN is included
3    2   first
5    3   first
9    4  second
11   5   first
12   6   first
15   7  fourth
>>> df.groupby('id').first()
     value
id
1   second    # NaN is skipped
2    first
3    first
4   second
5    first
6    first
7   fourth
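(Also, as the output shows, the two results are indexed differently: head(1) keeps the original row labels and the id column, while first() uses the group keys as the index.)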
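One consequence that the single-column example does not show: because first() picks the first non-null value separately for each column, a result row can combine values that came from different source rows. The sketch below illustrates this with a made-up two-column frame (the column names a and b and their values are assumptions for illustration, not part of the original question).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: in group 1, column "a" is null in the first row
# and column "b" is null in the second row.
df = pd.DataFrame({
    "id": [1, 1, 2],
    "a": [np.nan, 10.0, 30.0],
    "b": ["x", np.nan, "z"],
})

# head(1) returns the first row of each group unchanged,
# keeping the original row labels and the "id" column.
head_rows = df.groupby("id").head(1)

# first() returns the first non-null value of each column per group,
# indexed by the group key.
first_vals = df.groupby("id").first()

print(head_rows)
print(first_vals)
```

For group 1, head(1) returns the original row with a missing value in a, whereas first() reports a = 10.0 (taken from the second row) together with b = "x" (taken from the first row). If you need actual rows from the data, head(1) is the safer choice.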