What's the difference between groupby.first() and groupby.head(1)?

Question

Both return a DataFrame containing the first row of each group. The API reference says that first() "computes first group of values", but when I compare the two outputs side by side I don't see a major difference.

Am I missing something?

import pandas as pd

df = pd.DataFrame({'id' : [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
                   'value'  : ["first","second","second","first",
                               "second","first","third","fourth",
                               "fifth","second","fifth","first",
                               "first","second","third","fourth","fifth"]})

Recommended answer

The main difference is that first() skips to the first non-null value in each column of a group, while head(1) returns the first row of each group as it is, NaN included.

If I drop an np.nan into your example:

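import numpy as np
import pandas as pd

# Same data as in the question, with the first value of group 1 replaced by NaN
df = pd.DataFrame({'id' : [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
                   'value'  : [np.nan,"second","second","first",
                               "second","first","third","fourth",
                               "fifth","second","fifth","first",
                               "first","second","third","fourth","fifth"]})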

Then we have:

>>> df.groupby('id').head(1)
    id   value
0    1     NaN      # NaN is included
3    2   first
5    3   first
9    4  second
11   5   first
12   6   first
15   7  fourth

>>> df.groupby('id').first()
     value
id        
1   second          # NaN is skipped
2    first
3    first
4   second
5    first
6    first
7   fourth
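Because first() looks for the first non-null value column by column, a row of its result can combine values from different source rows when there are several value columns. Here is a minimal sketch of that, assuming a hypothetical two-column frame (the names df2, a and b are illustrative and not part of the question):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: group 1 has a NaN in column 'a' on its first row only.
df2 = pd.DataFrame({
    "id": [1, 1, 2],
    "a": [np.nan, 10.0, 20.0],
    "b": ["x", "y", "z"],
})

# first(): per column, the first non-null value in each group.
# For id == 1 this takes a=10.0 from the second row and b="x" from the first row.
first_rows = df2.groupby("id").first()

# head(1): the first row of each group, unchanged, so a stays NaN for id == 1.
head_rows = df2.groupby("id").head(1)

print(first_rows)
print(head_rows)
```

So if you need an actual row from the data, head(1) is the safer choice. first() is closer to a per-column aggregation.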

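(Also, as you can see, the two results are indexed differently: head(1) keeps the original row index and the id column, while first() returns a result indexed by the group key, id.)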
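To compare the two index layouts directly, something like the following sketch may help. It uses a small hypothetical frame with no missing values, so only the indexing differs. The as_index=False argument to groupby is one way to keep the group key as a regular column.

```python
import pandas as pd

# Small illustrative frame without missing values.
df = pd.DataFrame({"id": [1, 1, 2, 2], "value": ["a", "b", "c", "d"]})

head_rows = df.groupby("id").head(1)
first_rows = df.groupby("id").first()

# head(1) keeps the original row labels; first() is indexed by the group key.
print(list(head_rows.index))   # [0, 2]
print(list(first_rows.index))  # [1, 2]

# With as_index=False, 'id' stays a column and the result gets a fresh 0..n-1 index.
flat = df.groupby("id", as_index=False).first()
print(flat)
```

With no nulls in the data, flat and head_rows.reset_index(drop=True) hold the same rows.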
