How to use groupby.first() with transform function


Problem description

I would like to use the groupby.first() function to find the first non-null value of a group and transform that value to each row in the group.

I have tried the following code:

import pandas as pd
import numpy as np
raw_data = {'col1': ['a','a','a','b','b','b','b','b','b','c','c','c','c','c'],
            'col2': [np.nan,np.nan,6,0,2,0,8,2,2,3,0,0,4,5]}
df=pd.DataFrame(raw_data)
df['col3'] = df.groupby('col1')['col2'].transform(lambda x: x.first())
df

I would like to get a df that looks like this:

  col1 col2 col3
    a NaN   6
    a NaN   6
    a 6     6
    b 0     0
    b 2     0
    b 0     0
    b 8     0
    b 2     0
    b 2     0
    c 3     3
    c 0     3
    c 0     3
    c 4     3
    c 5     3

I get the following error: TypeError: first() missing 1 required positional argument: 'offset'

Interestingly, if I run the same code and just swap out first() for sum() then it returns the sum of each group for every row of that group. The first() function will not work. Why not? Any help would be greatly appreciated!

Solution

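Inside your lambda, x is a plain Series, so x.first() calls Series.first. That method takes a required offset argument and only makes sense for a Series with a DatetimeIndex, which is why the TypeError complains about the missing 'offset'. Swapping in sum() works because Series.sum needs no arguments.

You want GroupBy.first, which returns the first non-null value of each group. You can reach it by passing the string alias 'first' to transform.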

df['col3'] = df.groupby('col1')['col2'].transform('first')
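
For reference, here is a minimal, self-contained sketch of that fix applied to the data from the question. The lambda variant and the name col3_lambda are only illustrative assumptions, added to show what the 'first' alias is doing under the hood; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "col1": list("aaabbbbbbccccc"),
    "col2": [np.nan, np.nan, 6, 0, 2, 0, 8, 2, 2, 3, 0, 0, 4, 5],
})

# The string alias dispatches to GroupBy.first, which skips NaN within each
# group; transform then broadcasts that value back to every row of the group.
df["col3"] = df.groupby("col1")["col2"].transform("first")

# Illustrative lambda equivalent: drop NaN, then take the first remaining value.
# Caveat: this raises IndexError for a group that contains only NaN.
col3_lambda = df.groupby("col1")["col2"].transform(lambda s: s.dropna().iloc[0])

print(df)
```

Because col2 contains NaN it is stored as float, so col3 comes back as float as well (6.0, 0.0, 3.0 for groups a, b, c). The string alias is usually the better choice: it avoids calling a Python function once per group and does not fail on a group with no valid values.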
