Apply different functions to different columns with a single pandas groupby command
Problem description
My data is stored in df. I have multiple users per group. I want to group df by group and apply different functions to different columns. The twist is that I would like to assign custom names to the new columns during this process.
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({"user": range(4), "group": [1, 1, 2, 2],
                   "crop": ["2018-01-01", "2018-01-01", "2018-03-01", "2018-03-01"],
                   "score": np.random.randint(400, 1000, 4)})
df["crop"] = pd.to_datetime(df["crop"])
print(df)
   user  group       crop  score
0     0      1 2018-01-01    910
1     1      1 2018-01-01    765
2     2      2 2018-03-01    782
3     3      2 2018-03-01    722
I want to get the mean of score, and the min and max values of crop, grouped by group, and assign a custom name to each new column. The desired output should look like this:
   group  mean_score   min_crop   max_crop
0      1       837.5 2018-01-01 2018-01-01
1      2       752.0 2018-03-01 2018-03-01
I don't know how to do this in a one-liner in Python. In R, I would use data.table and write the following:
df[, list(mean_score = mean(score),
          max_crop = max(crop),
          min_crop = min(crop)), by = group]
I know I could group the data and use .agg combined with a dictionary. Is there an alternative way to give each column a custom name in this process?
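For comparison, the dictionary-based .agg approach mentioned above could be sketched roughly as follows. This is only an illustration: the score values are hard-coded from the printed frame instead of being drawn at random, and the name `out` and the `{func}_{col}` naming pattern are assumptions, not part of the question.

```python
import pandas as pd

# Same data as in the question, with the scores hard-coded for reproducibility
df = pd.DataFrame({
    "user": range(4),
    "group": [1, 1, 2, 2],
    "crop": pd.to_datetime(["2018-01-01", "2018-01-01", "2018-03-01", "2018-03-01"]),
    "score": [910, 765, 782, 722],
})

# A dict of lists yields MultiIndex columns such as ('score', 'mean')
out = df.groupby("group").agg({"score": ["mean"], "crop": ["min", "max"]})

# Flatten the two column levels into custom names (hypothetical naming pattern)
out.columns = [f"{func}_{col}" for col, func in out.columns]
out = out.reset_index()
print(out)
```

The renaming has to happen as a separate step after the aggregation, which is the inconvenience the question is trying to avoid.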
Recommended answer
Try creating a function with the required operations and passing it to groupby().apply():
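def f(x):
    # Build one row of named aggregates for each group
    d = {}
    d['mean_score'] = x['score'].mean()
    d['min_crop'] = x['crop'].min()
    d['max_crop'] = x['crop'].max()
    return pd.Series(d, index=['mean_score', 'min_crop', 'max_crop'])

# One row per group; 'group' becomes the index of the result
data = df.groupby('group').apply(f)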
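On pandas 0.25 or later, named aggregation is another way to set the output column names directly inside .agg. A minimal sketch, assuming the same columns as in the question; the score values are hard-coded from the printed frame and `result` is just an illustrative name:

```python
import pandas as pd

# Same data as in the question, with the scores hard-coded for reproducibility
df = pd.DataFrame({
    "user": range(4),
    "group": [1, 1, 2, 2],
    "crop": pd.to_datetime(["2018-01-01", "2018-01-01", "2018-03-01", "2018-03-01"]),
    "score": [910, 765, 782, 722],
})

# Named aggregation: each keyword is an output column name,
# each value is an (input column, aggregation function) pair.
result = (
    df.groupby("group")
      .agg(
          mean_score=("score", "mean"),
          min_crop=("crop", "min"),
          max_crop=("crop", "max"),
      )
      .reset_index()  # turn 'group' back into a regular column
)
print(result)
```

This stays close to the data.table one-liner in the question, since the new names and the functions are declared together in a single call.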