Pandas: aggregating multiple columns with multiple functions
Problem description
pandas in Python and dplyr in R are both flexible data-wrangling tools. For example, in R with dplyr, one can do the following:
custom_func <- function(col1, col2) length(col1) + length(col2)
ChickWeight %>%
  group_by(Diet) %>%
  summarise(m_weight = mean(weight),
            var_time = var(Time),
            covar = cov(weight, Time),
            odd_stat = custom_func(weight, Time))
Notice how, in one statement:
- I can aggregate over multiple columns in one line.
- I can apply different functions over these multiple columns in one line.
- I can use functions that take into account two columns.
- I can throw in custom functions for any of these.
- I can declare new column names for these aggregations.
Is such a pattern also possible in pandas? Note that I am interested in doing this in a short statement (so not creating three different dataframes and then joining them).
Edit: I've noticed the question got downvoted. If somebody could mention why the post was downvoted, I might have the opportunity to improve the question.
With pandas groupby.apply() you can run multiple functions in a single groupby aggregation. The statistics used here (mean(), var() and cov()) are built into pandas, so scipy is only needed if you bring in other statistical functions from it. A custom function must reduce each group's data to a single value, the way an aggregate like sum() does:
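import pandas as pd

def customfct(x, y):
    # Custom two-column statistic: mean of the element-wise ratio.
    data = x / y
    return data.mean()

def f(group):
    # Return one Series per group; its index becomes the new column names.
    # (Assigning the values onto the group and returning it would repeat
    # them on every row instead of collapsing the group to one row.)
    return pd.Series({
        'm_weight': group['weight'].mean(),
        'var_time': group['Time'].var(),
        'cov': group['weight'].cov(group['Time']),
        'odd_stat': customfct(group['weight'], group['Time']),
    })

# df is assumed to hold the ChickWeight data (columns: weight, Time, Diet).
aggdf = df.groupby('Diet').apply(f)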
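To see the groupby.apply() pattern produce one row per Diet, here is a minimal self-contained sketch. The small DataFrame is made-up stand-in data (not the real ChickWeight values), and the names custom_func, summarise and summary are chosen for illustration only; custom_func mirrors the R function from the question.

```python
import pandas as pd

# Toy stand-in for ChickWeight; the values are made up for illustration.
df = pd.DataFrame({
    "Diet":   [1, 1, 1, 2, 2, 2],
    "Time":   [1, 2, 3, 1, 2, 3],
    "weight": [10.0, 20.0, 30.0, 12.0, 18.0, 33.0],
})

def custom_func(col1, col2):
    # Mirrors the R custom_func in the question: sum of the two lengths.
    return len(col1) + len(col2)

def summarise(group):
    # One Series per group; the keys become the output column names.
    return pd.Series({
        "m_weight": group["weight"].mean(),
        "var_time": group["Time"].var(),
        "covar": group["weight"].cov(group["Time"]),
        "odd_stat": custom_func(group["weight"], group["Time"]),
    })

# Select the value columns first so the grouping key is not passed to summarise.
summary = df.groupby("Diet")[["weight", "Time"]].apply(summarise)
print(summary)
```

Returning a Series from the applied function is what lets a single statement mix single-column statistics, two-column statistics and custom functions while naming the result columns, at the cost of running Python code once per group.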