Apply a list of functions to a list of columns based on different combinations


Question

I have a data frame df with three categorical variables (cat1, cat2, cat3) and two continuous variables (con1, con2). I would like to apply a list of functions (sd, mean) to a list of columns (con1, con2) for different combinations of the grouping columns cat1, cat2, cat3. So far I have done this by explicitly subsetting each combination:

# Random generation of values for categorical data
set.seed(33)
df <- data.frame(cat1 = sample( LETTERS[1:2], 100, replace=TRUE ), 
                cat2 = sample( LETTERS[3:5], 100, replace=TRUE ),
                cat3 = sample( LETTERS[2:4], 100, replace=TRUE ),
                con1 = runif(100,0,100),
                con2 = runif(100,23,45))

# Introducing missing values (NA)
df$con1[c(23,53,92)] <- NA
df$con2[c(33,46)] <- NA

results <- data.frame()
funs <- list(sd=sd, mean=mean)

# calculation of mean and sd on total observations
sapply(funs, function(x) sapply(df[, c(4, 5)], x, na.rm = TRUE))

# calculation of mean and sd on different levels of cat1
sapply(funs, function(x) sapply(df[df$cat1 == 'A', c(4, 5)], x, na.rm = TRUE))
sapply(funs, function(x) sapply(df[df$cat1 == 'B', c(4, 5)], x, na.rm = TRUE))

# calculation of mean and sd on different levels of cat1 and cat2
sapply(funs, function(x) sapply(df[df$cat1 == 'A' & df$cat2 == 'C', c(4, 5)], x, na.rm = TRUE))
# ... (remaining cat1/cat2 combinations elided)
sapply(funs, function(x) sapply(df[df$cat1 == 'B' & df$cat2 == 'E', c(4, 5)], x, na.rm = TRUE))

# Similarly for the combinations of three cat variables cat1, cat2, cat3

I would like to write a function that does this dynamically: given a list of functions and a list of columns, compute the results for each combination of the grouping columns. Could you please give some suggestions? Thanks!

Edit: I have already received some good suggestions using dplyr. It would be great if someone could suggest a solution using the apply family of functions, since that would help me use them with data frames in further requirements.

Recommended answer

This is a simple one-line solution in base R:

> do.call(cbind, lapply(funs, function(x) aggregate(cbind(con1, con2) ~ cat1 + cat2 + cat3, data = df, FUN = x, na.rm = TRUE)))
   sd.cat1 sd.cat2 sd.cat3  sd.con1   sd.con2 mean.cat1 mean.cat2 mean.cat3 mean.con1 mean.con2
1        A       C       B       NA        NA         A         C         B  25.52641  37.40603
2        B       C       B 32.67192  6.966547         B         C         B  46.70387  34.85437
3        A       D       B 31.05224  6.530313         A         D         B  37.91553  37.13142
4        B       D       B 23.80335  6.001468         B         D         B  59.75107  30.29681
5        A       E       B 22.79285  1.526472         A         E         B  38.54742  25.23007
6        B       E       B 32.92139  2.621067         B         E         B  51.56253  29.52367
7        A       C       C 26.98661  5.710335         A         C         C  36.32045  36.42465
8        B       C       C 20.22217  8.117184         B         C         C  60.60036  34.98460
9        A       D       C 33.39273  7.367412         A         D         C  40.77786  35.03747
10       B       D       C 12.95351  8.829061         B         D         C  49.77160  33.21836
11       A       E       C 33.73433  4.689548         A         E         C  55.53135  32.38279
12       B       E       C 25.38637  9.172137         B         E         C  46.69063  31.56733
13       A       C       D 36.12545  6.323929         A         C         D  48.34187  32.36789
14       B       C       D 30.01992  7.130869         B         C         D  53.87571  33.12760
15       A       D       D 15.94151 11.756115         A         D         D  35.89909  31.76871
16       B       D       D 10.89030  6.829829         B         D         D  22.86577  32.53725
17       A       E       D 24.88410  6.108631         A         E         D  47.32549  35.22782
18       B       E       D 12.73711  8.151424         B         E         D  33.95569  36.70167
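
The one-liner above covers only the full cat1 + cat2 + cat3 grouping and repeats the grouping columns once per function. As a rough sketch of how the same idea could be wrapped in a function that walks through every non-empty combination of grouping columns with lapply and combn, consider the following. The helper name summarise_combos and the toy data frame are made up for illustration and are not part of the original answer. The sketch also uses the data.frame method of aggregate (with by =) rather than the formula method; the formula method drops, by default, any row that has an NA in either con1 or con2, whereas the version below leaves NA handling to na.rm for each column separately, so its numbers may differ slightly from the output shown above.

```r
# Hypothetical helper (not from the original answer): apply every function in
# `funs` to every column in `value_cols`, once for each non-empty combination
# of the columns named in `group_cols`.
summarise_combos <- function(data, group_cols, value_cols, funs) {
  # All non-empty subsets of the grouping columns: sizes 1, 2, ..., n
  combos <- unlist(
    lapply(seq_along(group_cols),
           function(k) combn(group_cols, k, simplify = FALSE)),
    recursive = FALSE
  )
  names(combos) <- vapply(combos, paste, character(1), collapse = "+")

  lapply(combos, function(grp) {
    # One aggregate() call per function; rows with NA values are kept and
    # na.rm = TRUE is passed on to each summary function
    per_fun <- lapply(names(funs), function(fn) {
      out <- aggregate(data[value_cols], by = data[grp],
                       FUN = funs[[fn]], na.rm = TRUE)
      names(out)[-seq_along(grp)] <- paste(fn, value_cols, sep = ".")
      out
    })
    # Join the per-function tables on the grouping columns
    Reduce(function(a, b) merge(a, b, by = grp), per_fun)
  })
}

# Toy data, invented for this example
toy <- data.frame(
  cat1 = c("A", "A", "B", "B"),
  cat2 = c("C", "D", "C", "C"),
  con1 = c(1, 3, 5, NA),
  con2 = c(10, 20, 30, 50)
)

res <- summarise_combos(toy, c("cat1", "cat2"), c("con1", "con2"),
                        list(sd = sd, mean = mean))
names(res)          # one result table per combination of grouping columns
res[["cat1+cat2"]]  # grouping columns appear once, followed by sd.* and mean.*
```

With the question's data this would be called as summarise_combos(df, c("cat1", "cat2", "cat3"), c("con1", "con2"), funs), returning a named list with one data frame per grouping combination. The overall (ungrouped) summary is not included; the nested sapply call from the question already handles that case.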
