Add more arguments to summarise in dplyr


Question

Besides defining a function with the extra arguments outside of summarise_each, is there another way to add an argument directly inside the summarise_each call?

For example, I want to get the mean without NAs. This way works:

mean_fun <- function(x) mean(x, na.rm = TRUE)
AA_group <- AA_new %>% group_by(tractID)
AA_group %>% summarise_each(funs(mean_fun))

I am wondering whether there is a way to add na.rm = TRUE directly to summarise_each, for example through something like a more_args option.

Also, if I put the function definition directly inside summarise_each, namely

AA_group %>% summarise_each(funs(function(x)mean(x,na.rm=TRUE)))

the error is:

expecting a single value

Does that mean that every time we want to use summarise_each, we have to define a function outside of it?

Recommended answer

I'm guessing you're looking for the dot (.), which stands in for the column being summarised, as documented at ?funs.

Here's a small example, using the "iris" dataset, but adding some NA values into it.

iris2 <- iris
set.seed(1)
iris2[-5] <- lapply(iris2[-5], function(x) {
  x[sample(length(x), sample(10, 1))] <- NA
  x
})

Compare the following:

iris2 %>% 
  group_by(Species) %>%  
  summarise_each(funs(mean))
# Source: local data frame [3 x 5]
# 
#      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1     setosa        5.006       3.428           NA          NA
# 2 versicolor           NA          NA           NA          NA
# 3  virginica           NA          NA           NA          NA


iris2 %>% 
  group_by(Species) %>%  
  summarise_each(funs(mean_fun))
# Source: local data frame [3 x 5]
# 
#      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1     setosa     5.006000    3.428000     1.455319   0.2468085
# 2 versicolor     5.939583    2.767347     4.256250   1.3208333
# 3  virginica     6.597959    2.979167     5.547917   2.0191489

iris2 %>% 
  group_by(Species) %>%
  summarise_each(funs(mean(., na.rm = TRUE)))
# Source: local data frame [3 x 5]
# 
#      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1     setosa     5.006000    3.428000     1.455319   0.2468085
# 2 versicolor     5.939583    2.767347     4.256250   1.3208333
# 3  virginica     6.597959    2.979167     5.547917   2.0191489
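In more recent dplyr releases, summarise_each() and funs() are deprecated in favour of across(). As a minimal sketch, assuming dplyr 1.0.0 or later is installed, the same idea can be written with a formula-style lambda, where .x plays the role that . plays inside funs(). The name res is only illustrative, and the exact means depend on which values the random sampling sets to NA, so no output is shown.

```r
library(dplyr)

# Rebuild the example data: iris with a few NA values in each numeric column.
iris2 <- iris
set.seed(1)
iris2[-5] <- lapply(iris2[-5], function(x) {
  x[sample(length(x), sample(10, 1))] <- NA
  x
})

# across() applies the lambda to every non-grouping column.
# .x is the current column, so na.rm = TRUE is passed inline,
# with no helper function defined beforehand.
res <- iris2 %>%
  group_by(Species) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

res
```

The result has one row per species and one column per measurement, just like the summarise_each() versions above.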
