Error when combining dplyr inside a function

Views: 106 | Posted: 2017/7/13 21:43:36 | Tags: r, function, dplyr


Problem description

I'm trying to figure out what I'm doing wrong here. Using the following training data, I compute some frequencies with dplyr:

library(dplyr)

group.count <- c(101, 99, 4)
data <- data.frame(
    by = rep(3:1, group.count),
    y = rep(letters[1:3], group.count))

data %>%
    group_by(by) %>%
    summarise(non.miss = sum(!is.na(y)))

This gives me the outcome I'm looking for. However, when I try to do it as a function:
res0   <- function(x1,x2) {
output = data %>%  
    group_by(x2) %>%
    summarise(non.miss = sum(!is.na(x1)))
}

res0(y,by)
I get an error (index out of bounds). Can anybody tell me what I'm missing?

Thanks in advance.

Solution

I suggest changing the name of your data frame to df (for example, df <- data).

This is basically what you have done:
df %>%  
  group_by(by) %>%
  summarise(non.miss = sum(!is.na(y)))
which produces this:
#  by non.miss
#1  1        4
#2  2       99
#3  3      101
But to count the number of observations per group, you could use length, which gives the same answer here because y has no missing values:
df %>%  
  group_by(by) %>%
  summarise(non.miss = length(y))


#  by non.miss
#1  1        4
#2  2       99
#3  3      101
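
The two summaries agree only while y has no missing values. As a small sketch of the difference, the following uses a hypothetical copy of the example data (df_na, a name not in the original post) with two values set to NA; sum(!is.na(y)) then skips them while length(y) still counts every row.

```r
library(dplyr)

# Hypothetical variant of the example data with two missing values in y
group.count <- c(101, 99, 4)
df_na <- data.frame(
  by = rep(3:1, group.count),
  y  = rep(letters[1:3], group.count)
)
df_na$y[1:2] <- NA  # rows 1 and 2 both belong to group 3

res_na <- df_na %>%
  group_by(by) %>%
  summarise(
    non.miss = sum(!is.na(y)),  # counts non-missing values only
    n.rows   = length(y)        # counts every row, NA or not
  )
res_na
```

For group 3 the two columns now differ (99 versus 101), so which one to use depends on whether missing values should be counted.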
Or use tally, which gives this:
df %>%  
  group_by(by) %>%
  tally

#  by   n
#1  1   4
#2  2  99
#3  3 101
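
As a side note, dplyr also provides count, which combines the group_by and tally steps. The sketch below rebuilds the example data under the name df, as suggested above, and should give the same counts.

```r
library(dplyr)

# Rebuild the example data under the name df
group.count <- c(101, 99, 4)
df <- data.frame(
  by = rep(3:1, group.count),
  y  = rep(letters[1:3], group.count)
)

# count(by) groups by the column and tallies the rows in one call
res_count <- df %>% count(by)
res_count
```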
Now, if you really wanted to, you could put that into a function. The input would be the data frame, like this:
res0   <- function(df) {
df %>%  
    group_by(by) %>%
    tally 
}

res0(df)

#       by   n
#1       1   4
#2       2  99
#3       3 101
This of course assumes that your data frame will always have the grouping column named 'by'. I realize that these data are just fictional, but it might be a good idea to avoid naming a column 'by', because by is itself a function in R and the code may become a bit confusing to read.
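
The function in the question fails because group_by(x2) looks for a column literally named x2 in the data, rather than the column passed as the argument. If the column names do need to be arguments, one possible approach, assuming a recent dplyr (1.0 or later), is to wrap them in the embrace operator {{ }}. The helper name count_non_missing and its parameter names are hypothetical and chosen only for this sketch.

```r
library(dplyr)

# Rebuild the example data under the name df
group.count <- c(101, 99, 4)
df <- data.frame(
  by = rep(3:1, group.count),
  y  = rep(letters[1:3], group.count)
)

# count_non_missing is a hypothetical helper, not part of dplyr.
# {{ }} forwards the unquoted column names to group_by() and summarise().
count_non_missing <- function(data, value_col, group_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(non.miss = sum(!is.na({{ value_col }})))
}

res <- count_non_missing(df, y, by)
res
```

Called as count_non_missing(df, y, by), this should reproduce the grouped counts shown earlier without hard-coding the column names inside the function.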
