Passing column name as parameter to a function using dplyr

Views: 64 Published: 2021/5/2 20:44:59 Tags: r dataframe dplyr quosure


Problem description

I have a data frame like the following:

transid<-c(1,2,3,4,5,6,7,8)
accountid<-c(a,a,b,a,b,b,a,b)
month<-c(1,1,1,2,2,3,3,3)
amount<-c(10,20,30,40,50,60,70,80)
transactions<-data.frame(transid,accountid,month,amount)

I am trying to write a function that computes the total monthly amount for each accountid using dplyr verbs.

my_sum<-function(df,col1,col2,col3){
df %>% group_by_(col1,col2) %>%summarise_(total_sum = sum(col3))
}

my_sum(transactions, "accountid","month","amount")

The goal is to get a result like this:

accountid   month  total_sum
a            1       30
a            2       40
a            3       70
b            1       30
b            2       50
b            3       140

I am getting this error: Error in sum(col3) : invalid 'type' (character) of argument. How can I pass a column name as a parameter, without quotes, inside summarise?
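
The error can be reproduced without dplyr. A plausible reading (an interpretation, not something stated in the question) is that sum() receives the string "amount" itself rather than the amount column. A minimal sketch in base R, where the variable name col3 simply mirrors the function argument above:

```r
# col3 holds the column *name* as a string, as passed to my_sum() above
col3 <- "amount"

# sum() cannot add up a character vector; capture the error message
# instead of letting it stop the script
msg <- tryCatch(sum(col3), error = function(e) conditionMessage(e))
print(msg)
```

The column name therefore has to be captured as an expression and evaluated inside the data frame, rather than passed along as a plain string.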

Recommended answer

Results

> transactions %>% my_sum(amount, accountid, month)
# A tibble: 6 x 3
  accountid month total_sum
     <fctr> <dbl>     <dbl>
1         a     1        30
2         a     2        40
3         a     3        70
4         b     1        30
5         b     2        50
6         b     3       140
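
The definition of my_sum() that produced this output is not reproduced on this page. The following is a minimal sketch, assuming the quos()/enquo() approach described in the notes and the argument order used in the call above; the names x and group_cols are illustrative, and the data frame is rebuilt inline with quoted account ids so the block runs on its own:

```r
library(dplyr)

transactions <- data.frame(
  transid   = 1:8,
  accountid = c("a", "a", "b", "a", "b", "b", "a", "b"),
  month     = c(1, 1, 1, 2, 2, 3, 3, 3),
  amount    = c(10, 20, 30, 40, 50, 60, 70, 80)
)

# Assumed signature: the column to sum comes first, grouping columns go in ...
my_sum <- function(df, x, ...) {
  group_cols <- quos(...)  # capture the unquoted grouping columns
  x <- enquo(x)            # capture the unquoted column to sum
  df %>%
    group_by(!!!group_cols) %>%           # splice the grouping columns in
    summarise(total_sum = sum(!!x)) %>%   # unquote the summed column
    ungroup()
}

result <- transactions %>% my_sum(amount, accountid, month)
print(result)
```

Recent dplyr versions may print an informational message about the grouping of the summarise() output; it does not affect the result.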

Data

In your original question you passed unquoted strings. I worked around that with the Hmisc::Cs function, but in principle you should surround your strings with quotes (""), unless, of course, you are referring to objects named a, b and so forth. That wasn't clear from the original question.

Data used:

transid <- c(1, 2, 3, 4, 5, 6, 7, 8)
accountid <- Hmisc::Cs(a, a, b, a, b, b, a, b)
month <- c(1, 1, 1, 2, 2, 3, 3, 3)
amount <- c(10, 20, 30, 40, 50, 60, 70, 80)
transactions <- data.frame(transid, accountid, month, amount)

Notes

- If you look at the Capturing multiple variables section of the Programming with dplyr article, you will see that a very similar problem is solved with the quos() function. In effect, your task is a perfect example of how quos() should be used.
  
  The ellipsis ... should then come at the end, as the assumption is that the function will be used to group data by multiple columns. Naturally, if desired, you could pass the columns one by one and enquo() every single column, but using ... is more natural and consistent with the recommended solution discussed in the article linked above. Please note that this approach changes the order of arguments in your function call, as ... should come at the end.
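  
  The one-by-one alternative mentioned above could look roughly like this. It is a sketch under assumptions: the function name my_sum_fixed and its argument names are hypothetical, and mtcars stands in for the real data so the block is self-contained.

```r
library(dplyr)

# Hypothetical variant with a fixed number of grouping columns:
# every column is captured individually with enquo()
my_sum_fixed <- function(df, group1, group2, x) {
  group1 <- enquo(group1)
  group2 <- enquo(group2)
  x      <- enquo(x)
  df %>%
    group_by(!!group1, !!group2) %>%
    summarise(total_sum = sum(!!x)) %>%
    ungroup()
}

by_am_cyl <- my_sum_fixed(mtcars, am, cyl, disp)
print(by_am_cyl)
```

  This keeps the original argument order (grouping columns first), at the cost of hard-coding exactly two grouping columns.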
  
  If you are using summarise(), you don't have to ungroup() your data as I did in my example. For instance, the code:
```
mtcars %>% group_by(am) %>% summarise(mean_disp = mean(disp)) %>% mutate(am = am + 1) 
```
  will work, whereas the code:
```
mtcars %>% group_by(am)  %>% mutate(am = am + 1)
```
  will return the expected error:
  
  Error in mutate_impl(.data, dots) : Column `am` can't be modified because it's a grouping variable
  
  You should use ungroup() if you are going to mutate() your original data or perform other operations that leave the grouping variable intact. Passing a grouped tibble along may later prove problematic, though I would say it is mostly a matter of taste and order in your dplyr workflow. If you and the other users of the function remember that the tibble may be carrying a grouping variable, there is no issue; personally, I tend to forget about that, so my preference is to ungroup() the data when I'm not interested in carrying the grouping variable.
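  
  One detail worth checking when deciding whether to ungroup(): by default summarise() only removes the last grouping level, so a result grouped by two columns is still grouped by the first one. A small sketch using mtcars (the object names are illustrative) and dplyr's group_vars() to inspect the remaining grouping:

```r
library(dplyr)

# One grouping column: summarise() leaves nothing grouped
one_level <- mtcars %>%
  group_by(am) %>%
  summarise(mean_disp = mean(disp))

# Two grouping columns: the result is still grouped by the first one
two_level <- mtcars %>%
  group_by(am, cyl) %>%
  summarise(mean_disp = mean(disp))

print(group_vars(one_level))           # no grouping columns left
print(group_vars(two_level))           # still grouped by am
print(group_vars(ungroup(two_level)))  # grouping removed explicitly
```

  With several grouping columns, as in the my_sum() call above, an explicit ungroup() is therefore the simplest way to hand back an ungrouped tibble.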
  