R: paste string as code or function argument within dplyr


Problem description


How do I paste a string into a dplyr function, e.g. summarise(), and run it as code?

library('tidyverse')
df <- tibble(ID = c('a','a','b','c','c','e','e','f','g','g'),
              var1 = floor(runif(10, min=0, max=100)),
              var2 = floor(runif(10, min=0, max=100)),
              var3 = floor(runif(10, min=0, max=100)),
              var4 = floor(runif(10, min=0, max=100))
              )

sample data

> df
# A tibble: 10 x 5
   ID     var1  var2  var3  var4
   <chr> <dbl> <dbl> <dbl> <dbl>
 1 a        82     4    21    32
 2 a        90    34    12    51
 3 b        67    77    69    32
 4 c        56     3    96    76
 5 c        38     2    46    79
 6 e        34    91    12    12
 7 e        49    16    38    31
 8 f        34     1    76    82
 9 g        95    84    54    70
10 g        13    53    65    79

Replace this

df %>% 
  group_by(ID) %>% 
  summarise(var1 = sum(var1),
            var2 = sum(var2),
            var3 = sum(var3))

With this

# Define a character string to replace the hand-written summarise() arguments
sum_var <- select(df, starts_with('var')) %>% names()
sum_var_str <- paste0(sum_var, " = sum(", sum_var, ")")
sum_var_str <- str_c(sum_var_str, collapse = ", ")
> sum_var
[1] "var1" "var2" "var3" "var4"
> sum_var_str
[1] "var1 = sum(var1), var2 = sum(var2), var3 = sum(var3), var4 = sum(var4)"

#run code with character string
df %>% 
  group_by(ID) %>% 
  summarise(sum_var_str) #this line doesn't work

I have tried

  • summarise(!!parse_quosure(sum_var_str))
  • summarise(parse(text = sum_var_str))

What am I missing?

Thanks,

#--------------- In case you are wondering why I am doing this ---------

I want to use multidplyr, which does not yet support summarise_at. I have hundreds, if not thousands, of columns, so summarise_at is necessary, but unfortunately it is not available in multidplyr.

I am looking for an alternative to work around it.

library('multidplyr')
cluster <- new_cluster(5)

#works
df %>% 
  group_by(ID) %>% 
  #partition(cluster) %>% 
  summarise_at(.vars = vars(starts_with('var')),sum) 
  #collect()

#works
df %>% 
  group_by(ID) %>% 
  partition(cluster) %>% 
  summarise(var1 = sum(var1),
            var2 = sum(var2),
            var3 = sum(var3)) %>% 
  collect()

#doesn't work
df %>% 
  group_by(ID) %>% 
  partition(cluster) %>%
  summarise_at(.vars = vars(starts_with('var')),sum)  %>% 
  collect()

Error in UseMethod("group_vars") : 
  no applicable method for 'group_vars' applied to an object of class "multidplyr_party_df"

#I want to see if this works
df %>% 
  group_by(ID) %>% 
  partition(cluster) %>%
  summarise(parse(text = sum_var_str)) %>% #incorrect line of code
  collect()

Solution

I encountered the same issue and noticed my dplyr version was updated to dplyr 1.0.0.

You can work around this issue by going back to dplyr 0.8.5; your summarise_at should then work fine with a party_df object.
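Before downgrading, it may be worth checking which dplyr version is actually installed. The snippet below is only a sketch: the variable names dplyr_version and needs_workaround are made up for illustration, and the commented-out install line assumes the remotes package is available.

```r
# Which dplyr version is currently installed?
dplyr_version <- utils::packageVersion("dplyr")
print(dplyr_version)

# TRUE means this is dplyr 1.0.0 or later, where the error above was seen
needs_workaround <- dplyr_version >= "1.0.0"
print(needs_workaround)

# To go back to 0.8.5 (assumption: the remotes package is installed).
# Left commented out so that nothing is installed by accident:
# remotes::install_version("dplyr", version = "0.8.5")
```

Restart the R session after reinstalling so the older version is the one that gets loaded.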

dplyr 0.8.5's group_vars methods work with objects of class multidplyr_party_df.

All the _at, _all, and _if verbs have been superseded by the across() function in dplyr 1.0.0.
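For anyone staying on dplyr 1.0.0 or later, the across() form of the summarise_at call from the question could look roughly like this. It is a sketch on a local tibble only: set.seed is added here purely for reproducibility, the names res_across and res_manual are made up, and I have not verified how across() behaves on a multidplyr party_df.

```r
library(dplyr)
library(tibble)

set.seed(1)  # assumption: added only so the example is reproducible
df <- tibble(ID = c('a','a','b','c','c','e','e','f','g','g'),
             var1 = floor(runif(10, min = 0, max = 100)),
             var2 = floor(runif(10, min = 0, max = 100)),
             var3 = floor(runif(10, min = 0, max = 100)),
             var4 = floor(runif(10, min = 0, max = 100)))

# dplyr >= 1.0.0: across() takes over the role of summarise_at()
res_across <- df %>%
  group_by(ID) %>%
  summarise(across(starts_with('var'), sum))

# Hand-written version from the question, extended to var4 for comparison
res_manual <- df %>%
  group_by(ID) %>%
  summarise(var1 = sum(var1),
            var2 = sum(var2),
            var3 = sum(var3),
            var4 = sum(var4))

print(res_across)
```

Both results should hold one row per ID and the same per-group sums, so no column names need to be typed out by hand.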

You can read more about how to use the new methods here: dplyr 1.0.0 Changelog
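On the question as literally asked (running a string as code inside summarise()), one approach that may work is to parse each piece with rlang::parse_expr() and splice the resulting named list with !!!. A single comma-separated string such as sum_var_str is not one valid R expression, which is probably why parse(text = ...) did not help, so this sketch keeps one string per column. The names sum_strs, sum_exprs and res_spliced are made up for illustration, set.seed is my addition, and this has only been reasoned through for a local tibble, not for a party_df.

```r
library(dplyr)
library(tibble)
library(rlang)

set.seed(1)  # assumption: added only so the example is reproducible
df <- tibble(ID = c('a','a','b','c','c','e','e','f','g','g'),
             var1 = floor(runif(10, min = 0, max = 100)),
             var2 = floor(runif(10, min = 0, max = 100)),
             var3 = floor(runif(10, min = 0, max = 100)),
             var4 = floor(runif(10, min = 0, max = 100)))

sum_var <- df %>% select(starts_with('var')) %>% names()

# One string per column: "sum(var1)", "sum(var2)", ...
sum_strs <- paste0("sum(", sum_var, ")")

# Parse each string into an unevaluated call and name it after its column
sum_exprs <- set_names(lapply(sum_strs, parse_expr), sum_var)

# !!! splices the named list as if the arguments had been typed by hand
res_spliced <- df %>%
  group_by(ID) %>%
  summarise(!!!sum_exprs)

print(res_spliced)
```

The list names become the output column names, which is what the "var1 = " part of the original string was trying to achieve.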
