dplyr case_when具有动态案例数 [英] dplyr case_when with dynamic number of cases

查看:70
本文介绍了dplyr case_when具有动态案例数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

希望使用dplyr和 case_when 将一系列指标列折叠为一个列.挑战在于,我希望能够折叠未指定/动态的列数.

Wanting to use dplyr and case_when to collapse a series of indicator columns into a single column. The challenge is I want to be able to collapse over an unspecified/dynamic number of columns.

考虑以下数据集, gear 已分为一系列指标列.

Consider the following dataset, gear has been split into a series of indicator columns.

library(dplyr)
data(mtcars)
mtcars = mtcars %>%
  mutate(g2 = ifelse(gear == 2, 1, 0),
         g3 = ifelse(gear == 3, 1, 0),
         g4 = ifelse(gear == 4, 1, 0)) %>%
  select(g2, g3, g4)

我正在尝试编写一个与之相反的函数.

I am trying to write a function that does the reverse.

当我知道有多少种情况时,可以执行以下操作:

When I know how many cases this can be done as follows:

combine_indices = function(db, cols, vals){
  db %>% mutate(new_col = case_when(!!sym(cols[1]) == 1 ~ vals[1],
                                    !!sym(cols[2]) == 1 ~ vals[2],
                                    !!sym(cols[3]) == 1 ~ vals[3]))
}

cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)

但是,我希望 combine_indices 函数能够处理任意数量的索引列(现在它正好适用于三个索引列).

However, I would like the combine_indices function to handle any number of index columns (right now it works for exactly three).

根据文档(?case_when ),如果您的模式存储在列表中,则可以使用 !!! 进行拼接".但我无法正常工作:

According to the documentation (?case_when), "if your patterns are stored in a list, you can splice that in with !!!". But I can not get this working:

patterns = list(sym(cols[1] == 1 ~ vals[1],
                sym(cols[2] == 1 ~ vals[2],
                sym(cols[3] == 1 ~ vals[3])

mtcars %>% mutate(new_col = case_when(!!!patterns))

仅产生一个充满NA的新列.

Only produces a new column filled with NAs.

如果 !!! patterns 有效,那么直接获取列表 cols vals 并生成 patterns .但是,我无法正确地获得保证.希望更熟悉quosure的人知道如何做.

If !!!patterns worked, then it would be straightforward to take the lists cols and vals and generate patterns. However, I can not get the quosures correct. Hoping someone more familiar with quosures knows how.

注意-SO的一些类似问题是通过使用联接或其他函数解决的.但是,由于使用dbplyr时它如何转换为sql,因此我只能使用 case_when .

Note - some similar questions here of SO were solved using joins or other functions. However, I am restricted to using case_when because of how it translates to sql when using dbplyr.

推荐答案

我们可以创建条件字符串,使用 parse_exprs 进行拼接( !!! ).

We can create a string of conditions, use parse_exprs and splice it (!!!).

library(dplyr)
library(rlang)

combine_indices = function(db, cols, vals){
   db %>% mutate(new_col = case_when(!!!parse_exprs(paste(cols, '== 1 ~', vals))))
}


cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)

返回:

#   g2 g3 g4 new_col
#1   0  0  1       4
#2   0  0  1       4
#3   0  0  1       4
#4   0  1  0       3
#5   0  1  0       3
#6   0  1  0       3
#....

其中 paste 动态生成 case_when 的条件.

paste(cols, '== 1 ~', vals)
#[1] "g2 == 1 ~ 2" "g3 == 1 ~ 3" "g4 == 1 ~ 4"

这篇关于dplyr case_when具有动态案例数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆