dplyr case_when具有动态案例数 [英] dplyr case_when with dynamic number of cases
问题描述
希望使用dplyr和 case_when
将一系列指标列折叠为一个列.挑战在于,我希望能够折叠未指定/动态的列数.
Wanting to use dplyr and case_when
to collapse a series of indicator columns into a single column. The challenge is I want to be able to collapse over an unspecified/dynamic number of columns.
考虑以下数据集, gear
已分为一系列指标列.
Consider the following dataset, gear
has been split into a series of indicator columns.
library(dplyr)
data(mtcars)
mtcars = mtcars %>%
mutate(g2 = ifelse(gear == 2, 1, 0),
g3 = ifelse(gear == 3, 1, 0),
g4 = ifelse(gear == 4, 1, 0)) %>%
select(g2, g3, g4)
我正在尝试编写一个与之相反的函数.
I am trying to write a function that does the reverse.
当我知道有多少种情况时,可以执行以下操作:
When I know how many cases this can be done as follows:
combine_indices = function(db, cols, vals){
db %>% mutate(new_col = case_when(!!sym(cols[1]) == 1 ~ vals[1],
!!sym(cols[2]) == 1 ~ vals[2],
!!sym(cols[3]) == 1 ~ vals[3]))
}
cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)
但是,我希望 combine_indices
函数能够处理任意数量的索引列(现在它正好适用于三个索引列).
However, I would like the combine_indices
function to handle any number of index columns (right now it works for exactly three).
根据文档(?case_when
),如果您的模式存储在列表中,则可以使用 !!!
进行拼接".但我无法正常工作:
According to the documentation (?case_when
), "if your patterns are stored in a list, you can splice that in with !!!
". But I can not get this working:
patterns = list(sym(cols[1] == 1 ~ vals[1],
sym(cols[2] == 1 ~ vals[2],
sym(cols[3] == 1 ~ vals[3])
mtcars %>% mutate(new_col = case_when(!!!patterns))
仅产生一个充满NA的新列.
Only produces a new column filled with NAs.
如果 !!! patterns
有效,那么直接获取列表 cols
和 vals
并生成 patterns
.但是,我无法正确地获得保证.希望更熟悉quosure的人知道如何做.
If !!!patterns
worked, then it would be straightforward to take the lists cols
and vals
and generate patterns
. However, I can not get the quosures correct. Hoping someone more familiar with quosures knows how.
注意-SO的一些类似问题是通过使用联接或其他函数解决的.但是,由于使用dbplyr时它如何转换为sql,因此我只能使用 case_when
.
Note - some similar questions here of SO were solved using joins or other functions. However, I am restricted to using case_when
because of how it translates to sql when using dbplyr.
推荐答案
我们可以创建条件字符串,使用 parse_exprs
进行拼接( !!!
).
We can create a string of conditions, use parse_exprs
and splice it (!!!
).
library(dplyr)
library(rlang)
combine_indices = function(db, cols, vals){
db %>% mutate(new_col = case_when(!!!parse_exprs(paste(cols, '== 1 ~', vals))))
}
cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)
返回:
# g2 g3 g4 new_col
#1 0 0 1 4
#2 0 0 1 4
#3 0 0 1 4
#4 0 1 0 3
#5 0 1 0 3
#6 0 1 0 3
#....
其中 paste
动态生成 case_when
的条件.
paste(cols, '== 1 ~', vals)
#[1] "g2 == 1 ~ 2" "g3 == 1 ~ 3" "g4 == 1 ~ 4"
这篇关于dplyr case_when具有动态案例数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!