How to optimize case_when in a function?


Question

I would like to write a function that creates a binning variable based on some raw data. Specifically, I have a dataset with the age of each respondent, and I would like to write a function that classifies each person into an age group, where the age groups are a parameter of that function.

This is what I started with:

data <- data.frame(age = 18:100)

foo <- function(data, brackets = list(18:24, 25:34, 35:59)) {
  require(tidyverse)
  tmp <- data %>%
    drop_na(age) %>%
    mutate(age_bracket = case_when(age %in% brackets[[1]] ~ paste(brackets[[1]][1], "to", brackets[[1]][length(brackets[[1]])]),
                                   age %in% brackets[[2]] ~ paste(brackets[[2]][1], "to", brackets[[2]][length(brackets[[2]])]),
                                   age %in% brackets[[3]] ~ paste(brackets[[3]][1], "to", brackets[[3]][length(brackets[[3]])])))
  print(tmp)
}

Obviously, the case_when part is very inflexible, because I have to specify the number of brackets ahead of time. It is also quite lengthy. I would like to write some sort of loop that looks at the number of elements in the brackets argument and creates the brackets accordingly. So if I wanted to add a 60:Inf age group, the function should add another age group.

After searching online, I found that some people use defused expressions (e.g. quos). I am quite unfamiliar with those, so I struggle to use them for my purpose.

Recommended answer

I think you are looking for the cut function. The following does the job:

data <- data.frame(age = 18:100)

data$age_bracket <- cut(data$age, breaks = c(0, 18, 25, 35, 60, Inf))

unique(data$age_bracket)
# [1] (0,18]   (18,25]  (25,35]  (35,60]  (60,Inf]
# Levels: (0,18] (18,25] (25,35] (35,60] (60,Inf]
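
Note that with these breaks and the default right = TRUE, each interval is closed on the right, so age 18 lands in (0,18] and age 25 in (18,25]. That differs from the 18:24, 25:34, 35:59 groups in the question. A minimal sketch, assuming the groups should instead be closed on the left, drops the 0 break and passes right = FALSE:

```r
data <- data.frame(age = 18:100)

# right = FALSE makes every interval closed on the left,
# so [18,25) covers the integer ages 18 to 24.
data$age_bracket <- cut(data$age, breaks = c(18, 25, 35, 60, Inf), right = FALSE)

levels(data$age_bracket)
# [1] "[18,25)"  "[25,35)"  "[35,60)"  "[60,Inf)"
```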

You can also define your own labels if you don't like the default bracket labels. The advantage of using cut rather than a hand-coded solution is that cut returns a factor whose levels follow the order of the breaks, so the usual operations (e.g. ordering) work on its output.
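
To tie this back to the question, here is one possible sketch of a function that takes the age groups as a parameter. The helper name add_age_bracket, its lower argument, and the "18 to 24" / "60+" label style are illustrative assumptions, not part of the original answer. It also assumes integer ages and at least two lower bounds.

```r
# Hypothetical helper (not from the original answer): bins `age` by the
# lower bound of each group and builds "18 to 24"-style labels.
# Assumes integer ages and at least two lower bounds.
add_age_bracket <- function(data, lower = c(18, 25, 35, 60)) {
  breaks <- c(lower, Inf)                 # last group is open-ended
  upper  <- lower[-1] - 1                 # 24, 34, 59 for the defaults
  labels <- c(paste(lower[-length(lower)], "to", upper),
              paste0(lower[length(lower)], "+"))
  data$age_bracket <- cut(data$age, breaks = breaks, labels = labels,
                          right = FALSE, ordered_result = TRUE)
  data
}

data <- add_age_bracket(data.frame(age = 18:100))
levels(data$age_bracket)
# [1] "18 to 24" "25 to 34" "35 to 59" "60+"
```

Adding or removing an age group then only requires changing lower, for example lower = c(18, 25, 35, 60, 75). Ages below the first bound come out as NA.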
