Convert continuous numeric values to discrete categories defined by intervals


Problem description

I have a data frame with a continuous numeric variable, age in months (age_mnth). I want to create a new discrete variable with age categories based on age intervals.

# Some example data
rota2 <- data.frame(age_mnth = 1:170)

I've written an ifelse-based procedure (below), but I suspect there is a more elegant solution.

rota2$age_gr<-ifelse(rota2$age_mnth < 6, rr2 <- "0-5 mnths",
   ifelse(rota2$age_mnth > 5 & rota2$age_mnth < 12, rr2 <- "6-11 mnths",
          ifelse(rota2$age_mnth > 11 & rota2$age_mnth < 24, rr2 <- "12-23 mnths",
                 ifelse(rota2$age_mnth > 23 & rota2$age_mnth < 60, rr2 <- "24-59 mnths",
                        ifelse(rota2$age_mnth > 59 & rota2$age_mnth < 167, rr2 <- "5-14 yrs",
                              rr2 <- "adult")))))

I know there is a cut function, but I couldn't get it to work for discretizing / categorizing this variable.

Recommended answer

If there is a reason you don't want to use cut, I don't understand what it is. cut will work fine for what you want to do:

# Some example data
rota2 <- data.frame(age_mnth = 1:170)
# Your way of doing things to compare against
rota2$age_gr<-ifelse(rota2$age_mnth<6,rr2<-"0-5 mnths",
                     ifelse(rota2$age_mnth>5&rota2$age_mnth<12,rr2<-"6-11 mnths",
                            ifelse(rota2$age_mnth>11&rota2$age_mnth<24,rr2<-"12-23 mnths",
                                   ifelse(rota2$age_mnth>23&rota2$age_mnth<60,rr2<-"24-59 mnths",
                                          ifelse(rota2$age_mnth>59&rota2$age_mnth<167,rr2<-"5-14 yrs",
                                                 rr2<-"adult")))))

# Using cut
rota2$age_grcut <- cut(rota2$age_mnth, 
                       breaks = c(-Inf, 6, 12, 24, 60, 167, Inf), 
                       labels = c("0-5 mnths", "6-11 mnths", "12-23 mnths", "24-59 mnths", "5-14 yrs", "adult"), 
                       right = FALSE)
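
The choice of right = FALSE is what makes the breaks line up with the ifelse thresholds: each interval is closed on the left and open on the right, so 6 falls in "6-11 mnths" rather than "0-5 mnths". The following is a minimal sketch for checking this at the boundaries; the names age_breaks, age_labels, boundary_ages and boundary_check are introduced here for illustration only and are not part of the original answer.

```r
# Same example data as in the answer
rota2 <- data.frame(age_mnth = 1:170)

# Illustrative names (assumptions, not from the original answer)
age_breaks <- c(-Inf, 6, 12, 24, 60, 167, Inf)
age_labels <- c("0-5 mnths", "6-11 mnths", "12-23 mnths",
                "24-59 mnths", "5-14 yrs", "adult")

# right = FALSE gives left-closed intervals: [-Inf, 6), [6, 12), ..., [167, Inf)
rota2$age_grcut <- cut(rota2$age_mnth,
                       breaks = age_breaks,
                       labels = age_labels,
                       right = FALSE)

# Check the ages on either side of each break
boundary_ages <- c(5, 6, 11, 12, 23, 24, 59, 60, 166, 167)
boundary_check <- data.frame(
  age_mnth = boundary_ages,
  age_gr   = cut(boundary_ages, breaks = age_breaks,
                 labels = age_labels, right = FALSE)
)
print(boundary_check)

# Number of rows per category, in the order given by age_labels
print(table(rota2$age_grcut))
```

Note that cut returns a factor whose levels follow the order of the labels, whereas the nested ifelse returns a character vector. Wrap the result in as.character() if plain strings are needed.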
