Map numerics to categorical values in R, based on different ranges for the numerics


Problem description

Hope my title makes sense. I have a data frame with a column of numeric values, and I would like to use this column to create a new column in which each numeric value is mapped to a bucket based on the range it falls in. Below is some test data, along with the rough-around-the-edges nested ifelse() approach that I am currently using. I am hoping to code this in a better way that doesn't involve nested ifelse() statements, since that approach doesn't scale well to many buckets:

mydf = data.frame(strings = letters[1:10], 
              numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
              stringsAsFactors = FALSE)

That is my test data frame, and here is my nested ifelse() approach to solving the problem:

mydf$buckets = ifelse(mydf$numerics <= 2, 0, 
                   ifelse(mydf$numerics <= 4, 1, 
                       ifelse(mydf$numerics <= 5, 2, 
                            ifelse(mydf$numerics <= 7, 3, 4))))

What the above code does is map the values in the numerics column as follows:

  • values <= 2 go to 0
  • values > 2 and <= 4 go to 1
  • values > 4 and <= 5 go to 2
  • values > 5 and <= 7 go to 3
  • values > 7 go to 4

This approach doesn't scale well beyond a small number of buckets. Any help with this is appreciated! Thanks,

Recommended answer

I really like using case_when() from dplyr in this sort of situation, as already mentioned by @tictocchoc in the comments:

suppressPackageStartupMessages(library(tidyverse))

mydf = data.frame(strings = letters[1:10], 
                  numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                  stringsAsFactors = FALSE)

# case_when() checks the conditions in order and uses the first one that is TRUE
mydf %>%
  mutate(buckets = case_when(
    numerics < 2 ~ 0,
    numerics < 4 ~ 1,
    numerics < 5 ~ 2,
    numerics < 7 ~ 3,
    numerics >= 7 ~ 4
  ))
#>    strings numerics buckets
#> 1        a      0.2       0
#> 2        b      0.4       0
#> 3        c      1.3       0
#> 4        d      5.2       3
#> 5        e      3.3       1
#> 6        f      2.1       1
#> 7        g      7.3       4
#> 8        h      1.1       0
#> 9        i      4.3       2
#> 10       j      8.3       4
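
As a possible alternative that is not part of the original answer, base R's findInterval() and cut() keep all the bucket boundaries in a single vector, which may scale more comfortably when there are many buckets. The sketch below is only an illustration on the question's test data; the names breaks, buckets_le and bucket_label are made up for this example. Note that the default findInterval() call treats a value sitting exactly on a boundary the same way as the `<` comparisons in case_when() above, while left.open = TRUE follows the `<=` comparisons of the nested ifelse() version.

```r
# Test data from the question
mydf <- data.frame(strings = letters[1:10],
                   numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                   stringsAsFactors = FALSE)

# All bucket boundaries live in one vector (hypothetical name);
# adding a bucket means adding one number here
breaks <- c(2, 4, 5, 7)

# findInterval() counts how many breaks are <= each value, so
# x < 2 gives 0, 2 <= x < 4 gives 1, ..., x >= 7 gives 4
mydf$buckets <- findInterval(mydf$numerics, breaks)

# left.open = TRUE counts breaks that are strictly < each value, so
# x <= 2 gives 0, 2 < x <= 4 gives 1, ..., x > 7 gives 4
buckets_le <- findInterval(mydf$numerics, breaks, left.open = TRUE)

# cut() produces the same grouping as a labelled factor;
# right = FALSE makes the intervals closed on the left, like the default above
mydf$bucket_label <- cut(mydf$numerics,
                         breaks = c(-Inf, breaks, Inf),
                         right = FALSE)

print(mydf)
```

Since none of the test values falls exactly on a boundary, both findInterval() variants should give the same buckets for this data; they differ only for values equal to 2, 4, 5 or 7.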
