Map numerics to categorical values in R, based on different ranges for the numerics
Question
Hope my title makes sense. I have a dataframe with a column of numeric values, and I would like to use this column to create a new column whereby the numeric values are 'mapped' to different buckets based on their values. Below is some test data, as well as a rough-around-the-edges nested ifelse() approach that I am currently using to solve this problem. I am hoping to code this in a better way that doesn't involve nested ifelse() statements, since this approach doesn't scale well for many buckets:
mydf = data.frame(strings = letters[1:10],
                  numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                  stringsAsFactors = FALSE)
Here is my test dataframe, and here is my nested ifelse() approach to solving my problem:
mydf$buckets = ifelse(mydf$numerics <= 2, 0,
               ifelse(mydf$numerics <= 4, 1,
               ifelse(mydf$numerics <= 5, 2,
               ifelse(mydf$numerics <= 7, 3, 4))))
The code above maps values in the numeric column as follows:
- all values <= 2 go to 0
- all values > 2 and <= 4 go to 1
- all values > 4 and <= 5 go to 2
- all values > 5 and <= 7 go to 3
- all values > 7 go to 4
This approach doesn't scale well beyond a small number of buckets. Any help with this is appreciated! Thanks,
Recommended answer
I really like using case_when in this sort of situation, as already mentioned by @tictocchoc in the comments:
suppressPackageStartupMessages(library(tidyverse))
mydf = data.frame(strings = letters[1:10],
                  numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                  stringsAsFactors = FALSE)
mydf %>%
  mutate(buckets = case_when(
    numerics < 2  ~ 0,
    numerics < 4  ~ 1,
    numerics < 5  ~ 2,
    numerics < 7  ~ 3,
    numerics >= 7 ~ 4
  ))
#>    strings numerics buckets
#> 1        a      0.2       0
#> 2        b      0.4       0
#> 3        c      1.3       0
#> 4        d      5.2       3
#> 5        e      3.3       1
#> 6        f      2.1       1
#> 7        g      7.3       4
#> 8        h      1.1       0
#> 9        i      4.3       2
#> 10       j      8.3       4
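If the number of buckets grows, writing one case_when() line per bucket still gets long. As a possible alternative (a minimal sketch, not part of the original answer), base R's findInterval() and cut() take the bucket edges as a single vector. The names breaks, buckets_lt, buckets_le and bucket_label below are made up for illustration, and left.open assumes R 3.3.0 or later.

```r
mydf <- data.frame(strings = letters[1:10],
                   numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                   stringsAsFactors = FALSE)

# Edges between buckets 0..4 (illustrative name); add more edges for more buckets.
breaks <- c(2, 4, 5, 7)

# findInterval() counts how many breaks are <= each value, which reproduces
# the "<" comparisons in the case_when() call (a value of exactly 2 gets 1).
mydf$buckets_lt <- findInterval(mydf$numerics, breaks)

# left.open = TRUE counts breaks strictly below each value instead, matching
# the "<=" comparisons in the nested ifelse() (a value of exactly 2 gets 0).
mydf$buckets_le <- findInterval(mydf$numerics, breaks, left.open = TRUE)

# cut() does the same bucketing but returns a factor with interval labels.
mydf$bucket_label <- cut(mydf$numerics,
                         breaks = c(-Inf, breaks, Inf),
                         right = FALSE)
```

For this test data no value sits exactly on an edge, so buckets_lt and buckets_le come out the same as the buckets column above. The two variants only differ for values equal to 2, 4, 5 or 7, so it is worth deciding which side of each edge should be inclusive before picking one.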