"Binning" rows into ranges (dplyr/R)


Problem description

I am having a lot of difficulty trying to place rows from a dataset into "bins". For example, suppose I have a data frame "df" with columns "var1" and "var2".

I want to create a new variable called "var3" that follows this logic (R code):

1) if var1 < 5 and var2 < 5 .... then var3 = "a"
2) if var1 between (5,10) and var2 between (5,10) .... then var3 = "b"
3) if var1 > 10 and var2 > 10 .... then var3 = "c"

From a previous question I posted (If statements with multiple ranges (R)), I tried the following logic:

library(dplyr)
df %>%
  mutate(var3 = case_when(var1 < 5 & var2 < 5 ~ 'a', 
                          var1 > 5 & var1 < 10 & var2 > 5 & var2 < 10 ~ 'b', 
                          var1 >10 & var2 >10 ~ 'c'))

But when I inspect df$var3, the logic does not seem to be correct: some entries of var3 have no value. (Note: the smallest possible value of var1 and var2 is 0.)

Can someone please help me?

Thanks

UPDATE:

Sample dataset:

a <- rnorm(50,10,10)
b <- rnorm(50, 2,8)

var1 = abs(a)
var2 = abs(b)

df = data.frame(var1, var2)

Solution

Try nested ifelse() calls with inclusive upper bounds, so that every row receives a label:

library(dplyr)

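set.seed(123)  # make the random sample reproducible
# 100 rows of whole numbers between 0 and 20
df <- data.frame(var1 = round(runif(100) * 20, 0),
                 var2 = round(runif(100) * 20, 0))

# "a": both <= 5; "b": both <= 10 (and not "a"); "c": everything else
df <- df %>%
  mutate(var3 = ifelse(var1 <= 5 & var2 <= 5, "a",
                       ifelse(var1 <= 10 & var2 <= 10, "b", "c")))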
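The case_when() attempt in the question returns NA whenever no condition matches, for example when a value is exactly 5 or 10, or when var1 and var2 fall into different ranges. The sketch below compares that strict version with a case_when() that mirrors the nested ifelse() logic. It assumes that the bounds should be inclusive and that every unmatched row belongs in "c"; both assumptions come from this answer, not from the question. The data frame toy and its values are hypothetical and chosen only to hit the boundary and mixed cases.

library(dplyr)

# Hypothetical rows: clean "a", both exactly 5, var2 exactly 10, clean "c", mixed ranges
toy <- data.frame(var1 = c(2, 5, 7, 12, 3),
                  var2 = c(3, 5, 10, 15, 12))

toy <- toy %>%
  mutate(
    # Strict inequalities and no fallback, as in the question
    strict = case_when(var1 < 5 & var2 < 5 ~ "a",
                       var1 > 5 & var1 < 10 & var2 > 5 & var2 < 10 ~ "b",
                       var1 > 10 & var2 > 10 ~ "c"),
    # Inclusive bounds; the first matching condition wins and TRUE catches the rest
    closed = case_when(var1 <= 5 & var2 <= 5 ~ "a",
                       var1 <= 10 & var2 <= 10 ~ "b",
                       TRUE ~ "c")
  )

toy

In this toy data, strict is NA for the second, third, and fifth rows, while closed has a label in every row. If mixed rows such as var1 = 3, var2 = 12 should not be labelled "c", replace the TRUE branch with an explicit condition and decide separately what to do with the remaining rows.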

To check the result, plot the points colored by var3:

library(ggplot2)

df %>%
  ggplot() + geom_point(aes(x = var1, y = var2, color = var3))
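The plot can be complemented by a tabular check. The following is a minimal base R sketch, not part of the original answer: it rebuilds the same df, counts the labels (including any NA), and cross-tabulates the range of each variable so that rows whose var1 and var2 fall into different ranges become visible. The helper names bin1 and bin2 and the label strings are chosen here for illustration.

set.seed(123)
df <- data.frame(var1 = round(runif(100) * 20, 0),
                 var2 = round(runif(100) * 20, 0))

# Same labelling rule as above, written without dplyr
df$var3 <- ifelse(df$var1 <= 5 & df$var2 <= 5, "a",
                  ifelse(df$var1 <= 10 & df$var2 <= 10, "b", "c"))

# Rows per label; useNA = "ifany" adds an NA entry only if some label is missing
table(df$var3, useNA = "ifany")

# Bin each variable on its own; cut() uses right-closed intervals by default,
# which gives (-Inf, 5], (5, 10], (10, Inf]
bin1 <- cut(df$var1, breaks = c(-Inf, 5, 10, Inf),
            labels = c("<=5", "(5,10]", ">10"))
bin2 <- cut(df$var2, breaks = c(-Inf, 5, 10, Inf),
            labels = c("<=5", "(5,10]", ">10"))

# Off-diagonal cells are the rows where var1 and var2 sit in different ranges
table(bin1, bin2)

The off-diagonal counts show how many rows the three rules in the question never covered; under the nested ifelse() rule those rows end up in "b" or "c".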
