R - Assign a value/factor in a data.frame to a column conditioned on value(s) of other columns


Question

set.seed(8)
df <- data.frame(n = rnorm(5,1), m = rnorm(5,0), l = factor(LETTERS[1:5]))

How can I make a new column in df conditioned on values, or combinations of values, of n, m and l? For instance, make a vector level and assign it low, medium or high based on the values of both n and m (pseudo-code):

df$level <- ifelse(df$n < 1 & df$m < 1, "low", ifelse(df$n > 1 & df$m > 1, "high", "medium"))

This should give:

df$level

#low medium low low medium 

Or, if I would like to assign a value to level based on the l column and a value in n (again, pseudo-code):

df$level <- ifelse(df$n < 1 & df$l %in% c("A", "B"), "low A/B", "high")

In this case one should get:

df$level

#"low A/B" "high" "high" "high" "high"
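One detail worth checking in a condition like this is how a column is compared against several candidate values. As a minimal sketch (the vector `l` and the reordered `l_swapped` below are illustrative, not part of the question's data frame), `==` recycles the shorter vector position by position, whereas `%in%` tests membership for each element:

```r
# Illustrative factor with the same levels as df$l in the question
l <- factor(LETTERS[1:5])

# `==` recycles c("A", "B") along l: A==A, B==B, C==A, D==B, E==A
# (R warns that the longer length is not a multiple of the shorter one)
by_equals <- suppressWarnings(l == c("A", "B"))

# `%in%` asks, for each element of l, whether it is one of the candidates
by_in <- l %in% c("A", "B")

# Swap the first two elements: the order is now B, A, C, D, E
l_swapped <- l[c(2, 1, 3, 4, 5)]

# Position-wise comparison now misses both matches (B==A, A==B, ...)
by_equals_swapped <- suppressWarnings(l_swapped == c("A", "B"))

# Membership is unaffected by the row order
by_in_swapped <- l_swapped %in% c("A", "B")
```

On the original order the two comparisons happen to agree, because A and B line up with the recycled positions. After the swap, only `%in%` still flags the first two elements.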

Recommended answer

You could also do:

 c("high", "medium", "low")[rowSums(df[,-3] <1)+1]
#[1] "low"    "medium" "low"    "low"    "medium"

c("high", "low A/B")[(df$n <1 &grepl("A|B", df$l)) +1]
#[1] "low A/B" "high"    "high"    "high"    "high"   
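The second one-liner can be unpacked into steps. The sketch below uses the question's data; the intermediate names `cond`, `idx`, `level_lookup` and `level_in` are introduced here only for illustration. It also shows an `%in%` variant, on the assumption that exact matching of the levels is what is wanted, since `grepl("A|B", ...)` matches substrings and would also accept a level such as "AB".

```r
# Same data as in the question
set.seed(8)
df <- data.frame(n = rnorm(5, 1), m = rnorm(5, 0), l = factor(LETTERS[1:5]))

# Step 1: logical condition, TRUE where n < 1 and l contains "A" or "B"
cond <- df$n < 1 & grepl("A|B", df$l)

# Step 2: adding 1 coerces FALSE/TRUE to 0/1, giving the index 1 or 2
idx <- cond + 1

# Step 3: index a lookup vector; position 1 is "high", position 2 is "low A/B"
level_lookup <- c("high", "low A/B")[idx]

# Variant with exact matching of whole values via %in%
level_in <- ifelse(df$n < 1 & df$l %in% c("A", "B"), "low A/B", "high")
```

With single-letter levels, as here, both variants give the same labels.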

    Explanation

    • df[,-3] drops the third column, leaving the numeric columns n and m.
    • df[,-3] <1 gives a logical matrix: TRUE where an element is <1, FALSE otherwise.
    • rowSums on the above counts the TRUE values in each row, so it gives one of three values: 0 if neither value is <1, 1 if exactly one is <1, and 2 if both are <1.

      rowSums(df[,-3] <1) #in this example, there are no values equal to 0
      #[1] 2 1 2 2 1
      

    • Adding 1 to the above gives:

      rowSums(df[,-3] <1) +1
      #[1] 3 2 3 3 2
      

    • Using the above as a numeric index, we can do:

        c("high", "medium", "low")[rowSums(df[,-3] <1)+1]
        #[1] "low"    "medium" "low"    "low"    "medium"
      

    • low occupies the positions where the index is 3, medium the positions where it is 2, and high would occupy any position where it is 1 (there are none in this example).
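The steps above can be collected into a small helper and checked against the nested ifelse() from the question. This is only a sketch: `classify_level` and its arguments are hypothetical names introduced here, not part of the answer.

```r
# Same data as in the question
set.seed(8)
df <- data.frame(n = rnorm(5, 1), m = rnorm(5, 0), l = factor(LETTERS[1:5]))

# Hypothetical helper: count how many of the chosen numeric columns fall
# below `threshold` in each row, then use count + 1 as an index into
# labels ordered from "none below" to "all below"
classify_level <- function(data, cols, threshold = 1,
                           labels = c("high", "medium", "low")) {
  stopifnot(length(labels) == length(cols) + 1)
  below <- rowSums(data[, cols, drop = FALSE] < threshold)
  labels[below + 1]
}

df$level <- classify_level(df, c("n", "m"))

# Nested ifelse() version from the question, for comparison
level_ifelse <- ifelse(df$n < 1 & df$m < 1, "low",
                       ifelse(df$n > 1 & df$m > 1, "high", "medium"))
```

On this data the two approaches agree. They can differ when a value is exactly 1: the index approach treats it as "not below 1" and so counts it toward high, while the nested ifelse(), which requires > 1 for high, would label such a row medium.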
