R - Assign a value/factor in a data.frame to a column conditioned on value(s) of other columns


Question

set.seed(8)
df <- data.frame(n = rnorm(5,1), m = rnorm(5,0), l = factor(LETTERS[1:5]))

How can I make a new column in df conditioned on values, or combinations of values, of n, m and l? For instance, make a vector level and assign it low, medium or high based on the values of both n and m (pseudo-code):

df$level <- ifelse(df$n < 1 & df$m < 1, "low", ifelse(df$n > 1 & df$m > 1, "high", "medium"))

This should give:

df$level

#low medium low low medium 

Or, if I would like to assign a value to level based on the l column and a value in n (again, pseudo-code):

df$level <- ifelse(df$n < 1 & df$l %in% c("A", "B"), "low A/B", "high")

In this case one should get:

df$level

#"low A/B" "high" "high" "high" "high"

Recommended answer

You could also do:

 c("high", "medium", "low")[rowSums(df[,-3] <1)+1]
#[1] "low"    "medium" "low"    "low"    "medium"

c("high", "low A/B")[(df$n <1 &grepl("A|B", df$l)) +1]
#[1] "low A/B" "high"    "high"    "high"    "high"   
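
The second expression relies on a single logical vector: adding 1 turns FALSE into index 1 and TRUE into index 2. Below is a minimal sketch of that idea on hypothetical toy data (the `toy` data frame is not from the question). It swaps `grepl` for `%in%`, on the assumption that exact level matching is wanted, since `grepl("A|B", ...)` would also match a level such as "AB".

```r
# Hypothetical toy data, chosen so the result can be checked by hand
toy <- data.frame(n = c(0.5, 1.5, 0.2, 2.0),
                  m = c(-1.0, 0.3, 2.0, 1.5),
                  l = factor(c("A", "B", "C", "A")))

# TRUE only where n is below 1 AND l is exactly "A" or "B"
cond <- toy$n < 1 & toy$l %in% c("A", "B")

# FALSE + 1 is 1 ("high"), TRUE + 1 is 2 ("low A/B")
label_ab <- c("high", "low A/B")[cond + 1]
label_ab
# [1] "low A/B" "high"    "high"    "high"
```

Only the first row satisfies both conditions; the third row has n below 1 but its level is "C", so it falls back to "high".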

Explanation

    • df[,-3] selects the numeric columns, i.e. n and m.
    • df[,-3] <1 gives a logical matrix: TRUE where an element is less than 1, FALSE otherwise.
    • Applying rowSums to that matrix gives three possible values per row, 0, 1 or 2: 0 when neither value is below 1, 1 when exactly one is, and 2 when both are.

      rowSums(df[,-3] <1) #in this example, there are no values equal to 0
      #[1] 2 1 2 2 1
      

    • Adding 1 to the above gives us

      rowSums(df[,-3] <1) +1
      #[1] 3 2 3 3 2
      

    • Using the above as a numeric index, we can do:

        c("high", "medium", "low")[rowSums(df[,-3] <1)+1]
        #[1] "low"    "medium" "low"    "low"    "medium"
      

    • low occupies the positions where the index is 3 and medium where it is 2; if any index were 1, high would occupy that position.
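
Because the example data never produces an index of 1, "high" does not appear in the output above. The steps can be traced on a small hypothetical data frame (`toy`, not from the question) in which all three labels occur. This is only a sketch of the same indexing technique under that assumed data:

```r
# Hypothetical toy data; column 3 is the factor, as in the question's df
toy <- data.frame(n = c(0.5, 1.5, 0.2, 2.0),
                  m = c(-1.0, 0.3, 2.0, 1.5),
                  l = factor(c("A", "B", "C", "A")))

below  <- toy[, -3] < 1     # logical matrix: TRUE where n or m is below 1
counts <- rowSums(below)    # number of TRUEs per row: 2 1 1 0
idx    <- counts + 1        # shift to 1-based positions: 3 2 2 1

level <- c("high", "medium", "low")[idx]
level
# [1] "low"    "medium" "medium" "high"
```

Note that this approach labels a row "high" when neither value is below 1, so a value exactly equal to 1 counts toward "high", whereas the nested ifelse in the question (which tests > 1) would label such a row "medium".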
