If data present, replace with data from another column based on row ID

Question

I would like to replace the data in columns rep1 to rep4. The entries in these columns match the unique IDs in the first column. I want to replace each entry in rep1 to rep4 with the number in the value column of the row that has that ID. So, for the second row, "b", I want to replace the "a" in column rep1 with the value from row "a", in this case -400.

ID  rep1  rep2  rep3  rep4  value
a                           -400
b   a                       -300
c   a     b                 -200
d   a     b     c           -300
e   a     b     c     d     -400
f                           -400
g   f                       -400
h                           -400
i                           -200
j   k     l                 -300
k   l                       -200
l                           -300
m                           -300

It seems like ifelse(!is.na()) might be usable here, but I'm not sure how to match the IDs in columns rep1 to rep4 to the corresponding row in the ID column, so as to identify which entry in "value" should be used as the replacement. Can this be done within the same data frame, or does it need to be split into two data frames to work?

Here is the data, as produced by dput():

structure(list(ID = structure(1:13, .Label = c("a", "b", "c", 
"d", "e", "f", "g", "h", "i", "j", "k", "l", "m"), class = "factor"), 
    rep1 = structure(c(1L, 2L, 2L, 2L, 2L, 1L, 3L, 1L, 1L, 4L, 
    5L, 1L, 1L), .Label = c("", "a", "f", "k", "l"), class = "factor"), 
    rep2 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 
    1L, 1L, 1L), .Label = c("", "b", "l"), class = "factor"), 
    rep3 = structure(c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L), .Label = c("", "c"), class = "factor"), rep4 = structure(c(1L, 
    1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", 
    "d"), class = "factor"), value = c(-400L, -300L, -200L, -300L, 
    -400L, -400L, -400L, -400L, -200L, -300L, -200L, -300L, -300L
    )), class = "data.frame", row.names = c(NA, -13L))

Recommended answer

Here is a variant using the tidyverse:

library(dplyr)  # provides %>%, mutate_at() and vars()
df %>% mutate_at(vars(rep1:rep4), ~ value[match(., ID)])

Explanation:

  • mutate_at lets you select a range of variables to modify.
  • The ~ ... . form (formula-style lambda notation, called "quosure style" in the original answer) lets you write an expression in which . (dot) stands for the column being modified. Otherwise you would have to write function(x) df$value[match(x, df$ID)], which is a lot to type.
  • vars() is needed in mutate_at to select columns without quotes (otherwise you would need to use 2:5 or paste0("rep", 1:4)).
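
For comparison, the longer function(x) form mentioned above can be sketched in base R without any packages. This is only an illustration under stated assumptions: the small data frame below is a stand-in for the first five rows of the question's data, it uses character columns rather than factors, and the names rep_cols, lookup, and result are hypothetical helpers that do not appear in the original answer.

```r
# Stand-in for the first five rows of the question's data (assumption:
# plain character columns instead of the factors in the dput() output).
df <- data.frame(
  ID    = c("a", "b", "c", "d", "e"),
  rep1  = c("", "a", "a", "a", "a"),
  rep2  = c("", "", "b", "b", "b"),
  rep3  = c("", "", "", "c", "c"),
  rep4  = c("", "", "", "", "d"),
  value = c(-400L, -300L, -200L, -300L, -400L),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: the columns whose IDs should be replaced.
rep_cols <- paste0("rep", 1:4)

# match() gives the row position of each ID in df$ID, or NA when the cell
# is empty or holds an unknown ID; indexing value with NA yields NA.
lookup <- function(x) df$value[match(x, df$ID)]

# Work on a copy so the original IDs stay available for comparison.
result <- df
result[rep_cols] <- lapply(df[rep_cols], lookup)

print(result)
```

Empty cells come out as NA here because "" matches no ID. With a recent dplyr (1.0.0 or later), where mutate_at is superseded, the same idea can presumably be written as df %>% mutate(across(rep1:rep4, ~ value[match(.x, ID)])).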
