R - replace values in dataframe based on two matching conditions


Question

I'm working with lists of spatial data for 20+ different sites (difficult to reproduce here; sorry in advance). I have three data frames associated with each site; each has a 'sample_ID' column and some other shared column names.

What I'm trying to do seems very simple: if the 'sample_ID' values match between two data frames and the column names match, replace the value in DF 1 with the value from DF 2, and then do the same with DF 3. Example:

# DF 1:
SAMPLE_ID  CLASS_ID  CLASS  VALUE
    1         0        0      5
    2         0        0      5
    3         0        0      3
    4         0        0      6
    5         0        0      6
    6         0        0      3

# DF 2
SAMPLE_ID  REF_VAL  CLASS_ID  CLASS
    1        33        2      cloud
    2        45        3      water
    3        NA        3      water
    4        NA        4      forest

# DF 3
SAMPLE_ID  CLASS_ID  CLASS  STRATA
    5         3       NA      20
    6         3      water    19

Desired output:

# DF 1:
SAMPLE_ID  CLASS_ID  CLASS  VALUE
    1         2      cloud    5
    2         3      water    5
    3         3      water    3
    4         4      forest   6
    5         3       NA      6
    6         3      water    3

All I can think to do is some sort of match indexing, like:

List1$CLASS_ID <- List2$CLASS_ID[match(List1$SAMPLE_ID, List2$SAMPLE_ID)]
List1$CLASS_ID <- List3$CLASS_ID[match(List1$SAMPLE_ID, List3$SAMPLE_ID)]

But this doesn't work. For one, it produces NAs wherever there is no match (I tried nesting a second match inside the nomatch = argument, but that didn't work either). More importantly, I really need to streamline this by referencing all the matching column names at once rather than going one at a time, since the actual data has 10+ columns that need replacement. Also important: I need the blank NA values to transfer over as well.
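The NA problem described above can be reproduced with a small sketch. The data frames `df1` and `df3` are cut-down stand-ins for the example tables, and the names `idx`, `naive`, and `hit` are illustrative only. The sketch shows that `match()` returns NA for every unmatched row, and that restricting the assignment to matched rows leaves the rest alone.

```r
# Toy data modelled on the example tables (assumed, reduced to two columns).
df1 <- data.frame(SAMPLE_ID = 1:6, CLASS_ID = 0, VALUE = c(5, 5, 3, 6, 6, 3))
df3 <- data.frame(SAMPLE_ID = 5:6, CLASS_ID = 3)

# match() yields NA for each SAMPLE_ID of df1 that is absent from df3 ...
idx <- match(df1$SAMPLE_ID, df3$SAMPLE_ID)

# ... so indexing with idx directly fills the unmatched rows with NA.
naive <- df3$CLASS_ID[idx]

# Assigning only where a match exists keeps the other rows untouched.
hit <- !is.na(idx)
df1$CLASS_ID[hit] <- df3$CLASS_ID[idx[hit]]
```

Here samples 1 to 4 keep their original `CLASS_ID`, while samples 5 and 6 take the value from `df3`.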

Any ideas?

Recommended answer

With base R you can do:

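# Key column plus the columns whose values should be replaced
vars <- c("SAMPLE_ID", "CLASS_ID", "CLASS")
# Stack the replacement rows from both sources
dt23 <- rbind(dt2[, vars], dt3[, vars])
# Keep SAMPLE_ID and VALUE from dt1 and attach the replacement columns
m <- merge(dt1[, c("SAMPLE_ID","VALUE")], dt23, by="SAMPLE_ID", all.x=TRUE)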
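Note that the merge returns its columns in the order SAMPLE_ID, VALUE, CLASS_ID, CLASS, and any sample missing from both dt2 and dt3 would end up with NA rather than its original value. As an alternative, here is a minimal sketch of a match-based update that detects the shared columns automatically. It is a different technique from the merge above, not the answerer's method. The helper name `update_shared` is hypothetical, and the data frames are rebuilt from the example tables on the assumption that CLASS is stored as character.

```r
# Example tables from the question (CLASS assumed to be character).
dt1 <- data.frame(SAMPLE_ID = 1:6, CLASS_ID = 0, CLASS = "0",
                  VALUE = c(5, 5, 3, 6, 6, 3), stringsAsFactors = FALSE)
dt2 <- data.frame(SAMPLE_ID = 1:4, REF_VAL = c(33, 45, NA, NA),
                  CLASS_ID = c(2, 3, 3, 4),
                  CLASS = c("cloud", "water", "water", "forest"),
                  stringsAsFactors = FALSE)
dt3 <- data.frame(SAMPLE_ID = 5:6, CLASS_ID = c(3, 3),
                  CLASS = c(NA, "water"), STRATA = c(20, 19),
                  stringsAsFactors = FALSE)

# Hypothetical helper: overwrite every column that `target` shares with
# `source`, but only in rows whose key value is found in `source`.
update_shared <- function(target, source, key = "SAMPLE_ID") {
  shared <- setdiff(intersect(names(target), names(source)), key)
  idx <- match(target[[key]], source[[key]])
  hit <- !is.na(idx)
  if (!any(hit) || length(shared) == 0) return(target)
  # NA values in `source` are copied over like any other value.
  target[hit, shared] <- source[idx[hit], shared]
  target
}

# Apply dt2 first, then dt3, as described in the question.
result <- update_shared(update_shared(dt1, dt2), dt3)
result
```

Because the shared columns are found with `intersect(names(target), names(source))`, the same call should carry over to the 10+ column case without listing the columns by hand, provided the column names really are identical across the data frames.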
