New column in dataframe based on match between two columns


Problem description

I am trying to create a new column in an existing dataframe (df1) by comparing the values of column x to the values of column y in a different dataframe (df2).

The result should look like df_end. Where there is a match, the value of column x should be returned. Where there is no match, NA should be returned.

df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))

df_end <- data.frame(x = c("blue2", "blue6", "green9", "green7"),
                     match = c("blue2", NA, "green9", NA))

I have experimented with merge, match and if/else statements, but I can't figure it out. Does anyone have advice for me?

#Attempt 1: Merge
df1$match <- merge(df1, df2, by.x = x, all = TRUE)

This does not work, because df1 and df2 are of different length.
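
As a side note, the attempt above most likely fails for two other reasons as well: merge() returns a whole data frame rather than a single column, and by.x expects a quoted column name. Below is a minimal sketch of how a left join could still produce the wanted column. It assumes a small helper data frame (the name `lookup` is made up for illustration) that carries a copy of the key through the join.

```r
df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))

# Hypothetical helper: the key column plus a copy of it that survives the join
lookup <- data.frame(y = df2$y, match = df2$y)

# Left join: keep every row of df1; rows without a partner get NA in `match`
res <- merge(df1, lookup, by.x = "x", by.y = "y", all.x = TRUE)
res
```

Note that merge() sorts the result by the key column by default, so the row order of `res` may differ from that of df1.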

Recommended answer

I did the following:

df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))

end <- sapply(df1$x, function(x) {
  j <- which(df2$y == x)           # positions in df2$y that match this value
  if (length(j) > 0) j[1] else NA  # first matching position, or NA if none
})

cbind(df1, match = df2$y[end]) # index df2$y by position to get the matched values

#       x  match
#1  blue2  blue2
#2  blue6   <NA>
#3 green9 green9
#4 green7   <NA>
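
The per-element loop above can also be written in vectorised form. The following is only a sketch of one alternative, using %in% and ifelse(). It assumes the columns are character vectors rather than factors, which is why stringsAsFactors = FALSE is set explicitly here.

```r
df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(y = c("blue2", "green9"),
                  stringsAsFactors = FALSE)

# TRUE where the value of df1$x also occurs somewhere in df2$y
found <- df1$x %in% df2$y

# Keep the value of x where it was found, otherwise NA
df1$match <- ifelse(found, df1$x, NA)
df1
```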

See Sotos' comment for the best answer: df2$y[match(df1$x, df2$y)]
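
Applied to the data from the question, that one-liner could be used as sketched below. match() returns, for each element of df1$x, the position of its first match in df2$y, or NA when there is none. The intermediate variable `idx` is introduced only for readability.

```r
df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))

# Position of each df1$x value within df2$y (NA where there is no match)
idx <- match(df1$x, df2$y)

# Indexing with NA yields NA, so unmatched rows are filled automatically
df1$match <- df2$y[idx]
df1
```

Because match() is vectorised, no explicit loop or if/else is needed.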
