Detect at least one match between each data frame row and values in vector


Question


My dataframe looks like this:

x1 <- c("a", "c", "f", "j")
x2 <- c("b", "c", "g", "k")
x3 <- c("b", "d", "h", NA)
x4 <- c("a", "e", "i", NA)
df <- data.frame(x1, x2, x3, x4, stringsAsFactors = FALSE)

df

  x1 x2   x3   x4
1  a  b    b    a
2  c  c    d    e
3  f  g    h    i
4  j  k <NA> <NA>

Now I have an arbitrary vector:

vec <- c("a", "i", "s", "t", "z")

I would like to compare the vector values with each row in the data frame and create an additional column that indicates whether at least one (ANY) of the vector values was found or not.

The resulting dataframe should look like this:

  x1 x2   x3   x4 valueFound
1  a  b    b    a          1
2  c  c    d    e          0
3  f  g    h    i          1
4  j  k <NA> <NA>          0

I would like to do it without looping. Thank you very much for your support!

Rami

Solution

Here's one way to do this:

df$valueFound <- apply(df, 1, function(x) {
  if (any(x %in% vec)) {
    1
  } else {
    0
  }
})

df
  x1 x2   x3   x4 valueFound
1  a  b    b    a          1
2  c  c    d    e          0
3  f  g    h    i          1
4  j  k <NA> <NA>          0

Thanks to @David Arenburg and @CathG, here are two more concise approaches:

  • apply(df, 1, function(x) any(x %in% vec) + 0)
  • apply(df, 1, function(x) as.numeric(any(x %in% vec)))
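Since the question asks for a solution without looping, and apply() still iterates over rows internally, a column-wise alternative may be worth considering. The following is a minimal sketch, not part of the original answer: it tests each column against vec in one vectorized call and then combines the columns with a logical OR. The names hits and value_found are chosen here for illustration only.

```r
# Rebuild the example data from the question.
df <- data.frame(
  x1 = c("a", "c", "f", "j"),
  x2 = c("b", "c", "g", "k"),
  x3 = c("b", "d", "h", NA),
  x4 = c("a", "e", "i", NA),
  stringsAsFactors = FALSE
)
vec <- c("a", "i", "s", "t", "z")

# Test each column against vec; this yields one logical vector per column.
# NA entries give FALSE here because vec itself contains no NA.
hits <- lapply(df, function(col) col %in% vec)

# Combine the columns element-wise with OR, then convert TRUE/FALSE to 1/0.
value_found <- as.integer(Reduce(`|`, hits))

value_found
# [1] 1 0 1 0
```

This still loops over the columns, but each column is compared in a single vectorized operation, which tends to matter when the data frame has many more rows than columns. Assigning the result with df$valueFound <- value_found gives the column requested in the question.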

Just for fun, a couple of other interesting variants:

  • apply(df, 1, function(x) any(x %in% vec) %/% TRUE)
  • apply(df, 1, function(x) cumprod(any(x %in% vec)))
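All four one-liners rely on the same idea: any() returns a logical value, and the surrounding arithmetic or conversion coerces it to a number. The sketch below checks that they agree on the example data; the helper row_has_match and the list name variants are hypothetical names introduced here, not part of the original answer.

```r
# Rebuild the example data from the question.
df <- data.frame(
  x1 = c("a", "c", "f", "j"),
  x2 = c("b", "c", "g", "k"),
  x3 = c("b", "d", "h", NA),
  x4 = c("a", "e", "i", NA),
  stringsAsFactors = FALSE
)
vec <- c("a", "i", "s", "t", "z")

# Hypothetical helper: TRUE if any value in the row appears in vec.
row_has_match <- function(x) any(x %in% vec)

# The four one-liners from the answer, each applied row by row.
variants <- list(
  plus_zero  = apply(df, 1, function(x) row_has_match(x) + 0),
  as_numeric = apply(df, 1, function(x) as.numeric(row_has_match(x))),
  int_divide = apply(df, 1, function(x) row_has_match(x) %/% TRUE),
  cum_prod   = apply(df, 1, function(x) cumprod(row_has_match(x)))
)

# Compare each result with the expected column from the question.
expected <- c(1, 0, 1, 0)
agrees <- vapply(variants, function(v) all(unname(v) == expected), logical(1))
agrees
```

Every entry of agrees should be TRUE, so the choice between the variants is mostly a matter of readability; as.numeric() states the intent most directly.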
