grep one pattern over multiple columns

Views: 134 Published: 2020/10/26 4:19:46 Tags: r dplyr grepl

Question

I'm trying to figure out a way for me to use grepl() of only one partial pattern over multiple columns with mutate(). I want to have a new column that will be TRUE or FALSE if ANY of a set of columns contains a certain string.

df <- structure(list(ID = c("A1.1234567_10", "A1.1234567_20"), 
                 var1 = c("NORMAL", "NORMAL"), 
                 var2 = c("NORMAL", "NORMAL"), 
                 var3 = c("NORMAL", "NORMAL"), 
                 var4 = c("NORMAL", "NORMAL"), 
                 var5 = c("NORMAL", "NORMAL"), 
                 var6 = c("NORMAL", "NORMAL"), 
                 var7 = c("NORMAL", "ABNORMAL"), 
                 var8 = c("NORMAL", "NORMAL")), 
            .Names = c("ID", "var1", "var2", "var3", "var4", "var5", "var6", "var7", "var8"), 
            class = "data.frame", row.names = c(NA, -2L))

            ID   var1   var2   var3   var4   var5   var6     var7   var8
1 A1.1234567_10 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL   NORMAL NORMAL
2 A1.1234567_20 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL ABNORMAL NORMAL

I tried

df$abnormal %>% mutate( abnormal = ifelse(grepl("abnormal",df[,119:131]) , TRUE, FALSE)))

and about 100 other things. I want the final format to be

             ID   var1   var2   var3   var4   var5   var6     var7   var8    abnormal
1 A1.1234567_10 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL   NORMAL NORMAL FALSE
2 A1.1234567_20 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL ABNORMAL NORMAL TRUE

Whenever I try, I get FALSE every time.

Recommended answer

I'd probably do it like this:

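# One logical column per checked column: TRUE where the pattern matches
# ("abnormal" is the string from the question; swap in your own pattern)
temp = sapply(your_data[columns_you_want_to_check],
              function(x) grepl("abnormal", x, ignore.case = TRUE))
# A row is flagged if at least one of its columns matched
your_data$abnormal = rowSums(temp) > 0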

I just used your_data since your question switches between df and test.file.
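To make the base R approach concrete, here is a minimal sketch applied to a cut-down copy of the example data from the question. The choice of columns (everything whose name starts with "var"), the names `cols` and `hits`, and the use of `vapply()` instead of `sapply()` are assumptions made for illustration, not part of the original answer.

```r
# Cut-down copy of the question's data (three of the eight var columns)
df <- data.frame(
  ID   = c("A1.1234567_10", "A1.1234567_20"),
  var1 = c("NORMAL", "NORMAL"),
  var7 = c("NORMAL", "ABNORMAL"),
  var8 = c("NORMAL", "NORMAL"),
  stringsAsFactors = FALSE
)

# Assumption: the columns to search are those whose names start with "var"
cols <- grep("^var", names(df), value = TRUE)

# Build a logical matrix: one row per data row, one column per searched column
hits <- vapply(df[cols],
               function(x) grepl("abnormal", x, ignore.case = TRUE),
               logical(nrow(df)))

# Flag rows where at least one searched column matched
df$abnormal <- rowSums(hits) > 0
print(df)
```

Note that `ignore.case = TRUE` matters here: the data is upper case while the pattern is lower case, so a case-sensitive search would report FALSE for every row.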

If you really want to use mutate, you could do

library(dplyr)

# Same logic inside mutate(): search every column starting with "var"
df %>%
  mutate(abnormal = rowSums(
    sapply(select(., starts_with("var")),
           function(x) grepl("abnormal", x, ignore.case = TRUE))
  ) > 0)

If you need more efficiency and can count on the case being consistent, you can use fixed = TRUE instead of ignore.case = TRUE. (Maybe convert everything with tolower() first.)
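As a rough sketch of that suggestion, the snippet below lower-cases the values once and then searches for a literal substring. The vector `x` and the name `is_abnormal` are made-up illustrations, not data from the question.

```r
# Hypothetical values with inconsistent case
x <- c("NORMAL", "Abnormal", "ABNORMAL")

# Normalise the case first, then match the pattern as a plain string
# (fixed = TRUE skips regular-expression interpretation of the pattern)
is_abnormal <- grepl("abnormal", tolower(x), fixed = TRUE)
print(is_abnormal)
```

The two options are alternatives rather than a pair: when `fixed = TRUE` is set, `grepl()` ignores `ignore.case = TRUE` and warns about it, which is why the case is normalised beforehand.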

Leave off the > 0 to get the count for each row.
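For illustration, a small sketch of the counting variant follows. The three-row data frame and the names `hits` and `n_abnormal` are invented for this example.

```r
# Hypothetical data: row 1 has no match, row 2 has one, row 3 has two
df <- data.frame(
  var1 = c("NORMAL", "ABNORMAL", "ABNORMAL"),
  var2 = c("NORMAL", "NORMAL", "abnormal"),
  stringsAsFactors = FALSE
)

# Logical matrix of matches, one column per searched column
hits <- sapply(df, function(x) grepl("abnormal", x, ignore.case = TRUE))

# Without the "> 0" comparison, rowSums() gives the number of matching columns per row
n_abnormal <- rowSums(hits)
print(n_abnormal)
```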
