Exact matching text with dataframe column in R
Question
I have a vector of words in R:
words = c("Awesome","Loss","Good","Bad")
And I have the following dataframe in R:
df <- data.frame(ID = c(1,2,3),
Response = c("Today is an awesome day",
"Yesterday was a bad day,but today it is good",
"I have losses today"))
I want to extract the words that match exactly in the Response column and insert them into a new column of the dataframe. The final output should look like this:
ID  Response                                      Match
1   Today is an awesome day                       Awesome
2   Yesterday was a bad day,but today it is good  Bad,Good
3   I have losses today                           NA
I used the following code:
x <- sapply(words, function(x) grepl(tolower(x), tolower(df$Response)))
# Paste the matching words together
df$Words <- apply(x, 1, function(i) paste0(names(i)[i], collapse = ","))
But this returns partial matches as well (for example, "Loss" matches inside "losses"), not only exact ones. Please help.
Recommended answer
Change the function passed to the first *apply call into a two-line function. If the regex becomes "\\bword\\b", it matches the word only when it is surrounded by word boundaries.
x <- sapply(words, function(x) {
y <- paste0("\\b", x, "\\b")
grepl(tolower(y), tolower(df$Response))
})
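To see what the word boundary changes, here is a minimal sketch that compares a plain substring search with the "\\b" version on single strings. The variable names (`response`, `substring_hit`, and so on) are made up for illustration and are not part of the original answer.

```r
# Compare substring matching with whole-word matching on one response.
response <- tolower("I have losses today")

# Plain substring search: "loss" is found inside "losses".
substring_hit <- grepl("loss", response)

# \\b marks a word boundary, so "loss" must stand alone as a word.
whole_word_hit <- grepl("\\bloss\\b", response)

# A comma also counts as a boundary, as in "day,but".
punct_hit <- grepl("\\bbad\\b", "yesterday was a bad day,but today it is good")

print(c(substring = substring_hit,
        whole_word = whole_word_hit,
        punctuation = punct_hit))
```

The first search succeeds while the second does not, which is why row 3 ends up with no match once the boundaries are added.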
Now run the second apply call as posted in the question.
df$Words <- apply(x, 1, function(i) paste0(names(i)[i], collapse = ","))
df
# ID Response Words
#1 1 Today is an awesome day Awesome
#2 2 Yesterday was a bad day,but today it is good Good,Bad
#3 3 I have losses today
As for the NA, I would use the function is.na<-.
is.na(df$Words) <- df$Words == ""
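The replacement form of is.na can look unusual, so here is a small sketch of the same idiom on a standalone vector. The vector `matched` is a hypothetical stand-in for df$Words after the paste step.

```r
# Hypothetical vector standing in for df$Words after the paste step.
matched <- c("Awesome", "Good,Bad", "")

# Assigning a logical index through is.na<- sets those elements to NA.
is.na(matched) <- matched == ""

print(matched)
```

Only the empty string is replaced; the other elements are left untouched.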
Data.
df <- read.table(text = "
ID Response
1 'Today is an awesome day'
2 'Yesterday was a bad day,but today it is good'
3 'I have losses today'
", header = TRUE)
words <- c("Awesome","Loss","Good","Bad")
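For convenience, the steps above could be wrapped into one function. This is only a sketch under assumptions: `match_words` is a hypothetical helper name, and it swaps the tolower() calls for grepl's ignore.case = TRUE argument and sapply for vapply, which should behave the same way for this data.

```r
# Hypothetical helper (not from the original answer) that wraps both steps.
match_words <- function(responses, words) {
  # One logical column per word: TRUE where the word occurs as a whole word.
  hits <- vapply(words, function(w) {
    grepl(paste0("\\b", w, "\\b"), responses, ignore.case = TRUE)
  }, logical(length(responses)))
  # vapply returns a plain vector for a single response; force a matrix.
  hits <- matrix(hits, nrow = length(responses), ncol = length(words))
  # Collapse the matching words per row, then turn empty strings into NA.
  out <- apply(hits, 1, function(i) paste0(words[i], collapse = ","))
  is.na(out) <- out == ""
  out
}

df <- data.frame(ID = c(1, 2, 3),
                 Response = c("Today is an awesome day",
                              "Yesterday was a bad day,but today it is good",
                              "I have losses today"))
words <- c("Awesome", "Loss", "Good", "Bad")

df$Words <- match_words(df$Response, words)
print(df)
```

Note that the words are treated as regular expressions here, so a word containing characters such as "." or "(" would need escaping first.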