How to find matching words in a DF from a list of words and return the matched words in a new column
Problem description
I have a DF with two columns, and I have a list of words.
list_of_words <- c("tiger","elephant","rabbit", "hen", "dog", "Lion", "camel", "horse")
df <- tibble::tibble(page = c(12, 6, 9, 18, 2, 15, 81, 65),
                     text = c("I have two pets: a dog and a hen",
                              "lion and Tiger are dangerous animals",
                              "I have tried to ride a horse",
                              "Why elephants are so big in size",
                              "dogs are very loyal pets",
                              "I saw a tiger in the zoo",
                              "the lion was eating a buffalo",
                              "parrot and crow are very clever birds"))
animals <- c("dog,hen", "lion,tiger", "horse", FALSE, "dog", "tiger", "lion", FALSE)
cbind(df, animals)
#>   page                                  text    animals
#> 1   12      I have two pets: a dog and a hen    dog,hen
#> 2    6  lion and Tiger are dangerous animals lion,tiger
#> 3    9          I have tried to ride a horse      horse
#> 4   18      Why elephants are so big in size      FALSE
#> 5    2              dogs are very loyal pets        dog
#> 6   15              I saw a tiger in the zoo      tiger
#> 7   81         the lion was eating a buffalo       lion
#> 8   65 parrot and crow are very clever birds      FALSE
I need to find out whether any of the words from the list appear in the text column of the DF. If they do, the matching word or words should be returned in a new column of the DF. The list of words is (tiger, elephant, rabbit, hen, dog, Lion, camel, horse). The code above shows what my DF looks like, and the animals column in the printed result is the kind of output I want.
Recommended answer
Hope this helps!
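library(dplyr)

df %>%
  rowwise() %>%  # evaluate the expression below one row at a time
  mutate(animals = paste(
    # keep every word whose pattern is found in this row's text (case-insensitive)
    list_of_words[unlist(lapply(list_of_words, function(x) grepl(x, text, ignore.case = TRUE)))],
    collapse = ",")) %>%  # rows without a match get an empty string
  data.frame()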
Output is:
  page                                  text    animals
1   12                       pets: dog & hen    hen,dog
2    6 Lions and tigers are dangerous animal tiger,Lion
3    9          I have tried to ride a horse      horse
4   65   parrot & crow are very clever birds
Sample data:
df <- structure(list(page = c(12, 6, 9, 65), text = structure(c(4L,
2L, 1L, 3L), .Label = c("I have tried to ride a horse", "Lions and tigers are dangerous animal",
"parrot & crow are very clever birds", "pets: dog & hen"), class = "factor")), .Names = c("page",
"text"), row.names = c(NA, -4L), class = "data.frame")
list_of_words <- c("tiger", "elephant", "rabbit", "hen", "dog", "Lion", "camel", "horse")
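One point worth keeping in mind about the approach above: grepl() tests for a substring, so "elephant" also matches "elephants" and "hen" would match "then". The sketch below wraps the same idea in a small helper, match_words() (a hypothetical name, not part of the original answer), with an optional whole_word switch that surrounds each word with \b word boundaries. It uses only base R and is meant as an illustration under those assumptions, not as the answer's own method.

```r
list_of_words <- c("tiger", "elephant", "rabbit", "hen", "dog", "Lion", "camel", "horse")

# Hypothetical helper (not from the original answer): returns the words from
# `words` that are found in the single string `txt`, joined by commas.
match_words <- function(txt, words, whole_word = FALSE) {
  # With whole_word = TRUE, each word must sit between word boundaries (\b)
  patterns <- if (whole_word) paste0("\\b", words, "\\b") else words
  hits <- vapply(patterns,
                 function(p) grepl(p, txt, ignore.case = TRUE),
                 logical(1))
  paste(words[hits], collapse = ",")
}

match_words("I have two pets: a dog and a hen", list_of_words)
# "hen,dog"
match_words("Why elephants are so big in size", list_of_words)
# "elephant"
match_words("Why elephants are so big in size", list_of_words, whole_word = TRUE)
# ""
```

Whether substring or whole-word matching is the right choice depends on the data: the desired output in the question maps "dogs" to "dog", which only substring matching produces.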
Another approach:
library(data.table)
# setDT() converts df in place; by = 1:nrow(df) evaluates the expression per row
setDT(df)[, animals := paste(list_of_words[unlist(lapply(list_of_words,
  function(x) grepl(x, text, ignore.case = TRUE)))], collapse = ","), by = 1:nrow(df)]
#> df
#   page                                  text    animals
#1:   12                       pets: dog & hen    hen,dog
#2:    6 Lions and tigers are dangerous animal tiger,Lion
#3:    9          I have tried to ride a horse      horse
#4:   65   parrot & crow are very clever birds
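Both approaches above evaluate the word list one row at a time. As a possible alternative, the sketch below works on the whole column at once using only base R: it joins the words into a single alternation pattern, extracts every hit with gregexpr() and regmatches(), then lower-cases and de-duplicates the hits. The four-row data frame and the choice of NA for rows without a match are assumptions for illustration (the question asked for FALSE, which can be substituted). Unlike the answers above, the matches come back in the order they appear in the text rather than in the order of list_of_words.

```r
list_of_words <- c("tiger", "elephant", "rabbit", "hen", "dog", "Lion", "camel", "horse")
df <- data.frame(page = c(12, 6, 18, 65),
                 text = c("I have two pets: a dog and a hen",
                          "lion and Tiger are dangerous animals",
                          "Why elephants are so big in size",
                          "parrot and crow are very clever birds"),
                 stringsAsFactors = FALSE)

# One pattern covering all words: "tiger|elephant|...|horse"
pattern <- paste(list_of_words, collapse = "|")

# For each row, a character vector holding every match found in the text
hits <- regmatches(df$text, gregexpr(pattern, df$text, ignore.case = TRUE))

# Lower-case the hits, drop duplicates, and join them with commas
df$animals <- vapply(hits,
                     function(w) paste(unique(tolower(w)), collapse = ","),
                     character(1),
                     USE.NAMES = FALSE)

# Assumption: rows without any match are marked as NA
df$animals[df$animals == ""] <- NA
print(df)
```

If the text column is a factor, as in the sample data above, it would need to be converted with as.character() first.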