Partial animal string matching in R
Question
I have a data frame:
d<-data.frame(name=c("brown cat", "blue cat", "big lion", "tall tiger",
"black panther", "short cat", "red bird",
"short bird stuffed", "big eagle", "bad sparrow",
"dog fish", "head dog", "brown yorkie",
"lab short bulldog"), label=1:14)
I'd like to search the name column and, if any of the words "cat", "lion", "tiger", or "panther" appears, assign the character string "feline" to the corresponding row of a new column, species.
If any of the words "bird", "eagle", or "sparrow" appears, I want to assign the character string "avian" to the corresponding row of species.
If any of the words "dog", "yorkie", or "bulldog" appears, I want to assign the character string "canine" to the corresponding row of species.
Ideally, I'd store this in a list or something similar that I can keep at the beginning of the script, because as new variants of the species show up in the name column, it would be nice to have easy access to update what qualifies as feline, avian, and canine.
This question is almost answered here (How to create new column in dataframe based on partial string matching other column in R), but it doesn't address the multiple name twist that is present in this problem.
Recommended answer
There may be a more elegant solution than this, but you could use grep with | to specify alternative matches.
d[grep("cat|lion|tiger|panther", d$name), "species"] <- "feline"
d[grep("bird|eagle|sparrow", d$name), "species"] <- "avian"
d[grep("dog|yorkie", d$name), "species"] <- "canine"
I've assumed you meant "avian", and left out "bulldog" since it contains "dog".
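As a quick check of why "bulldog" can be left out: grep and grepl match the pattern anywhere in the string, so "dog" is also found inside "bulldog". The vector x below is made up for illustration and is not part of the original data frame; the word-boundary variant is only a sketch of one way to require a whole-word match, should that be preferred.

```r
# Illustrative vector (an assumption, not taken from the question's data frame)
x <- c("head dog", "lab short bulldog", "dog fish", "brown cat")

# "dog" is matched as a substring, so "bulldog" is picked up as well
substr_hits <- grepl("dog", x)

# With word boundaries (\\b), only the standalone word "dog" matches
word_hits <- grepl("\\bdog\\b", x)

print(substr_hits)
print(word_hits)
```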
You might want to add ignore.case = TRUE to the grep call.
Output:
#                 name label species
#1           brown cat     1  feline
#2            blue cat     2  feline
#3            big lion     3  feline
#4          tall tiger     4  feline
#5       black panther     5  feline
#6           short cat     6  feline
#7            red bird     7   avian
#8  short bird stuffed     8   avian
#9           big eagle     9   avian
#10        bad sparrow    10   avian
#11           dog fish    11  canine
#12           head dog    12  canine
#13       brown yorkie    13  canine
#14  lab short bulldog    14  canine
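The question also asks for the keywords to live in a list at the top of the script. A minimal sketch of one way to combine that with the grep-and-| idea might look like the following; the name species_keywords and the loop are assumptions for illustration, not part of the original answer. Each keyword vector is collapsed into a single pattern with paste(..., collapse = "|").

```r
# Same data frame as in the question
d <- data.frame(name = c("brown cat", "blue cat", "big lion", "tall tiger",
                         "black panther", "short cat", "red bird",
                         "short bird stuffed", "big eagle", "bad sparrow",
                         "dog fish", "head dog", "brown yorkie",
                         "lab short bulldog"), label = 1:14)

# Keyword lookup kept at the top of the script (hypothetical name);
# the list names are the species labels to assign
species_keywords <- list(
  feline = c("cat", "lion", "tiger", "panther"),
  avian  = c("bird", "eagle", "sparrow"),
  canine = c("dog", "yorkie", "bulldog")
)

# Start with NA so unmatched rows are easy to spot
d$species <- NA_character_

for (sp in names(species_keywords)) {
  # Collapse the keywords into one pattern, e.g. "cat|lion|tiger|panther"
  pattern <- paste(species_keywords[[sp]], collapse = "|")
  # Label every row whose name contains any of the keywords
  d$species[grepl(pattern, d$name, ignore.case = TRUE)] <- sp
}

print(d)
```

Adding a new variant then only means editing the relevant vector in species_keywords. Note that with this sketch a name matching keywords from more than one group ends up with the label of whichever group comes last in the list.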