Partial animal string matching in R

Question

I have a data frame,

d<-data.frame(name=c("brown cat", "blue cat", "big lion", "tall tiger",
                     "black panther", "short cat", "red bird",
                     "short bird stuffed", "big eagle", "bad sparrow",
                     "dog fish", "head dog", "brown yorkie",
                     "lab short bulldog"), label=1:14)

I'd like to search the name column and, if any of the words "cat", "lion", "tiger", or "panther" appears, assign the character string feline to the corresponding row of a new column, species.

If any of the words "bird", "eagle", or "sparrow" appears, I want to assign the character string avian to the corresponding row of species.

If any of the words "dog", "yorkie", or "bulldog" appears, I want to assign the character string canine to the corresponding row of species.

Ideally, I'd store the keywords in a list or something similar that I can keep at the beginning of the script. As new variants of each species show up in the name column, it would be nice to have one easy place to update what qualifies as feline, avian, or canine.

This question is almost answered here (How to create new column in dataframe based on partial string matching other column in R), but that answer doesn't address the multiple-name twist present in this problem.

Recommended answer

There may be a more elegant solution than this, but you could use grep with | in the pattern to specify alternative matches.

d[grep("cat|lion|tiger|panther", d$name), "species"] <- "feline"
d[grep("bird|eagle|sparrow", d$name), "species"] <- "avian"
d[grep("dog|yorkie", d$name), "species"] <- "canine"

I've assumed you meant "avian", and I left "bulldog" out of the pattern because any name containing "bulldog" already matches "dog".
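The remark about "bulldog" follows from grep matching substrings rather than whole words. If whole-word matching is ever wanted, a word-boundary anchor is one option. The sketch below is only an illustration: the vector x and the name "hotdog stand" are made up and do not come from the question.

```r
# Hypothetical example names; "hotdog stand" is invented for illustration
x <- c("dog fish", "lab short bulldog", "hotdog stand")

# Plain substring match: "dog" is also found inside "bulldog" and "hotdog"
substring_hits <- grepl("dog", x)

# \\b anchors the match at word boundaries, so only the standalone word "dog" matches
whole_word_hits <- grepl("\\bdog\\b", x, perl = TRUE)
```

Here substring_hits is TRUE for all three names, while whole_word_hits is TRUE only for "dog fish". For the data in the question, substring matching is the behaviour that makes "lab short bulldog" come out as canine, so the anchor is only worth adding if false positives become a problem.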

You might want to add ignore.case = TRUE to the grep calls so that capitalized names such as "Cat" also match.
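As a rough illustration of that suggestion, the sketch below applies the same three patterns with ignore.case = TRUE. The data frame d2 and its mixed-case names are assumptions made up for this example, not part of the original question.

```r
# Hypothetical data with mixed-case names (not from the question)
d2 <- data.frame(name = c("Brown Cat", "RED BIRD", "head dog"),
                 stringsAsFactors = FALSE)

# Same patterns as above; ignore.case = TRUE makes the match case-insensitive
d2[grep("cat|lion|tiger|panther", d2$name, ignore.case = TRUE), "species"] <- "feline"
d2[grep("bird|eagle|sparrow", d2$name, ignore.case = TRUE), "species"] <- "avian"
d2[grep("dog|yorkie", d2$name, ignore.case = TRUE), "species"] <- "canine"
```

Without ignore.case = TRUE, "Brown Cat" and "RED BIRD" would not match the lowercase patterns and their species would stay NA.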

Output:

#                 name label species
#1           brown cat     1  feline
#2            blue cat     2  feline
#3            big lion     3  feline
#4          tall tiger     4  feline
#5       black panther     5  feline
#6           short cat     6  feline
#7            red bird     7   avian
#8  short bird stuffed     8   avian
#9           big eagle     9   avian
#10        bad sparrow    10   avian
#11           dog fish    11  canine
#12           head dog    12  canine
#13       brown yorkie    13  canine
#14  lab short bulldog    14  canine
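The question also asks for the keywords to live in a list at the top of the script. One possible way to do that, sketched here and not part of the original answer, is a named list whose names are the species labels, looped over with grepl. The name species_keywords is hypothetical.

```r
d <- data.frame(name = c("brown cat", "blue cat", "big lion", "tall tiger",
                         "black panther", "short cat", "red bird",
                         "short bird stuffed", "big eagle", "bad sparrow",
                         "dog fish", "head dog", "brown yorkie",
                         "lab short bulldog"),
                label = 1:14, stringsAsFactors = FALSE)

# Hypothetical lookup list kept at the top of the script:
# each name is a species label, each element its keywords
species_keywords <- list(
  feline = c("cat", "lion", "tiger", "panther"),
  avian  = c("bird", "eagle", "sparrow"),
  canine = c("dog", "yorkie", "bulldog")
)

# Start with NA so unmatched names are easy to spot
d$species <- NA_character_

for (sp in names(species_keywords)) {
  # Collapse the keywords into one alternation pattern, e.g. "cat|lion|tiger|panther"
  pattern <- paste(species_keywords[[sp]], collapse = "|")
  d$species[grepl(pattern, d$name, ignore.case = TRUE)] <- sp
}
```

With the data from the question this gives the same species column as the output above. Adding a new variant only means adding a keyword to the list. If a name matches keywords from more than one species, the later list entry overwrites the earlier one, so the order of the list matters in that case.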
