R - Replace multiple patterns with multiple ids

Problem description

This has been partially addressed in other posts, but unfortunately I could not get it to run properly on my side.

I have a data frame full of text, and there are certain words that I want to replace with a unique name.

So, looking at the table below, I want to replace each of the words "Banana", "Apple" and "Tomato" with the word "Fruit" (the word "Fruit" can appear multiple times, that is fine). I also want to replace "Cod", "Pork" and "Beef" with the word "Animals".

I have a full Excel file where this mapping has been done. The Excel file has two columns. Column A holds the unique name (like Fruit and Animals). Column B holds the words that we want to replace in the text (like Banana, Apple, Tomato).

The code I came up with is:

hous <- read.table(header = TRUE, 
                   stringsAsFactors = FALSE, 
                   text="HouseType HouseTypeNo
'Banana Apple Tomato Honey' 'Onion Garlic Pepper Sugar'
'Cod Pork Beef' 'Mushrooms Soya Eggs' ")

maps <- read.table(header = TRUE,
                   stringsAsFactors = FALSE,
                   text="UniqueID Names
'Fruit' 'Banana'
'Fruit' 'Apple'
'Fruit' 'Tomato'
'Animals' 'Cod'
'Animals' 'Pork'
'Animals' 'Beef'")

library(stringr)  # provides str_replace_all()

hous %>% str_replace_all(pattern = maps$Names, replacement = maps$UniqueID)
# Warning message:
# In stri_replace_all_regex(string, pattern, fix_replacement(replacement),  :
#   argument is not an atomic vector; coercing

I cannot make it work. I basically just want to look up certain words and replace them with unique IDs. It doesn't sound complicated, but I cannot get it to run.

Just a few points: in my real data set I have thousands of words and IDs. In other examples I have seen people write their IDs, patterns and replacements by hand. In my case that is not feasible.

The final output would look like this:

hous <- read.table(header = TRUE, 
                   stringsAsFactors = FALSE, 
                   text="HouseType HouseTypeNo
'Fruit Fruit Fruit Honey' 'Onion Garlic Pepper Sugar'
'Animals Animals Animals' 'Mushrooms Soya Eggs' ")

Any help is appreciated.

Best regards

Recommended answer

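You can create a named character vector, with the words to find as names and the IDs as values, and pass it to str_replace_all. Note that it is applied to a single column (a character vector), not to the whole data frame, which is what triggered the coercion warning in the question: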

hous$HouseType <- stringr::str_replace_all(hous$HouseType, 
                            setNames(maps$UniqueID, maps$Names))
hous

#                HouseType               HouseTypeNo
#1 Fruit Fruit Fruit Honey Onion Garlic Pepper Sugar
#2 Animals Animals Animals       Mushrooms Soya Eggs
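
With thousands of words, two follow-up concerns may come up: partial matches (for example "Cod" inside "Codfish") and text spread over several columns. The following is a minimal sketch, not part of the original answer, that wraps each word in `\\b` word boundaries and applies the same lookup to every character column. It assumes the words contain no regex metacharacters. The name `lookup` and the helper vector `is_text` are illustrative only.

```r
library(stringr)

# Same sample data as in the question, built directly as data frames.
hous <- data.frame(
  HouseType   = c("Banana Apple Tomato Honey", "Cod Pork Beef"),
  HouseTypeNo = c("Onion Garlic Pepper Sugar", "Mushrooms Soya Eggs"),
  stringsAsFactors = FALSE
)

maps <- data.frame(
  UniqueID = c("Fruit", "Fruit", "Fruit", "Animals", "Animals", "Animals"),
  Names    = c("Banana", "Apple", "Tomato", "Cod", "Pork", "Beef"),
  stringsAsFactors = FALSE
)

# Names are the regex patterns, values are the replacements.
# \\b restricts each pattern to whole words
# (assumption: the words contain no regex metacharacters).
lookup <- setNames(maps$UniqueID, paste0("\\b", maps$Names, "\\b"))

# Apply the same lookup to every character column of the data frame.
is_text <- vapply(hous, is.character, logical(1))
hous[is_text] <- lapply(hous[is_text], str_replace_all, lookup)

hous
```

Because the names of the vector are treated as regular expressions, a word such as "Codfish" is left untouched here, while a standalone "Cod" is still replaced. The replacements are applied one after another in the order of the mapping, so a replacement value that itself matches a later pattern would be replaced again.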
