How do I replace a string exactly using gsub()?


Question

I have a corpus: txt = "a patterned layer within a microelectronic pattern." I would like to replace the term "pattern" with "form", matching the whole word only. I tried this code:

txt_replaced = gsub("pattern","form",txt)

However, the resulting text in txt_replaced is: "a formed layer within a microelectronic form."

As you can see, "patterned" is wrongly changed to "formed", because the substring "pattern" inside "patterned" also matches the pattern.

Is there a way to replace the string exactly using gsub()? That is, only a term that matches exactly, as a whole word, should be replaced.

The result I am hoping for is: "a patterned layer within a microelectronic form."

Thanks a lot!

Recommended answer

As @koshke noted, a very similar question has been answered before (by me). But that was grep and this is gsub, so I'll answer it again:

"\<" is an escape sequence that matches the beginning of a word, and "\>" matches the end of a word. In R strings you need to double the backslashes, so:

txt <- "a patterned layer within a microelectronic pattern."
txt_replaced <- gsub("\\<pattern\\>","form",txt)
txt_replaced
# [1] "a patterned layer within a microelectronic form."

Or, you could use \b instead of \< and \>. \b matches a word boundary, so it can be used at both ends:

txt_replaced <- gsub("\\bpattern\\b","form",txt)
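As a quick check, here is a minimal sketch that runs both patterns on the example string from the question and compares the results. The variable names with_angle and with_b are illustrative only, and both calls assume R's default regex engine (no perl = TRUE).

```r
txt <- "a patterned layer within a microelectronic pattern."

# Word-start and word-end anchors
with_angle <- gsub("\\<pattern\\>", "form", txt)

# Generic word-boundary anchor at both ends
with_b <- gsub("\\bpattern\\b", "form", txt)

print(with_b)
# [1] "a patterned layer within a microelectronic form."

# Both patterns give the same result on this input
print(identical(with_angle, with_b))
# [1] TRUE
```

"patterned" is left alone in both cases because the "e" that follows "pattern" is a word character, so there is no word boundary at that position.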

Also note that if you want to replace only the first occurrence, you should use sub instead of gsub.
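To illustrate the difference, here is a small sketch on a made-up string (txt2 is not from the question) that contains the whole word "pattern" twice:

```r
# Hypothetical input with two whole-word matches and one partial match
txt2 <- "pattern one, pattern two, patterned three"

# sub() replaces only the first whole-word match
first_only <- sub("\\bpattern\\b", "form", txt2)
print(first_only)
# [1] "form one, pattern two, patterned three"

# gsub() replaces every whole-word match
all_matches <- gsub("\\bpattern\\b", "form", txt2)
print(all_matches)
# [1] "form one, form two, patterned three"
```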
