How to ensure replaceAll will replace a whole word and not a substring
Problem description
I have a dictionary as input. I iterate over the dictionary and replace each key found in the text with its value. But the replaceAll function also replaces the key when it occurs as a substring of a longer token.
How can I ensure that it matches only the whole word (as a whole, and not as a substring)?
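import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
String text = "Synthesis of 1-(2,6-dimethylbenzyl)-1H-indole-6-carboxylic acid [69-3] The titled compound (883 mg) sdvfshd[69-3]3456 as a white solid was prepared";
Map<String, String> dictionary = new HashMap<>();
dictionary.put("[69-3]", "1-(2,6-dimethylbenzyl)-1H-indole-6-carboxylic acid"); // dictionary = {[69-3]=...}
for (Map.Entry<String, String> entry : dictionary.entrySet()) {
    // Attempt: wrap the quoted key in word boundaries
    text = text.replaceAll("\\b" + Pattern.quote(entry.getKey()) + "\\b", entry.getValue());
}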
Recommended answer
replaceAll takes a regular expression as its first parameter.
In regular expressions, \b denotes a word boundary (write \\b in a Java string literal). Word boundaries are usually the best way to ensure you're matching a whole word and not part of a word: "\\bword\\b"
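For an ordinary word, the word-boundary approach can be sketched as follows. The class name WordBoundaryDemo and the helper replaceWholeWord are hypothetical names chosen for illustration; Pattern.quote and Matcher.quoteReplacement are standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: replaces `word` only where it stands alone as a whole word.
    static String replaceWholeWord(String text, String word, String replacement) {
        // \b on both sides requires a word/non-word transition at each end of the match.
        String regex = "\\b" + Pattern.quote(word) + "\\b";
        // quoteReplacement keeps any '$' or '\' in the replacement literal.
        return text.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "cat" inside "concat" is left alone; the standalone occurrences are replaced.
        System.out.println(replaceWholeWord("cat concat cat.", "cat", "dog"));
        // prints "dog concat dog."
    }
}
```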
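But in your case, you can't use word boundaries, because you're not looking for a word: [69-3] starts and ends with non-word characters. A \b next to [ or ] only matches when the neighbouring character is a word character, so the \b version matches the embedded occurrence (sdvfshd[69-3]3456) and skips the standalone one.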
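A small check makes the problem with \b around a bracketed key visible. This is only an illustrative sketch; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class BracketKeyBoundaryDemo {
    // Hypothetical helper: does "\bKEY\b" find the key anywhere in the text?
    static boolean foundWithWordBoundaries(String text, String key) {
        return Pattern.compile("\\b" + Pattern.quote(key) + "\\b").matcher(text).find();
    }

    public static void main(String[] args) {
        // Standalone key: space before '[' and after ']' means no word boundary, so no match.
        System.out.println(foundWithWordBoundaries("acid [69-3] The", "[69-3]"));   // prints "false"
        // Embedded key: 'd' before '[' and '3' after ']' create boundaries, so it matches.
        System.out.println(foundWithWordBoundaries("sdvfshd[69-3]3456", "[69-3]")); // prints "true"
    }
}
```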
I suggest:
text=text.replaceAll("(?<=\\W|^)"+Pattern.quote("[69-3]")+"(?=\\W|$)", ...
The idea is to require, on each side of the key, either the start or end of the string or a character that is not a word character. I can't guarantee this is the right solution for you, though: a pattern like this has to be tuned to your exact, complete use case.
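Applied to the dictionary loop from the question, that idea might look like the sketch below. It assumes the negative lookarounds (?<!\w) and (?!\w), which say "not preceded by" and "not followed by" a word character and therefore also cover the start and end of the string. The class and method names are hypothetical, and the short replacement value "X-1" is made up for the demo.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DictionaryReplaceDemo {
    // Hypothetical helper: replaces each dictionary key only where it is not
    // glued to a word character on either side.
    static String replaceKeys(String text, Map<String, String> dictionary) {
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            // (?<!\w): no word character just before; (?!\w): none just after.
            String regex = "(?<!\\w)" + Pattern.quote(entry.getKey()) + "(?!\\w)";
            text = text.replaceAll(regex, Matcher.quoteReplacement(entry.getValue()));
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("[69-3]", "X-1"); // made-up short value for readability
        String text = "acid [69-3] The compound sdvfshd[69-3]3456 end";
        System.out.println(replaceKeys(text, dictionary));
        // prints "acid X-1 The compound sdvfshd[69-3]3456 end"
    }
}
```

Whether punctuation such as "(" or "," next to a key should count as a valid delimiter depends on the data, so the lookarounds may need adjusting.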
Note that if all your keys follow a similar pattern, there may be a better solution than iterating over the dictionary. For example, you could use a single pattern such as "(?<=\\W|^)\\[\\d+-\\d+\\](?=\\W|$)".
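One possible shape for the single-pattern approach is a find loop that looks each match up in the dictionary. This is a sketch under assumptions: all keys look like [digits-digits], unknown keys are left unchanged, and the class name, method name, and sample values are hypothetical. It uses the standard Matcher.appendReplacement and appendTail calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePatternReplaceDemo {
    // Matches keys such as [69-3] that are not glued to a word character.
    private static final Pattern KEY = Pattern.compile("(?<!\\w)\\[\\d+-\\d+\\](?!\\w)");

    // Hypothetical helper: one pass over the text, one map lookup per match.
    static String replaceKeys(String text, Map<String, String> dictionary) {
        Matcher m = KEY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keys missing from the dictionary are copied through unchanged.
            String value = dictionary.getOrDefault(m.group(), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("[69-3]", "A"); // made-up values
        dictionary.put("[70-1]", "B");
        System.out.println(replaceKeys("x [69-3] y [70-1] z [99-9] w[69-3]", dictionary));
        // prints "x A y B z [99-9] w[69-3]"
    }
}
```

Compared with looping over the dictionary, this scans the text once regardless of how many keys there are.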