Regex Replacing &#58; to ":" etc
Question
I've got a bunch of strings like:
"Hello, here's a test colon&#58;. Here's a test semi-colon&#59;"
I would like to replace that with
"Hello, here's a test colon:. Here's a test semi-colon;"
and so on for all printable ASCII values.
At present I'm using boost::regex_search to match &#(\d+);, building up a string as I process each match in turn (including appending the unmatched text between the previous match and the current one).
Can anyone think of a better way of doing it? I'm open to non-regex methods, but regex seemed a reasonably sensible approach in this case.
Thanks,
Dom
Recommended answer
The big advantage of using a regex is that it deals correctly with tricky cases like &#38;#38;, which should decode to &#38; and stop there. Entity replacement isn't iterative; it's a single step. The regex is also going to be fairly efficient: the two lead characters are fixed, so it will quickly skip anything not starting with &#. Finally, the regex solution is one without a lot of surprises for future maintainers.
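To illustrate why the single-step property matters, here is a small sketch of a hypothetical alternative that runs one whole-string replacement pass per entity. The names replace_all and naive_decode are made up for this illustration and do not come from the question. Because a later pass rescans text produced by an earlier one, the input &#38;#58; is decoded twice and ends up as ":" instead of "&#58;".

```cpp
#include <string>

// Replace every occurrence of `from` in `s` with `to`, scanning left to right.
std::string replace_all(std::string s, const std::string& from, const std::string& to) {
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the text just inserted
    }
    return s;
}

// Hypothetical naive decoder: one pass over the whole string per entity.
// Output of the first pass is scanned again by the second pass, which is
// exactly the iterative behaviour a single regex pass avoids.
std::string naive_decode(std::string s) {
    s = replace_all(s, "&#38;", "&");
    s = replace_all(s, "&#58;", ":");
    return s;
}
```

A decoder that walks the matches once, left to right, never re-reads its own output, so it does not have this problem.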
I'd say a regex was the right choice.
Is it the best regex, though? You know you need at least two digits, and if you have three digits, the first one will be a 1. Printable ASCII is, after all, &#32; to &#126;. For that reason, you could consider &#1?\d\d;. Note that this tighter pattern still accepts a few values outside 32 to 126 (such as &#10; or &#127;), so a range check after the conversion is still worthwhile.
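As a quick check of what the tighter pattern accepts, the sketch below tests whole strings against it. It uses std::regex rather than boost::regex purely so that it has no external dependencies (that swap is an assumption of this sketch), and the helper names are illustrative.

```cpp
#include <regex>
#include <string>

// True if `s` is, in its entirety, an entity accepted by the tighter pattern.
bool matches_tight_pattern(const std::string& s) {
    static const std::regex tight("&#1?\\d\\d;");
    return std::regex_match(s, tight);
}

// True if `code` lies in the printable ASCII range 32..126.
bool is_printable_ascii(int code) {
    return code >= 32 && code <= 126;
}
```

With these helpers, "&#58;" and "&#126;" match, while "&#5;" and "&#256;" do not. "&#127;" matches the pattern but fails is_printable_ascii(127), which is why the range check is kept separate.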
As for replacing the content, I'd use the basic algorithm described for boost::regex_replace:
For each match  // using regex_iterator<>
    Print the prefix of the match
    Remove the first 2 and last character of the match (&#;)
    lexical_cast the result to int, then truncate to char and append.
Print the suffix of the last match.
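The steps above could be sketched roughly as follows. This is one possible reading, not the answerer's code, and it makes a few substitutions: std::sregex_iterator and std::stoi stand in for boost::regex_iterator and lexical_cast so the example needs no external libraries, a capture group is used instead of trimming the first two and last characters, and the function name decode_entities is made up. Entities outside printable ASCII are left untouched, which is an assumption about the desired behaviour.

```cpp
#include <regex>
#include <string>

// Decode numeric entities for printable ASCII (32..126) in a single pass.
std::string decode_entities(const std::string& input) {
    static const std::regex entity("&#(1?\\d\\d);");
    std::string out;
    auto last = input.cbegin();  // end of the previous match
    const std::sregex_iterator end;
    for (std::sregex_iterator it(input.cbegin(), input.cend(), entity); it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);            // text between the previous match and this one
        const int code = std::stoi(m[1].str());  // the digits between "&#" and ";"
        if (code >= 32 && code <= 126) {
            out.push_back(static_cast<char>(code));
        } else {
            out.append(m[0].first, m[0].second); // not printable ASCII: keep the entity as-is
        }
        last = m[0].second;
    }
    out.append(last, input.cend());              // text after the last match
    return out;
}
```

Called on the example string from the question, this should give "Hello, here's a test colon:. Here's a test semi-colon;", and "&#38;#38;" becomes "&#38;" because the decoded ampersand is never scanned again.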