Conditionally replace regex matches in string

Problem description

I am trying to replace certain patterns in a string with different replacement patterns.

Example:

string test = "test replacing \"these characters\"";

What I want to do is replace all ' ' with '_' and all other non-letter, non-digit characters with an empty string. I have created the following regex and it seems to tokenize correctly, but I am not sure how to (if possible) perform a conditional replace using regex_replace.

string test = "test replacing \"these characters\"";
regex reg("(\\s+)|(\\W+)");

The expected result after replacement is:

string result = "test_replacing_these_characters";

I cannot use Boost, which is why I left it out of the tags, so please no answers that involve Boost. I have to do this with the standard library. It may be that a different regex would accomplish the goal, or that I am just stuck doing two passes.

I did not remember which characters were included in \w at the time of my original regex; after looking it up I have further simplified the expression. Again, the goal is that anything matching \s+ should be replaced with '_' and anything matching \W+ should be replaced with an empty string.

Recommended answer

The C++ (0x, 11, TR1) regular expressions do not really work in every case (see Stack Overflow; look up the phrase "regex" on this page for GCC), so it is better to use Boost for a while.

You can check whether your compiler supports the regular expressions needed:

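#include <string>
#include <iostream>
#include <regex>

using namespace std;

int main() {
    string test = "test replacing \"these characters\"";
    // Replace every run of non-word characters with a single '_'.
    regex reg("[^\\w]+");
    test = regex_replace(test, reg, "_");
    // Prints "test_replacing_these_characters_" (the closing quote also becomes '_').
    cout << test << endl;
}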

The above works in Visual Studio 2012 RC.

Edit 1: Replacing with two different strings in one pass (depending on the match) won't work here, I think. In Perl, this could easily be done with an evaluated replacement expression (the /e switch).

Therefore, you'll need two passes, as you already suspected:

 ...
 string test = "test replacing \"these characters\"";
 test = regex_replace(test, regex("\\s+"), "_");  // pass 1: whitespace runs become '_'
 test = regex_replace(test, regex("\\W+"), "");   // pass 2: drop remaining non-word characters
 ...
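
The two passes can be wrapped into a small helper. This is only a minimal sketch, assuming a standard library whose <regex> is functional; the name sanitize_two_pass is hypothetical and not part of the original answer.

```cpp
#include <regex>
#include <string>

// Hypothetical helper name, introduced only for illustration.
// Pass 1 turns each run of whitespace into a single '_'.
// Pass 2 deletes every remaining non-word character; \W never matches '_',
// so the underscores written by pass 1 survive.
std::string sanitize_two_pass(const std::string& input) {
    static const std::regex whitespace(R"(\s+)");
    static const std::regex non_word(R"(\W+)");
    const std::string with_underscores =
        std::regex_replace(input, whitespace, "_");
    return std::regex_replace(with_underscores, non_word, "");
}
```

The order of the passes matters: running \W+ first would delete the whitespace as well, leaving nothing for the '_' substitution to match.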

Edit 2:

If it were possible to pass a callback function tr() to regex_replace, you could decide on the substitution there, like:

 string output = regex_replace(test, regex("\\s+|\\W+"), tr);

Here tr() does the substitution work:

 string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }

That would solve the problem. Unfortunately, std::regex_replace in the C++11 standard library has no such overload, but Boost has one. The following would work with Boost and use one pass:

...
#include <boost/regex.hpp>
using namespace boost;
...
// Called once per match: '_' for a whitespace match, nothing otherwise.
string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }
...

string test = "test replacing \"these characters\"";
test = regex_replace(test, regex("\\s+|\\W+"), tr);   // <= works in Boost
...

Maybe some day this will work with C++11 or whatever number comes next.
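
Until then, a single pass can be approximated with the standard library alone by walking the matches with std::sregex_iterator and choosing the replacement per match. The following is a sketch under assumptions, not the answer's own method: the name sanitize_one_pass is hypothetical, and the second alternative is written as [^\w\s]+ rather than \W+ so that a run of punctuation cannot swallow an adjacent space.

```cpp
#include <regex>
#include <string>

// Hypothetical helper name, introduced only for illustration.
// Emulates a per-match callback: group 1 (whitespace) becomes '_',
// group 2 (other non-word characters) is dropped.
std::string sanitize_one_pass(const std::string& input) {
    static const std::regex pattern(R"((\s+)|([^\w\s]+))");
    std::string output;
    auto copied_up_to = input.cbegin();
    for (std::sregex_iterator it(input.cbegin(), input.cend(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // Copy the unmatched text that precedes this match.
        output.append(copied_up_to, m[0].first);
        if (m[1].matched) {
            output += '_';
        }
        copied_up_to = m[0].second;
    }
    // Copy whatever follows the last match.
    output.append(copied_up_to, input.cend());
    return output;
}
```

Calling sanitize_one_pass on the question's test string should give the same result as the two-pass version, while scanning the input only once.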

Regards

rbo
