TR1 regex: capture groups?
Question
I am using TR1 regular expressions (in VS2010), and I am trying to search for one pattern as a group called "name" and another pattern as a group called "value". I think what I want is called a capture group, but I'm not sure that's the right terminology. I want to assign matches of the pattern "([^:\r\n]+):\s" to a list of matches called "name", and matches of the pattern "([^\r\n]+)\r\n" to a list of matches called "value".
The regex pattern I have so far is
string pattern = "((?<name>[^:\r\n]+):\s(?<value>[^\r\n]+)\r\n)+";
But the TR1 regex header keeps throwing an exception when the program runs. What's wrong with the syntax of the pattern I have? Can someone show an example pattern that would do what I'm trying to accomplish?
Also, how would it be possible to include a substring within the pattern to match, but not actually include that substring in the results? For example, I want to match all strings of the pattern
"http://[[:alpha:]]\r\n"
, but I don't want to include the substring "http://" in the returned results of matches.
Recommended answer
The C++ TR1 and C++11 regular expression grammars don't support named capture groups. You'll have to do unnamed capture groups.
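As a rough sketch of what working with unnamed groups could look like, the snippet below treats group 1 as the "name" and group 2 as the "value". It assumes C++11 `std::regex` (with TR1 the same classes live in `std::tr1`), a single `\s` after the colon as in the question, and a hypothetical helper called `parse_pairs` that is not part of the original post. It matches one line at a time and iterates, because a capture group inside a repeated `(...)+` only keeps the text from its last repetition.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects (name, value) pairs from "name: value\r\n" lines.
std::vector<std::pair<std::string, std::string>> parse_pairs(const std::string& text) {
    // Group 1 stands in for "name", group 2 for "value".
    // Backslashes are doubled so the regex engine sees \r, \n and \s.
    static const std::regex line_re("([^:\\r\\n]+):\\s([^\\r\\n]+)\\r\\n");

    std::vector<std::pair<std::string, std::string>> pairs;
    for (std::sregex_iterator it(text.begin(), text.end(), line_re), end; it != end; ++it) {
        const std::smatch& m = *it;
        assert(m.size() == 3);  // whole match plus the two capture groups
        pairs.emplace_back(m[1].str(), m[2].str());
    }
    return pairs;
}
```

Calling `parse_pairs("Host: example\r\n")` would yield one pair whose `first` is the name and whose `second` is the value, so the two "lists" from the question are simply the first and second members of each pair.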
Also, make sure you don't run into escaping issues. You'll have to escape some characters twice: one for being in a C++ string, and another for being in a regex. The pattern (([^:\r\n]+):\s\s([^\r\n]+)\r\n)+
can be written as a C++ string literal like this:
"(([^:\\r\\n]+):\\s\\s([^\\r\\n]+)\\r\\n)+"
// or in C++11
R"xxx((([^:\r\n]+):\s\s([^\r\n]+)\r\n)+)xxx"
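To make the escaping point concrete, here is a small sketch that checks the two spellings produce the same pattern text and that the pattern compiles, while a named group does not. The helper names (`compiles`, `escaped_literal`, `raw_literal`) are made up for illustration, and the code assumes a C++11 compiler with raw string support, which VS2010 does not have.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` compiles under the default
// (ECMAScript) grammar, false if the constructor throws std::regex_error.
bool compiles(const std::string& pattern) {
    try {
        std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// Ordinary literal: each backslash meant for the regex engine is doubled.
std::string escaped_literal() {
    return "(([^:\\r\\n]+):\\s\\s([^\\r\\n]+)\\r\\n)+";
}

// Raw literal (C++11): backslashes reach the regex engine unchanged.
std::string raw_literal() {
    return R"xxx((([^:\r\n]+):\s\s([^\r\n]+)\r\n)+)xxx";
}
```

With these, `escaped_literal() == raw_literal()` holds, `compiles(escaped_literal())` returns true, and `compiles("(?<name>a)")` returns false, which is the exception the question describes.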
Lookbehinds are not supported either. You'll have to work around this limitation by using capture groups: use the pattern (http://)([[:alpha:]]\r\n)
and grab only the second capture group.
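A minimal sketch of that workaround follows. Two things in it are assumptions rather than part of the answer: the helper name `text_after_scheme`, and the `+` after `[[:alpha:]]`, added on the guess that a run of letters is wanted instead of a single letter.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns what follows "http://" on each matching line,
// without the "http://" prefix itself.
std::vector<std::string> text_after_scheme(const std::string& text) {
    // Group 1 is the prefix we want to match but discard;
    // group 2 is the part we actually keep.
    static const std::regex re("(http://)([[:alpha:]]+)\\r\\n");

    std::vector<std::string> results;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        results.push_back((*it)[2].str());  // only the second capture group
    }
    return results;
}
```

Since the prefix is never read back, it does not strictly need its own capture group: writing it as `(?:http://)` or as a plain `http://` would make the wanted text group 1 instead.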