Why will this regular expression not compile?
Question
I would like to use the regular expression from here:
https://tools.ietf.org/html/rfc3986#appendix-B
I want to compile it like this:
#include <regex.h>
...
regex_t regexp;
if (regcomp(&regexp, "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?", REG_EXTENDED) != 0) {
    return SOME_ERROR;
}
But I am stuck with the return value of regcomp:
REG_BADRPT
According to the man page, it means:
Invalid use of repetition operators such as using * as the first character.
Another man page gives a similar meaning:
?, * or + is not preceded by valid regular expression
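To see the wording your own platform uses for a regcomp failure, the numeric code can be passed to regerror. The following is a minimal sketch; compile_error is a hypothetical helper name, not something from the question, and the exact message text differs between implementations.

```cpp
#include <regex.h>
#include <string>

// Hypothetical helper (not from the original post): compiles `pattern` as a
// POSIX extended regex. Returns an empty string on success, or the
// implementation's error message if regcomp rejects the pattern.
std::string compile_error(const char *pattern) {
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc == 0) {
        regfree(&re);          // release the compiled pattern
        return std::string();  // no error
    }
    char buf[256];
    regerror(rc, &re, buf, sizeof buf);  // translate the code into text
    return std::string(buf);
}
```

Calling compile_error with the failing pattern from the question should return a non-empty message describing the misplaced repetition operator.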
I wrote a parser using my own regular expression, but I would like to test this one too, since it is officially in the RFC. I do not intend to use it for validation though.
Recommended answer
As Oli Charlesworth suggested, you need to escape the backslash (write \\) in front of the question mark in \?. See the C++ escape sequences for more information.
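The effect of the escape sequences can be checked without regcomp at all, by looking at what the compiler leaves in the string. This is only an illustrative sketch: the three function names are made up for the example, and the raw string literal assumes a C++11 or later compiler.

```cpp
#include <string>

// Illustrative helpers (hypothetical names): each returns the characters
// that actually end up in the string after the compiler handles the literal.

// "\?" is a valid escape sequence for a plain question mark.
std::string single_escaped() { return "\?"; }

// "\\" becomes one backslash, so the string holds a backslash and '?'.
std::string double_escaped() { return "\\?"; }

// A raw string literal is not escape-processed: backslash and '?' survive.
std::string raw_literal() { return R"(\?)"; }
```

Only the second and third forms hand regcomp the two characters \? that it reads as a literal question mark.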
Test program
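#include <regex.h>
#include <iostream>

// Compile rx as a POSIX extended regex and report whether regcomp accepted it.
void test_regcomp(const char *rx) {
    regex_t regexp;
    if (regcomp(&regexp, rx, REG_EXTENDED) != 0) {
        std::cout << "ERROR :" << rx << "\n";
    } else {
        std::cout << " OK :" << rx << "\n";
        regfree(&regexp);  // release the compiled pattern
    }
}

int main()
{
    // rx1: "\?" is a C++ escape for a plain '?', so regcomp receives "(?(" and fails.
    const char *rx1 = "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?";
    // rx2: "\\\?" yields a backslash followed by '?', i.e. a literal question mark.
    const char *rx2 = "^(([^:/\?#]+):)\?(//([^/\?#]*))\?([^\?#]*)(\\\?([^#]*))\?(#(.*))\?";
    test_regcomp(rx1);
    test_regcomp(rx2);
    return 0;
}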
Output
ERROR :^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(?([^#]*))?(#(.*))?
OK :^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
The \? in your regex is the source of the REG_BADRPT error. In a C or C++ string literal, \? is an escape sequence that the compiler converts to a plain ?, so regcomp receives (?( and finds a ? with nothing before it to repeat. If you replace it with \\?, regcomp will be able to compile your regex.
"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?"
OK :^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
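Once the pattern compiles, regexec can be used to pull the URI components out of the match groups. The following is a rough sketch rather than a definitive implementation: split_uri is a hypothetical helper, and it uses a C++11 raw string literal as an alternative to doubling the backslash. Per RFC 3986 Appendix B, group 2 is the scheme, 4 the authority, 5 the path, 7 the query and 9 the fragment.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answer). Returns the 10
// match groups (index 0 is the whole match) of the RFC 3986 Appendix B
// expression, or an empty vector if compilation or matching fails.
std::vector<std::string> split_uri(const std::string &uri) {
    // Raw string literal: the backslash reaches regcomp without doubling.
    static const char *pattern =
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)";

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return {};
    }

    const size_t ngroups = 10;  // whole match plus 9 parenthesized groups
    regmatch_t m[ngroups];
    std::vector<std::string> out;
    if (regexec(&re, uri.c_str(), ngroups, m, 0) == 0) {
        for (size_t i = 0; i < ngroups; ++i) {
            // rm_so == -1 marks a group that did not take part in the match.
            out.push_back(m[i].rm_so < 0
                              ? std::string()
                              : uri.substr(m[i].rm_so, m[i].rm_eo - m[i].rm_so));
        }
    }
    regfree(&re);
    return out;
}
```

For the example URI used in the RFC, http://www.ics.uci.edu/pub/ietf/uri/#Related, index 2 of the result holds the scheme and index 9 the fragment. As the question notes, this expression splits a reference into components; it does not validate it.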