Same regex, different results on Linux and Windows (C++ only)
Problem description
I have this pattern, based on the ECMAScript (ECMA-262) regex grammar, for my C++ command-line program:
^s?([/|@#])(?:(?!\1).)+\1(?:(?!\1).)*\1(?:(?:gi?|ig)?(?:\1\d\d?)?|i)?$
This pattern checks whether the user has entered a correct command.
It is tested against a string like this:
optional-s/one-or-more/anything/optional-g-or-i/optional-2-digits
My previous question explains why I need this pattern.
It works fine on Linux but not on Windows. I know that line breaks differ between the two systems, and I have
read this: How are \n and \r handled differently on Linux and Windows?
My program does not read any files; it only takes the first command-line argument, argv[ 1 ],
and std::regex_match tests whether the synopsis the user entered is correct.
For example: ./program 's/one/two/' *.txt
simply renames one to two for all .txt files.
The C++ code:
// requires <regex> and <string>
std::string argv_1 = argv[ 1 ]; // e.g. s/one/two/
bool rename_is_correct =
    std::regex_match( argv_1, std::basic_regex< char >
        ( "s?([/|@#])(?:(?!\\1).)+\\1(?:(?!\\1).)*\\1(?:(?:gi?|ig)?(?:\\1-?[1-9]\\d?)?|i)?" ) );
The problem:
The pattern should not be greedy, yet on Windows it behaves greedily and matches more than 4 delimiters. It should not match /one/two/three/four/five/
but this string is matched!
NOTE:
- I have deliberately dropped the ^ and $ assertions, since std::regex_match in C++ matches the whole string by default, so there is no need for them.
- Also, of the two backslashes in \\, one is the escape character of the C++ string literal.
- The JavaScript code says no:
const regex = /^s?([/|@#])(?:(?!\1).)+\1(?:(?!\1).)*\1((?:gi?|gi)\1-?[1-9]\d|i)?$/gm;
var str = 's/one/two/gi/-33/';
if( str.match( regex ) ){
console.log( "okay" );
} else {
console.log( "no" );
}
- Perl also says no, as you can see in the screenshot, but C++ says okay.
Does anyone know why it becomes greedy?
Thanks.
There seems to have been a bug in GCC that got fixed in version 5.4. My guess is you are running an older version on your Windows set-up.
See the difference in output:
- Version 4.9: "okay" (wrong)
- Version 5.4: "no" (right)
It does not seem to make a difference whether boost is included or not.
The bug is related to (?!\\1): replacing it with (?![/]) (in both instances) solves the issue, but obviously that limits the regular expression to the / delimiter only:
- Version 4.9 with (?![/]): "no" (correct)
Also, the bug appears with this simple regular expression, which should reject an input like aa: (.)((?!\\1).)
- Version 5.4: "no" (right)
- Version 4.9: "okay" (wrong)
Conclusion: make sure to install GCC version 5.4 or higher.