Inconsistent behavior of std::regex
Question
I'm running into the following issue:
std::regex behaves differently if I pass it the result of boost::filesystem::path::string() directly versus storing that result in an intermediate string variable first. The first returns a match that is truncated and that std::stoull later rejects (it throws an invalid_argument exception), while the second works perfectly.
The following commands show the issue in more detail:
[nix-shell:~]$ ls -l foo
total 0
-rw-r--r-- 1 amine users 0 Aug 10 16:55 008
-rw-r--r-- 1 amine users 0 Aug 10 15:47 2530047398992289207
[nix-shell:~]$ cat test-1.cpp
#include <iostream>
#include <regex>
#include <string>
#include <boost/filesystem.hpp>
int main() {
    std::regex expression{R"(([0-9]+))"};
    boost::filesystem::path cacheDir("/home/amine/foo");
    for (const auto& entry : boost::filesystem::directory_iterator{cacheDir})
    {
        std::smatch match;
        auto result = std::regex_match(entry.path().filename().string(), match, expression);
        std::cout << "Result: " << result << std::endl
                  << "Length: " << match[1].length() << std::endl
                  << "Match: " << match[1] << std::endl
                  << "Filename: " << entry.path().filename().string() << std::endl
                  << std::endl;
        std::stoull(match[1], 0);
    }
    return 0;
}
[nix-shell:~]$ g++ -o test1 test-1.cpp -lboost_filesystem -O0 -g
[nix-shell:~]$ ./test1
Result: 1
Length: 19
Match: 98992289207
Filename: 2530047398992289207
terminate called after throwing an instance of 'std::invalid_argument'
what(): stoull
Aborted
[nix-shell:~]$ cat test-2.cpp
#include <iostream>
#include <regex>
#include <string>
#include <boost/filesystem.hpp>
int main() {
    std::regex expression{R"(([0-9]+))"};
    boost::filesystem::path cacheDir("/home/amine/foo");
    for (const auto& entry : boost::filesystem::directory_iterator{cacheDir})
    {
        std::smatch match;
        auto what = entry.path().filename().string();
        auto result = std::regex_match(what, match, expression);
        std::cout << "Result: " << result << std::endl
                  << "Length: " << match[1].length() << std::endl
                  << "Match: " << match[1] << std::endl
                  << "Filename: " << entry.path().filename().string() << std::endl
                  << std::endl;
        std::stoull(match[1], 0);
    }
    return 0;
}
[nix-shell:~]$ g++ -o test2 test-2.cpp -lboost_filesystem -O0 -g
[nix-shell:~]$ ./test2
Result: 1
Length: 19
Match: 2530047398992289207
Filename: 2530047398992289207
Result: 1
Length: 3
Match: 008
Filename: 008
So my questions are:
- Why is the result of std::regex truncated when directly using boost::filesystem::path::string()?
- Assuming it is fine for the result in the match variable to be truncated, why would std::stoull throw an exception on it?
Recommended answer
Unfortunately, you have fallen into a trap. In C++11, the overload of std::regex_match you are calling is
template< class STraits, class SAlloc,
class Alloc, class CharT, class Traits >
bool regex_match( const std::basic_string<CharT,STraits,SAlloc>& s,
std::match_results<
typename std::basic_string<CharT,STraits,SAlloc>::const_iterator,
Alloc
>& m,
const std::basic_regex<CharT,Traits>& e,
std::regex_constants::match_flag_type flags =
std::regex_constants::match_default );
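and since it takes a const& to a std::string, you can pass it a temporary string. Unfortunately, std::regex_match is not designed to work with a temporary string: the match_results object stores iterators into the string that was searched, not a copy of its characters. The match itself succeeds while the temporary is still alive, but the temporary is destroyed at the end of that statement, leaving match holding dangling iterators. Reading match[1] afterwards is undefined behavior, which explains both the garbled output and the std::stoull exception: you are referencing data that has gone out of scope.
C++14 fixed this by adding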
template< class STraits, class SAlloc,
class Alloc, class CharT, class Traits >
bool regex_match( const std::basic_string<CharT,STraits,SAlloc>&&,
std::match_results<
typename std::basic_string<CharT,STraits,SAlloc>::const_iterator,
Alloc
>&,
const std::basic_regex<CharT,Traits>&,
std::regex_constants::match_flag_type flags =
std::regex_constants::match_default ) = delete;
so you can no longer pass a temporary string.
If you cannot use C++14, you will need to make sure you never pass a temporary string to std::regex_match, for example by storing the string in a named variable first, as test-2.cpp does.
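As a rough illustration of that advice, here is a minimal sketch of one way to keep the searched string alive for as long as the match object is used. The helper name parse_numeric_name and its out-parameter interface are assumptions made for this example and do not appear in the original post; Boost is left out so the snippet stands on its own.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): parses a name that
// consists only of decimal digits. Returns true and stores the value in
// `out` on success, false if the name does not match.
bool parse_numeric_name(const std::string& name, unsigned long long& out) {
    static const std::regex digits{R"(([0-9]+))"};
    std::smatch match;  // stores iterators into `name`, not a copy of it

    // `name` is an lvalue that outlives `match`, so the iterators stay
    // valid, and the C++14 deleted rvalue overload is never selected.
    if (!std::regex_match(name, match, digits)) {
        return false;
    }

    // Copy the sub-match into its own string before converting it.
    // std::stoull may still throw std::out_of_range for very long inputs.
    out = std::stoull(match[1].str());
    return true;
}
```

With a helper shaped like this, a call such as parse_numeric_name(entry.path().filename().string(), value) should be safe even though the argument is a temporary, because the temporary lives until the end of the full call expression and the match object never escapes the function.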