Extracting the original regex pattern from std::regex

Question

I have a function that attempts to match a given string against a given regex pattern. If the string does not match, the function should create an error string that says so and includes both the regex pattern that failed and the content of the string. Something like this:

#include <regex>
#include <sstream>
#include <string>
#include <vector>

bool validate_content(const std::string & str, const std::regex & pattern, std::vector<std::string> & errors)
{
    if ( false == std::regex_match(str, pattern) )
    {
        std::stringstream error_str;
        // Desired, but does not compile: std::regex has no operator<<.
        // error_str << "Pattern match failure: " << pattern << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}

However, as you can see, the commented-out line presents a challenge: is it possible to recover the original pattern from the regex object?

The obvious workaround is to provide the original pattern string instead of, or alongside, the regex object and use that. But I would of course prefer to avoid the extra work of either recreating the regex object every time this function is called (paying the cost of reparsing the pattern on each call) or passing the pattern string along with the regex object (prone to typos and errors unless I write a wrapper that does it for me, which is not as convenient).

I'm using GCC 4.9.2 on Ubuntu 14.04.

Recommended answer

boost::basic_regex objects have a str() function that returns a copy of the character string used to construct the regular expression. (They also provide begin() and end() interfaces, which return iterators into that character sequence, as well as a mechanism for introspecting capture subexpressions.)

These interfaces were in the initial TR1 regex standardization proposal but were removed in 2003, after the adoption of n1499: Simplifying Interfaces in basic_regex, which is why std::regex has no equivalent. I quote from that paper:


basic_regex Should Not Keep a Copy of its Initializer

The basic_regex template has a member function str which returns a string object that holds the text used to initialize the basic_regex object… While it might occasionally be useful to look at the initializer string, we ought to apply the rule that you don't pay for it if you don't use it. Just as fstream objects don't carry around the file name that they were opened with, basic_regex objects should not carry around their initializer text. If someone needs to keep track of that text they can write a class that holds the text and the basic_regex object.
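
The wrapper suggested in the quote could look roughly like the following. This is only a minimal sketch, not a standard or Boost facility: the class name named_regex and its accessors str() and regex() are hypothetical names chosen for illustration, and validate_content is adapted from the question to take the wrapper instead of a bare std::regex.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper (not part of the standard library): stores the
// pattern text next to the compiled std::regex so it can be reported later.
class named_regex
{
public:
    explicit named_regex(std::string pattern)
        : text_(std::move(pattern)), compiled_(text_) {}

    // The text the regex was built from.
    const std::string & str() const { return text_; }
    // The compiled regex, parsed once in the constructor.
    const std::regex & regex() const { return compiled_; }

private:
    std::string text_;     // declared first, so it is initialized before compiled_
    std::regex compiled_;
};

// Variant of the question's function that takes the wrapper.
bool validate_content(const std::string & str, const named_regex & pattern, std::vector<std::string> & errors)
{
    if (!std::regex_match(str, pattern.regex()))
    {
        errors.push_back("Pattern match failure: " + pattern.str() + ", content: " + str);
        return false;
    }
    return true;
}
```

With this, a caller constructs named_regex once (for example named_regex digits("[0-9]+");) and passes it to validate_content on every call, so the pattern is parsed a single time and the text used in the error message cannot drift from the compiled regex. The cost is the one extra string copy per regex that n1499 chose not to impose on everyone.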
