Using regex to replace matches in place

Question

I'm attempting to do a certain type of "string expansion", wherein I replace keys with strings from a database. The format of the tag is {$<key>}.

I'm using <regex> to try and get this done but I've run into a bit of a logistical problem. I want to be able to replace the strings in one pass, but modifying the string (s) can invalidate the iterators found in the smatch objects.

Here is more or less what I'm trying to do:

#include <iostream>
#include <map>
#include <regex>

using namespace std;

int main()
{
    map<string, string> m;

    m.insert(make_pair("severity", "absolute"));
    m.insert(make_pair("experience", "nightmare"));

    string s = "This is an {$severity} {$experience}!";
    static regex e("\\{\\$(.*?)\\}");
    sregex_iterator next(s.begin(), s.end(), e);
    sregex_iterator end;

    for (; next != end; ++next)
    {
        auto m_itr = m.find(next->str(1));

        if (m_itr == m.end())
        {
            continue;
        }

        //TODO: replace expansion tags with strings somehow?

        cout << (*next).str(0) << ":" << m_itr->second << endl;
    }
}

The desired end result is that s reads:

"This is an absolute nightmare!"

I know that I could perform this type of thing in multiple passes, but that seems a bit brutish.

I read somewhere that boost::regex had some variation of regex_replace that allowed a custom replacement function in this form:

regex_replace(std::string&, regex, std::string(const smatch&))

However, my current version (1.55) has no such thing.

Any help is greatly appreciated!

P.S. I can use either boost or std for this, whichever works!

Recommended Answer

So, in addition to the comment I made 8 hours ago:

Perhaps related: Compiling a simple parser with Boost.Spirit (http://stackoverflow.com/questions/9404558/compiling-a-simple-parser-with-boost-spirit/9405546#9405546), replacing pieces of string (http://stackoverflow.com/questions/17241897/replacing-pieces-of-string/17243219#17243219), How to expand environment variables in .ini files using Boost (http://stackoverflow.com/questions/17112494/how-to-expand-environment-variables-in-ini-files-using-boost/17126962#17126962) and, perhaps most interestingly, Fast multi-replacement into string (http://stackoverflow.com/questions/22571578/fast-multi-replacement-into-string/22571753#22571753).

I saw room for one more approach. What if... you needed to do many many replacements based on the same text template, but using different replacement maps?

Since I've recently discovered how Boost ICL can be useful in mapping regions of input strings (http://stackoverflow.com/questions/28397514/randomly-selecting-specific-subsequence-from-string/28401533?s=1|0.0000#28401533), I wanted to do the same here.

I made things pretty generic, and employed Spirit to do the analysis (study):

template <
    typename InputRange,
    typename It = typename boost::range_iterator<InputRange const>::type,
    typename IntervalSet = boost::icl::interval_set<It> >
IntervalSet study(InputRange const& input) {
    using std::begin;
    using std::end;

    It first(begin(input)), last(end(input));

    using namespace boost::spirit::qi;
    using boost::spirit::repository::qi::seek;

    IntervalSet variables;

    parse(first, last, *seek [ raw [ "{$" >> +alnum >> "}" ] ], variables);

    return variables;
}

As you can see, instead of doing any replacements, we just return an interval_set<It> so we know where our variables are. This is now the "wisdom" that can be used to perform the replacements from a map of replacement strings:

template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename StudyMap,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, StudyMap const& wisdom, OutputIterator out) 
{
    using std::begin;
    using std::end;

    It current(begin(input));

    for (auto& replace : wisdom)
    {
        It l(lower(replace)),
           u(upper(replace));

        if (current < l)
            out = std::copy(current, l, out);

        auto match = m.find({l+2, u-1});
        if (match == m.end())
            out = std::copy(l, u, out);
        else
            out = std::copy(begin(match->second), end(match->second), out);

        current = u;
    }

    if (current!=end(input))
        out = std::copy(current, end(input), out);
    return out;
}

Now, a simple test program would be like this:

int main()
{
    using namespace std;
    string const input = "This {$oops} is an {$severity} {$experience}!\n";
    auto const wisdom = study(input);

    cout << "Wisdom: ";
    for(auto& entry : wisdom)
        cout << entry;

    auto m = map<string, string> {
            { "severity",   "absolute"  },
            { "OOPS",       "REALLY"    },
            { "experience", "nightmare" },
        };

    ostreambuf_iterator<char> out(cout);
    out = '\n';

    perform_replacements(input, m, wisdom, out);

    // now let's use a case insensitive map, still with the same "study"
    map<string, string, ci_less> im { m.begin(), m.end() };
    im["eXperience"] = "joy";

    perform_replacements(input, im, wisdom, out);
}

Prints

Wisdom: {$oops}{$severity}{$experience}
This {$oops} is an absolute nightmare!
This REALLY is an absolute joy!

You could call it for an input string literal, use an unordered_map for the replacements, etc. You could also omit the wisdom, in which case the implementation will study the input on the fly.

Live On Coliru

#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <boost/icl/interval_set.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>

namespace boost { namespace spirit { namespace traits {
    template <typename It>
        struct assign_to_attribute_from_iterators<icl::discrete_interval<It>, It, void> {
            template <typename ... T> static void call(It b, It e, icl::discrete_interval<It>& out) {
                out = icl::discrete_interval<It>::right_open(b, e);
            }
        };
} } }

template <
    typename InputRange,
    typename It = typename boost::range_iterator<InputRange const>::type,
    typename IntervalSet = boost::icl::interval_set<It> >
IntervalSet study(InputRange const& input) {
    using std::begin;
    using std::end;

    It first(begin(input)), last(end(input));

    using namespace boost::spirit::qi;
    using boost::spirit::repository::qi::seek;

    IntervalSet variables;    
    parse(first, last, *seek [ raw [ "{$" >> +alnum >> "}" ] ], variables);

    return variables;
}

template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename StudyMap,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, StudyMap const& wisdom, OutputIterator out) 
{
    using std::begin;
    using std::end;

    It current(begin(input));

    for (auto& replace : wisdom)
    {
        It l(lower(replace)),
           u(upper(replace));

        if (current < l)
            out = std::copy(current, l, out);

        auto match = m.find({l+2, u-1});
        if (match == m.end())
            out = std::copy(l, u, out);
        else
            out = std::copy(begin(match->second), end(match->second), out);

        current = u;
    }

    if (current!=end(input))
        out = std::copy(current, end(input), out);
    return out;
}

template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, OutputIterator out) {
    return perform_replacements(input, m, study(input), out);
}

// for demo program
#include <boost/algorithm/string.hpp>
struct ci_less {
    template <typename S>
    bool operator() (S const& a, S const& b) const {
        return boost::lexicographical_compare(a, b, boost::is_iless());
    }
};

namespace boost { namespace icl {
    template <typename It>
        static inline std::ostream& operator<<(std::ostream& os, discrete_interval<It> const& i) {
            return os << make_iterator_range(lower(i), upper(i));
        }
} }

int main()
{
    using namespace std;
    string const input = "This {$oops} is an {$severity} {$experience}!\n";
    auto const wisdom = study(input);

    cout << "Wisdom: ";
    for(auto& entry : wisdom)
        cout << entry;

    auto m = map<string, string> {
            { "severity",   "absolute"  },
            { "OOPS",       "REALLY"    },
            { "experience", "nightmare" },
        };

    ostreambuf_iterator<char> out(cout);
    out = '\n';

    perform_replacements(input, m, wisdom, out);

    // now let's use a case insensitive map, still with the same "study"
    map<string, string, ci_less> im { m.begin(), m.end() };
    im["eXperience"] = "joy";

    perform_replacements(input, im, wisdom, out);
}
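
For readers without Boost at hand, the same "study once, replace many times" split can be approximated with <regex> alone. The following is only a minimal sketch under stated assumptions, not the answer's implementation: the names Span, study_offsets and apply_offsets are hypothetical, offsets are stored instead of iterators, and the result is built in a new string rather than written through an output iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type: one located tag, as offset and total length
// (the length includes the surrounding "{$" and "}").
struct Span {
    std::size_t pos;
    std::size_t len;
};

// Phase 1 ("study"): locate every {$key} tag once. The offsets stay valid
// as long as the template text itself is not modified.
std::vector<Span> study_offsets(std::string const& text) {
    static std::regex const tag(R"(\{\$([[:alnum:]]+)\})");
    std::vector<Span> spans;
    for (std::sregex_iterator it(text.begin(), text.end(), tag), last; it != last; ++it)
        spans.push_back({static_cast<std::size_t>(it->position(0)),
                         static_cast<std::size_t>(it->length(0))});
    return spans;
}

// Phase 2: build the expanded string from the template, the stored spans
// and any map-like container with find() and string values.
template <typename Map>
std::string apply_offsets(std::string const& text,
                          std::vector<Span> const& spans,
                          Map const& m) {
    std::string out;
    std::size_t current = 0;
    for (Span const& s : spans) {
        // Literal text between the previous tag and this one.
        out.append(text, current, s.pos - current);

        // Key without the leading "{$" and the trailing "}".
        auto match = m.find(text.substr(s.pos + 2, s.len - 3));
        if (match == m.end())
            out.append(text, s.pos, s.len);   // unknown key: keep the tag
        else
            out += match->second;

        current = s.pos + s.len;
    }
    out.append(text, current, std::string::npos);  // trailing literal text
    return out;
}
```

With the input "This {$oops} is an {$severity} {$experience}!" and a map holding severity and experience, apply_offsets(input, study_offsets(input), m) yields "This {$oops} is an absolute nightmare!". The spans can be computed once and reused with any number of different maps, because the template is never modified.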

In-place operation

As long as you make sure that the replacement strings are never longer than the {$pattern} strings they replace, you can simply call this function with input.begin() as the output iterator.

Live On Coliru

string input1 = "This {$803525c8-3ce4-423a-ad25-cc19bbe8422a} is an {$efa72abf-fe96-4983-b373-a35f70551e06} {$8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef}!\n",
       input2 = input1;

auto m = map<string, string> {
        { "efa72abf-fe96-4983-b373-a35f70551e06", "absolute"  },
        { "803525C8-3CE4-423A-AD25-CC19BBE8422A", "REALLY"    },
        { "8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef", "nightmare" },
    };

input1.erase(perform_replacements(input1, m, input1.begin()), input1.end());

map<string, string, ci_less> im { m.begin(), m.end() };
im["8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef"] = "joy";

input2.erase(perform_replacements(input2, im, input2.begin()), input2.end());

std::cout << input1
          << input2;

Prints

This {$803525c8-3ce4-423a-ad25-cc19bbe8422a} is an absolute nightmare!
This REALLY is an absolute joy!

Note that you can (obviously) not re-use the same "wisdom" on the same input template again because it will have been modified.
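
Why the length restriction makes in-place writing safe can be seen in a standard-library-only sketch with separate read and write positions. This is an illustration under assumptions, not the code from the answer: expand_in_place is a hypothetical name, the tags are found with plain string searches instead of Spirit, and any characters up to the closing brace are taken as the key (so hyphenated keys work too).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Expand {$key} tags inside s itself. Assumes every replacement is no
// longer than the tag it replaces, so the write position can never
// overtake the read position and unread text is never overwritten.
template <typename Map>
void expand_in_place(std::string& s, Map const& m) {
    std::size_t read = 0, write = 0;
    while (read < s.size()) {
        std::size_t const open = s.find("{$", read);
        std::size_t const close =
            open == std::string::npos ? std::string::npos : s.find('}', open + 2);

        if (close == std::string::npos) {
            // No further complete tag: shift the remaining tail down.
            while (read < s.size()) s[write++] = s[read++];
            break;
        }

        // Shift the literal text in front of the tag down.
        while (read < open) s[write++] = s[read++];

        // Copy the key out before anything in the tag is overwritten.
        std::string const key = s.substr(open + 2, close - open - 2);
        auto match = m.find(key);
        if (match == m.end()) {
            // Unknown key: keep the tag text as it is.
            while (read <= close) s[write++] = s[read++];
        } else {
            // The precondition that makes in-place writing safe.
            assert(match->second.size() <= close - open + 1);
            for (char c : match->second) s[write++] = c;
            read = close + 1;
        }
    }
    s.erase(write);  // drop the now-unused tail
}
```

Applied to "This {$oops} is an {$severity} {$experience}!" with severity mapped to "absolute" and experience mapped to "joy", the string becomes "This {$oops} is an absolute joy!". The final erase plays the same role as the input1.erase(..., input1.end()) call in the answer's example.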
