Using regex to replace matches in place
Question
I'm attempting to do a certain type of "string expansion", wherein I replace keys with strings from a database. The format of the tag is {$<key>}.
I'm using <regex> to try and get this done, but I've run into a bit of a logistical problem. I want to be able to replace the strings in one pass, but modifying the string (s) can invalidate the iterators found in the smatch objects.
Here is more or less what I'm trying to do:
#include <iostream>
#include <map>
#include <regex>
using namespace std;
int main()
{
    map<string, string> m;
    m.insert(make_pair("severity", "absolute"));
    m.insert(make_pair("experience", "nightmare"));
    string s = "This is an {$severity} {$experience}!";
    static regex e("\\{\\$(.*?)\\}");
    sregex_iterator next(s.begin(), s.end(), e);
    sregex_iterator end;
    for (; next != end; ++next)
    {
        auto m_itr = m.find(next->str(1));
        if (m_itr == m.end())
        {
            continue;
        }
        //TODO: replace expansion tags with strings somehow?
        cout << (*next).str(0) << ":" << m_itr->second << endl;
    }
}
The desired end result is that s reads:
"This is an absolute nightmare!"
I know that I could perform this type of thing in multiple passes, but that seems a bit brutish.
I read somewhere that boost::regex had some variation of regex_replace that allowed a custom replacement function in this form:
regex_replace(std::string&, regex, std::string(const smatch&))
However, my current version (1.55) has no such thing.
Any help is greatly appreciated.
P.S. I can use either boost or std for this, whichever works!
Recommended answer
So, in addition to the comment I made 8 hours ago:
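Related: Compiling a simple parser with Boost.Spirit; replace pieces of string; How to expand environment variables in .ini files using Boost; and, perhaps most interestingly, Fast multi-replacement into string.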
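For comparison, the direct single-pass route with only the standard library is to leave s untouched while iterating and write the result into a second string, so no iterator is ever invalidated. The following is a minimal sketch under that assumption, not code from the linked answers; the function name `expand` is hypothetical.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: expands {$key} tags using `values`.
// Tags whose key is not in `values` are copied through unchanged.
std::string expand(const std::string& s,
                   const std::map<std::string, std::string>& values)
{
    static const std::regex tag(R"(\{\$(.*?)\})");

    std::string out;
    auto last = s.begin();  // start of the text that has not been copied yet

    for (std::sregex_iterator it(s.begin(), s.end(), tag), end; it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);             // literal text before the tag
        const auto found = values.find(m.str(1));
        if (found != values.end())
            out += found->second;                 // known key: substitute
        else
            out.append(m[0].first, m[0].second);  // unknown key: keep the tag
        last = m[0].second;
    }
    out.append(last, s.end());                    // trailing literal text
    return out;
}
```

With the map from the question, `s = expand(s, m);` leaves s reading "This is an absolute nightmare!". The input is only read during the scan, which is why the match iterators stay valid.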
I saw room for one more approach. What if... you needed to do many, many replacements based on the same text template, but using different replacement maps?
Since I've recently discovered how Boost ICL can be useful in mapping regions of input strings, I wanted to do the same here.
I made things pretty generic, and employed Spirit to do the analysis (study):
template <
    typename InputRange,
    typename It = typename boost::range_iterator<InputRange const>::type,
    typename IntervalSet = boost::icl::interval_set<It> >
IntervalSet study(InputRange const& input) {
    using std::begin;
    using std::end;
    It first(begin(input)), last(end(input));
    using namespace boost::spirit::qi;
    using boost::spirit::repository::qi::seek;
    IntervalSet variables;
    parse(first, last, *seek [ raw [ "{$" >> +alnum >> "}" ] ], variables);
    return variables;
}
As you can see, instead of doing any replacements, we just return an interval_set<It> so we know where our variables are. This is now the "wisdom" that can be used to perform the replacements from a map of replacement strings:
template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename StudyMap,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, StudyMap const& wisdom, OutputIterator out)
{
    using std::begin;
    using std::end;
    It current(begin(input));
    for (auto& replace : wisdom)
    {
        It l(lower(replace)),
           u(upper(replace));
        if (current < l)
            out = std::copy(current, l, out);
        auto match = m.find({l+2, u-1});
        if (match == m.end())
            out = std::copy(l, u, out);
        else
            out = std::copy(begin(match->second), end(match->second), out);
        current = u;
    }
    if (current!=end(input))
        out = std::copy(current, end(input), out);
    return out;
}
Now, a simple test program would be like this:
int main()
{
    using namespace std;
    string const input = "This {$oops} is an {$severity} {$experience}!\n";
    auto const wisdom = study(input);
    cout << "Wisdom: ";
    for(auto& entry : wisdom)
        cout << entry;
    auto m = map<string, string> {
        { "severity", "absolute" },
        { "OOPS", "REALLY" },
        { "experience", "nightmare" },
    };
    ostreambuf_iterator<char> out(cout);
    out = '\n';
    perform_replacements(input, m, wisdom, out);
    // now let's use a case insensitive map, still with the same "study"
    map<string, string, ci_less> im { m.begin(), m.end() };
    im["eXperience"] = "joy";
    perform_replacements(input, im, wisdom, out);
}
Prints
Wisdom: {$oops}{$severity}{$experience}
This {$oops} is an absolute nightmare!
This REALLY is an absolute joy!
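The second line of output differs from the first because the ci_less comparator makes "OOPS" and "oops" (and "eXperience" and "experience") the same key. The answer's ci_less is defined in the full listing and relies on Boost String Algorithms; a standard-library-only equivalent could be sketched as follows. The name `ci_less_std` is hypothetical, and the sketch assumes keys are ASCII.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparator: strict weak ordering that ignores ASCII case.
struct ci_less_std {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                // unsigned char avoids undefined behavior in std::tolower
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// A map that treats keys differing only in case as the same key.
using ci_map = std::map<std::string, std::string, ci_less_std>;
```

Because only the comparator changes, the positions found by study stay valid; lookups simply resolve differently.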
You could call it for an input string literal, use an unordered_map for the replacements, etc. You could also omit the wisdom, in which case the implementation will study the input on the fly.
Live On Coliru
#include <iostream>
#include <map>
#include <boost/regex.hpp>
#include <boost/icl/interval_set.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
namespace boost { namespace spirit { namespace traits {
    template <typename It>
        struct assign_to_attribute_from_iterators<icl::discrete_interval<It>, It, void> {
            template <typename ... T> static void call(It b, It e, icl::discrete_interval<It>& out) {
                out = icl::discrete_interval<It>::right_open(b, e);
            }
        };
} } }
template <
    typename InputRange,
    typename It = typename boost::range_iterator<InputRange const>::type,
    typename IntervalSet = boost::icl::interval_set<It> >
IntervalSet study(InputRange const& input) {
    using std::begin;
    using std::end;
    It first(begin(input)), last(end(input));
    using namespace boost::spirit::qi;
    using boost::spirit::repository::qi::seek;
    IntervalSet variables;
    parse(first, last, *seek [ raw [ "{$" >> +alnum >> "}" ] ], variables);
    return variables;
}
template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename StudyMap,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, StudyMap const& wisdom, OutputIterator out)
{
    using std::begin;
    using std::end;
    It current(begin(input));
    for (auto& replace : wisdom)
    {
        It l(lower(replace)),
           u(upper(replace));
        if (current < l)
            out = std::copy(current, l, out);
        auto match = m.find({l+2, u-1});
        if (match == m.end())
            out = std::copy(l, u, out);
        else
            out = std::copy(begin(match->second), end(match->second), out);
        current = u;
    }
    if (current!=end(input))
        out = std::copy(current, end(input), out);
    return out;
}
template <
    typename InputRange,
    typename Replacements,
    typename OutputIterator,
    typename It = typename boost::range_iterator<InputRange const>::type
>
OutputIterator perform_replacements(InputRange const& input, Replacements const& m, OutputIterator out) {
    return perform_replacements(input, m, study(input), out);
}
// for demo program
#include <boost/algorithm/string.hpp>
struct ci_less {
    template <typename S>
    bool operator() (S const& a, S const& b) const {
        return boost::lexicographical_compare(a, b, boost::is_iless());
    }
};
namespace boost { namespace icl {
    template <typename It>
    static inline std::ostream& operator<<(std::ostream& os, discrete_interval<It> const& i) {
        return os << make_iterator_range(lower(i), upper(i));
    }
} }
int main()
{
    using namespace std;
    string const input = "This {$oops} is an {$severity} {$experience}!\n";
    auto const wisdom = study(input);
    cout << "Wisdom: ";
    for(auto& entry : wisdom)
        cout << entry;
    auto m = map<string, string> {
        { "severity", "absolute" },
        { "OOPS", "REALLY" },
        { "experience", "nightmare" },
    };
    ostreambuf_iterator<char> out(cout);
    out = '\n';
    perform_replacements(input, m, wisdom, out);
    // now let's use a case insensitive map, still with the same "study"
    map<string, string, ci_less> im { m.begin(), m.end() };
    im["eXperience"] = "joy";
    perform_replacements(input, im, wisdom, out);
}
In-place operation
As long as you make sure that the replacement strings are never longer than the {$pattern} strings they replace (shorter or equal length), you can simply call this function with input.begin() as the output iterator.
string input1 = "This {$803525c8-3ce4-423a-ad25-cc19bbe8422a} is an {$efa72abf-fe96-4983-b373-a35f70551e06} {$8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef}!\n",
       input2 = input1;
auto m = map<string, string> {
    { "efa72abf-fe96-4983-b373-a35f70551e06", "absolute" },
    { "803525C8-3CE4-423A-AD25-CC19BBE8422A", "REALLY" },
    { "8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef", "nightmare" },
};
input1.erase(perform_replacements(input1, m, input1.begin()), input1.end());
map<string, string, ci_less> im { m.begin(), m.end() };
im["8a10abaa-cc0d-47bd-a8e1-34a8aa0ec1ef"] = "joy";
input2.erase(perform_replacements(input2, im, input2.begin()), input2.end());
std::cout << input1
          << input2;
Prints
This {$803525c8-3ce4-423a-ad25-cc19bbe8422a} is an absolute nightmare!
This REALLY is an absolute joy!
Note that you (obviously) cannot reuse the same "wisdom" on the same input template afterwards, because the input will have been modified.
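The reason the length condition matters is that the write position must never overtake the read position. A standard-library-only sketch of the same in-place compaction, assuming std::regex for the scan and offsets instead of iterators, might look like this; the name `expand_in_place` and the choice to throw when a replacement is too long are assumptions of the sketch, not part of the code above.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: expands {$key} tags inside `s` itself.
// Assumes no replacement is longer than its tag; throws std::length_error
// (before modifying `s`) if that assumption does not hold.
void expand_in_place(std::string& s, const std::map<std::string, std::string>& values)
{
    struct Edit { std::size_t begin, end; const std::string* text; };
    static const std::regex tag(R"(\{\$(.*?)\})");

    // Study first: record offsets, not iterators, before touching the string.
    std::vector<Edit> edits;
    for (std::sregex_iterator it(s.cbegin(), s.cend(), tag), end; it != end; ++it) {
        const auto found = values.find(it->str(1));
        if (found == values.end())
            continue;  // unknown key: the tag stays as ordinary text
        const auto pos = static_cast<std::size_t>(it->position(0));
        const auto len = static_cast<std::size_t>(it->length(0));
        if (found->second.size() > len)
            throw std::length_error("replacement longer than its tag");
        edits.push_back({pos, pos + len, &found->second});
    }

    // Compact left to right; `write` never overtakes the read position.
    std::size_t write = 0;
    auto shift = [&](std::size_t from, std::size_t to) {
        if (write != from)  // skip the self-copy while nothing has shrunk yet
            std::copy(s.begin() + from, s.begin() + to, s.begin() + write);
        write += to - from;
    };

    std::size_t read = 0;
    for (const Edit& e : edits) {
        shift(read, e.begin);                                          // text before the tag
        std::copy(e.text->begin(), e.text->end(), s.begin() + write);  // the replacement
        write += e.text->size();
        read = e.end;
    }
    shift(read, s.size());  // text after the last tag
    s.erase(write);         // drop the leftover tail
}
```

After the call, any offsets recorded for the old contents of s no longer describe it, which is the same caveat as with the "wisdom" above.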