在C ++中寻找搜索和替换的圣杯 [英] Searching for Holy Grail of search and replace in C++

查看:77
本文介绍了在C ++中寻找搜索和替换的圣杯的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

最近,我一直在寻找一种替换字符串中的标记的方法,该方法本质上是查找并替换(但至少有一种其他方法可以解决该问题),并且看起来很平凡.我提供了几种可能的实现,但是从性能的角度来看,它们都不令人满意.最佳成就是每次迭代约50us.这种情况是理想的,字符串的大小从不增长,起初我忽略了不区分大小写的要求
这是 Coliru
上的代码
我的机器上的结果:
Boost.Spirit符号结果:3421?= 3421
100000个周期耗时6060ms.
Boyer-Moore结果:3421?= 3421
100000个周期耗时5959ms.
Boyer Moore Hospool结果:3421?= 3421
100000个周期耗时5008ms.
Knuth Morris Pratt结果:3421?= 3421
100000个周期耗时12451ms.
天真的STL搜索和替换结果:3421?= 3421
100000个周期耗时5532ms.
提高replace_all结果:3421?= 3421
100000个周期耗时4860ms.

Recently I was looking for a way to replace tokens in string which is essentially find and replace (but there is at least one additional approach to the problem) and looks like quite banal task. I've came with couple of possible implementations but none of them was satisfying from performance point of view. Best achievement was ~50us per iteration. The case was ideal, the string was never growing in size and initially I've omitted the requirement to be case insensitive
Here is the code at Coliru

Results from my machine:
Boost.Spirit symbols result: 3421?=3421
100000 cycles took 6060ms.
Boyer-Moore results:3421?=3421
100000 cycles took 5959ms.
Boyer Moore Hospool result:3421?=3421
100000 cycles took 5008ms.
Knuth Morris Pratt result: 3421?=3421
100000 cycles took 12451ms.
Naive STL search and replace result: 3421?=3421
100000 cycles took 5532ms.
Boost replace_all result:3421?=3421
100000 cycles took 4860ms.

问题是,在这么简单的任务中需要花费这么长时间吗?可以说,好,简单的任务,继续执行并更好地执行它.但是现实是15岁的MFC天真的实现可以更快地完成任务数量级:

So the question, what takes so long in such a simple task? One can say, ok, simple task, go ahead and implement it better. But the reality is that 15 years old MFC naive implementation does the task order of magnitude faster:

CString FillTokenParams(const CString& input, const std::unordered_map<std::string, std::string>& tokens)
{
    CString tmpInput = input;
    for(const auto& token : tokens)
    {
        int pos = 0;
        while(pos != -1)
        {
            pos = tmpInput.Find(token.first.c_str(), pos);
            if(pos != -1)
            {
                int tokenLength = token.first.size();
                tmpInput.Delete(pos, tokenLength);
                tmpInput.Insert(pos, token.second.c_str());
                pos += 1;
            }
        }
    }

    return tmpInput;
}

结果:
MFC天真搜索并替换结果:3421?= 3421
100000个周期耗时516ms.
这个笨拙的代码为何胜过现代C ++ ???为什么其他实现如此缓慢?我缺少基本的东西吗?

result:
MFC naive search and replace result:3421?=3421
100000 cycles took 516ms.
How come this clumsy code outperforms modern C++??? why other implementations were so slow? Am I missing something fundamental?

EDIT001:我已经在此问题上进行了投入,对代码进行了配置文件并进行了三重检查.您可能对此不满意,但是std :: string :: replace并不是花时间.在任何STL实施中,搜索都是花费大部分时间的事情,boost spirit在分配tst(我估计评估树中的测试节点)上浪费了时间.我不希望有人在函数中指向这是您的问题"和po的一行,问题就不存在了.问题是MFC如何以10倍的速度完成相同的工作.

EDIT001: I've invested in this question, the code been profiled and triple checked. You can be unsatisfied with this and that, but std::string::replace is not what taking time. In any STL implementation search is what taking most of the time, boost spirit wastes time on allocation of tst (test node in evaluation tree I guess). I dont expect anyone pointing on a line in a function with "this is your problem" and poof, the problem is gone. The question is how did MFC manages to do the same work 10 times faster.

EDIT002:只是深入研究了Find的MFC实现,并编写了一个模仿MFC实现的函数

EDIT002: Just drilled down into MFC implementation of Find and wrote a function which mimics the MFC implementation

namespace mfc
{
std::string::size_type Find(const std::string& input, const std::string& subString, std::string::size_type start)
{
    if(subString.empty())
    {
        return std::string::npos;
    }

    if(start < 0 || start > input.size())
    {
        return std::string::npos;
    }

    auto found = strstr(input.c_str() + start, subString.c_str());
    return ((found == nullptr) ? std::string::npos : std::string::size_type(found - input.c_str()));
}
}

std::string MFCMimicking(const std::string& input, const std::unordered_map<std::string, std::string>& tokens)
{
    auto tmpInput = input;
    for(const auto& token : tokens)
    {
        auto pos = 0;
        while(pos != std::string::npos)
        {
            pos = mfc::Find(tmpInput, token.first, pos);
            if(pos != std::string::npos)
            {
                auto tokenLength = token.first.size();
                tmpInput.replace(pos, tokenLength, token.second.c_str());
                pos += 1;
            }
        }
    }

    return tmpInput;
}

结果:
MFC模拟展开结果:3421?= 3421
100000个周期耗时411ms.
含义4us.每次通话,都要击败C strstr

Results:
MFC mimicking expand result:3421?=3421
100000 cycles took 411ms.
Meaning 4us. per call, go beat that C strstr

EDIT003:编译并使用-Ox

EDIT003: Compiling and running with -Ox


MFC模仿展开结果:3421?= 3421
100000个周期耗时660毫秒.
MFC 天真的搜索和替换结果:3421?= 3421
100000个周期耗时856ms.
手动扩展结果:3421?= 3421
100000个周期用了1995ms.布尔·摩尔(Boyer-Moore) 结果:3421?= 3421
100000个周期花费了6911ms.
Boyer Moore Hospool 结果:3421?= 3421
100000个周期耗时5670ms.
Knuth Morris Pratt 结果:3421?= 3421
100000个周期用了13825ms.
天真的STL搜索和 替换结果:3421?= 3421
100000个周期用了9531ms.
助推器 replace_all结果:3421?= 3421
100000个周期用了8996ms.


MFC mimicking expand result:3421?=3421
100000 cycles took 660ms.
MFC naive search and replace result:3421?=3421
100000 cycles took 856ms.
Manual expand result:3421?=3421
100000 cycles took 1995ms.
Boyer-Moore results:3421?=3421
100000 cycles took 6911ms.
Boyer Moore Hospool result:3421?=3421
100000 cycles took 5670ms.
Knuth Morris Pratt result: 3421?=3421
100000 cycles took 13825ms.
Naive STL search and replace result: 3421?=3421
100000 cycles took 9531ms.
Boost replace_all result:3421?=3421
100000 cycles took 8996ms.


以-O2运行(与原始测量一样),但周期为1万次


running with -O2 (as in original measurements) but 10k cycles


MFC模拟扩展结果:3421?= 3421
花费了10000个周期 104毫秒.
MFC天真搜索和替换结果:3421?= 3421
10000 周期耗时105ms.
手动扩展结果:3421?= 3421 <10000> 周期耗时356毫秒.
Boyer-Moore结果:3421?= 3421
10000个周期 用了1355ms.
Boyer Moore Hospool结果:3421?= 3421
10000 周期花费了1101ms.
Knuth Morris Pratt结果:3421?= 3421
10000个周期耗时1973ms.
天真的STL搜索和替换结果: 3421?= 3421
10000个周期耗时923ms.
提升replace_all 结果:3421?= 3421
10000个周期用了880ms.


MFC mimicking expand result:3421?=3421
10000 cycles took 104ms.
MFC naive search and replace result:3421?=3421
10000 cycles took 105ms.
Manual expand result:3421?=3421
10000 cycles took 356ms.
Boyer-Moore results:3421?=3421
10000 cycles took 1355ms.
Boyer Moore Hospool result:3421?=3421
10000 cycles took 1101ms.
Knuth Morris Pratt result: 3421?=3421
10000 cycles took 1973ms.
Naive STL search and replace result: 3421?=3421
10000 cycles took 923ms.
Boost replace_all result:3421?=3421
10000 cycles took 880ms.

推荐答案

好吧,这将是一个漫长的故事.只是为了提醒您所问的问题.

Ok, it will be a long story. Just to remind you questions asked.

  1. 为什么使用C ++(各种方法)进行搜索和替换的速度如此之慢?
  2. 为什么MFC搜索和替换如此之快?

令人惊讶的是,两个问题都具有相同的答案.由于C ++的开销. 是的.我们闪亮的现代C ++有一些开销,我们通常会忽略这些开销 灵活性和优雅.

Surprisingly, both questions have the same answer. Because of C++ overhead. Yep. Our shiny modern C++ has an overhead which mostly we dismiss for the flexibility and elegance.

但是,当涉及亚微秒分辨率时(不是C ++不是 能够以纳秒级的分辨率处理事务)的开销变得更大 突出.

However, when it comes to sub-microsecond resolutions (not that C++ is not capable of doing things in nanosecond resolutions) the overhead becomes more prominent.

让我用与问题中相同的代码进行展示,但更多 与每个功能中完成的工作保持一致.

Let me showcase with the same code I've posted in the question, but it is more coherent with things done in each function.

在Coliru上直播

它使用前面提到的Nonius(感谢@sehe),交互式结果为此处.

It uses aforementioned Nonius (thanks to @sehe) and interactive results are here.

您可以单击图例以显示/隐藏特定系列.

结论

有两个出色的结果

Conclusions

There are two outstanding results

  • MFC模拟功能和
  • 我自己的人工替换

这些函数至少比其他函数快一个数量级,那么有什么区别呢?

These functions at least one order of magnitude are faster than the rest, so what is the difference?

所有这些慢速"函数都是在用C语言编写快速文件时用C ++语言编写的(不是纯C语言,当输出增大时,我懒得处理输出缓冲区的malloc/realloc).好吧,我想很明显,有时候别无选择,只能求助于纯C.出于安全原因和缺乏类型安全性,我个人反对使用C.此外,编写高质量的C代码只需要更多的专业知识和关注.

All these "slow" functions are written in C++ when the fast is written in C (not pure C, I was too lazy to deal with malloc/realloc of output buffers when the output grows in size). Well, I guess it is clear that sometimes there is no choice but resort to pure C. I personally against using C for security reasons and lack of type safety. In addition it simply requires more expertise and attention to write high quality C code.

我暂时不会将其标记为答案,等待对此结论发表评论.

I will not mark it as an answer for a while, waiting for comments on this conclusion.

我要感谢所有积极参与讨论,提出想法并指出示例中不一致之处的人.

I want to thank all those who actively participated in the discussion, raised ideas and pointed out inconsistencies in my example.

2019年更新:
只是为了保留代码: https://github.com/kreuzerkrieg/string_search_replace
Nonius结果: https://kreuzerkrieg.github.io/string_search_replace/

2019 update:
Just to preserve the code: https://github.com/kreuzerkrieg/string_search_replace
Nonius results: https://kreuzerkrieg.github.io/string_search_replace/

在Ubuntu 18.04上用gcc-9跑

Ran with gcc-9 on Ubuntu 18.04

这篇关于在C ++中寻找搜索和替换的圣杯的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆