Why pcre regex is much faster than c++11 regex

Views: 99 | Published: 2021/6/14 20:45:43 | Tags: c++, regex, c++11, pcre

Question

Some sample code. This is the C++11 part, using cregex_iterator:

// Note: the timed region includes construction of the regex object.
std::chrono::steady_clock::time_point begin0 = std::chrono::steady_clock::now();
std::regex re("<option[\\s]value[\\s]*=[\\s]*\"([^\">]*)\"[\\s]*[^>]*>", std::regex::icase);
int found = 0;
for (std::cregex_iterator i = std::cregex_iterator(input, input + input_length, re);
     i != std::cregex_iterator();
     ++i)
{
    found++;
    if (found < 10000) continue;
    break;  // stop after 10000 matches
}
std::chrono::steady_clock::time_point end0 = std::chrono::steady_clock::now();

This is the PCRE part. The regular expression is the same.

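// Note: the timed region includes pcre_compile.
std::chrono::steady_clock::time_point begin4 = std::chrono::steady_clock::now();
const char* pError = NULL;
int errOffset;
int options = PCRE_MULTILINE | PCRE_CASELESS;
const char* regexp = "<option[\\s]value[\\s]*=[\\s]*\"([^\">]*)\"[\\s]*[^>]*>";
// pcre_compile returns NULL on failure and reports the reason via pError/errOffset.
pcre* pPcre = pcre_compile(regexp, options, &pError, &errOffset, 0);
int offset = 0;
int matches = -1;
int pMatches[6];  // output vector: (1 capture group + 1) * 3 ints
found = 0;        // assumes this is the counter declared in the C++11 snippet
while (offset < input_length)
{
    matches = pcre_exec(pPcre, NULL, input, input_length, offset, 0, pMatches, 6);
    if (matches >= 1)
    {
        found++;
        offset = pMatches[1];  // resume right after the end of this match
        if (found < 10000) continue;
        break;  // stop after 10000 matches
    }
    else
        offset = input_length;  // no further match: leave the loop
}

std::chrono::steady_clock::time_point end4 = std::chrono::steady_clock::now();
pcre_free(pPcre);  // release the compiled pattern outside the timed region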

The result shows that PCRE is about 100 times faster than C++11. I found some vector copying and resizing in the C++11 implementation. Are there other reasons?
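When comparing the two engines, it may help to time matching separately from pattern compilation, since both snippets above construct the pattern inside the timed region. The following is a minimal sketch under that assumption, not the asker's code: the names ScanResult, make_option_regex and count_options are hypothetical, and std::regex::optimize is only a hint whose effect depends on the standard library implementation.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: number of matches and time spent matching only.
struct ScanResult {
    int found;
    std::chrono::steady_clock::duration elapsed;
};

// Builds the pattern from the question once, outside any timed region.
// A custom raw-string delimiter is needed because the pattern contains )".
std::regex make_option_regex() {
    return std::regex(R"re(<option[\s]value[\s]*=[\s]*"([^">]*)"[\s]*[^>]*>)re",
                      std::regex::icase | std::regex::optimize);
}

// Counts matches in [input, input + input_length), stopping at `limit`.
ScanResult count_options(const std::regex& re, const char* input,
                         std::size_t input_length, int limit) {
    const auto begin = std::chrono::steady_clock::now();
    int found = 0;
    const std::cregex_iterator last;  // default-constructed end iterator
    for (std::cregex_iterator it(input, input + input_length, re); it != last; ++it) {
        if (++found >= limit) break;  // same early exit as the original loop
    }
    const auto end = std::chrono::steady_clock::now();
    return {found, end - begin};
}
```

Calling count_options(make_option_regex(), input, input_length, 10000) would then report only the matching time, which makes it easier to see how much of the gap comes from compilation and how much from the search itself.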

Recommended answer

Superficial pattern analysis:

<option             # Subject pre-scan applied (unanchored pattern)
    [\\s]
    value
    [\\s]*          # Auto-possessification applied (translates to \s*+)
    =
    [\\s]*          # Same as above
    \"([^\">]*)\"
    [\\s]*          # Same as above
    [^>]*
>                   # Min length (17 chars) check of subject string applied

Furthermore, if the input string does not contain a required character such as >, the match is expected to fail fast. Keep in mind that performance can also depend heavily on the input string.
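The cheap checks named in the analysis (minimum length, pre-scan for the first literal, and a required > later in the subject) can be illustrated with a hand-written filter that runs before any regex engine is invoked. This is only a sketch of the idea, not PCRE's actual implementation; may_match and kMinMatchLength are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Shortest possible match: "<option" (7) + one whitespace (1) + "value" (5)
// + '=' (1) + two quotes (2) + '>' (1) = 17 characters.
constexpr std::size_t kMinMatchLength = 17;

// Returns false when the subject cannot possibly contain a match,
// so the (expensive) regex search can be skipped entirely.
bool may_match(const char* subject, std::size_t length) {
    // Min-length check: a shorter subject can never match.
    if (length < kMinMatchLength) return false;

    // Pre-scan: a match must start with '<' early enough to leave
    // room for kMinMatchLength characters.
    const std::size_t window = length - kMinMatchLength + 1;
    const char* start = static_cast<const char*>(std::memchr(subject, '<', window));
    if (start == nullptr) return false;

    // Required character: the match must end with '>' no earlier than
    // kMinMatchLength - 1 characters after its start.
    const char* tail = start + (kMinMatchLength - 1);
    const std::size_t remaining = length - static_cast<std::size_t>(tail - subject);
    return std::memchr(tail, '>', remaining) != nullptr;
}
```

A subject such as "<option value" followed only by spaces and a period is rejected by the last check without any backtracking, which is the kind of fast failure described above.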

Run this pattern:

(*NO_AUTO_POSSESS)(*NO_START_OPT)<option[\s]value[\s]*=[\s]*\"([^\">]*)\"[\s]*[^>]*>

over this input string (note the trailing period):

<option value                                                                 .

and compare the results with those of the unmodified pattern (live demo).
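To see why the input above is costly without auto-possessification, consider just the fragment \s*= applied right after "value". The sketch below is a toy matcher written for illustration (it is not how PCRE or std::regex is implemented, and Attempt and match_spaces_then_equals are hypothetical names). It counts how often the engine has to test for = when the whitespace run is greedy versus possessive.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct Attempt {
    bool matched;        // whether `\s*=` matched at the given position
    std::size_t probes;  // how many times a character was compared with '='
};

// Matches `\s*=` (greedy) or `\s*+=` (possessive) at offset `pos` of `s`.
Attempt match_spaces_then_equals(const std::string& s, std::size_t pos, bool possessive) {
    // Consume the longest run of whitespace starting at pos.
    std::size_t end = pos;
    while (end < s.size() && std::isspace(static_cast<unsigned char>(s[end]))) ++end;

    std::size_t probes = 0;
    for (std::size_t cur = end;; --cur) {
        ++probes;
        if (cur < s.size() && s[cur] == '=') return {true, probes};
        // A possessive run never gives characters back; a greedy run
        // backtracks one character at a time until it reaches pos.
        if (possessive || cur == pos) break;
    }
    return {false, probes};
}
```

With a subject of "<option value" followed by 65 spaces and a period (the space count is an assumption chosen for this example), the greedy variant starting at offset 13 makes 66 probes before failing, while the possessive variant makes 1. Because whitespace can never be =, giving characters back cannot produce a match, which is why the rewrite to \s*+ is safe.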
