Atomic groups clarity


Question

Consider this regex.

a*b

Applied to aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac

This takes 67 steps in the debugger to fail.

Now consider this regex.

(?>a*)b

Applied to aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac

This takes 133 steps in the debugger to fail.

And lastly this regex:

a*+b  (a variant of atomic group)

Applied to aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac

This takes 67 steps in the debugger to fail.

When I check the benchmark, the atomic group (?>a*)b performs 179% faster.

Now, atomic groups disable backtracking, so match performance is good.
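As a minimal sketch of what "disabling backtracking" means in practice, the snippet below runs a backtracking pattern and its atomic and possessive counterparts against a short string. It assumes Python 3.11 or newer, where the re module accepts (?>...) and possessive quantifiers. The patterns ending in "ab", the string "aaab", and the helper name matches are illustrative and do not come from the question.

```python
import re

subject = "aaab"  # illustrative input, not from the question


def matches(pattern, text):
    """Return True/False for a search, or None if the syntax is unsupported."""
    try:
        return re.search(pattern, text) is not None
    except re.error:
        # Python older than 3.11 rejects (?>...) and possessive quantifiers.
        return None


# Greedy a* first grabs "aaa", then gives one "a" back so that "ab" can match.
greedy = matches(r"a*ab", subject)

# The atomic group and the possessive quantifier never give anything back,
# so no "a" is left for the literal "a" that follows them.
atomic = matches(r"(?>a*)ab", subject)
possessive = matches(r"a*+ab", subject)

print(greedy, atomic, possessive)
```

On Python 3.11+ the greedy pattern matches while the atomic and possessive variants do not. On older interpreters the last two values are None because the syntax is rejected.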

  1. But why is the number of steps higher? Can somebody explain this?

Why is there a difference in steps between the two atomic groups (?>a*)b and a*+b?

Do they work differently?

Recommended answer

Author note:

    This answer targets question 1, as specified by the bounty text: "I am looking forward to the exact reason why more steps are being needed by the debugger. I dont need answers explaining how atomic groups work."
    Jerry's answer addresses the other concerns very well, while my other answer takes a ride through the mentioned constructs, how they work, and why they are important. For full knowledge, simply reading this post is not enough!

Every group in a regular expression takes a step to enter and to exit the group.

    WHAT?!
Yeah, I'm serious, read on...

First, I would like to compare a pattern that uses a non-capturing group with the same pattern without the group:

Pattern 1: (?:c)at
Pattern 2: cat

So what exactly happens here? We'll match the patterns against the test string "concat" on a regex engine with optimizations disabled:
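The step counts themselves are only visible in a debugger, but the outcome of both patterns can be checked directly. The following is a minimal sketch in Python, whose re module is not the engine discussed here and whose internal optimizations cannot be switched off. It only shows that wrapping "c" in a non-capturing group does not change what is matched.

```python
import re

subject = "concat"

with_group = re.search(r"(?:c)at", subject)
without_group = re.search(r"cat", subject)

# Both patterns skip the leading "con" and match the trailing "cat".
print(with_group.group(), with_group.span())
print(without_group.group(), without_group.span())
```

Any difference between the two patterns is therefore in the work the engine does, not in the result.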

While we're at it, here are some more groups:

    Oh no! I'm going to avoid using groups!

But wait! Please note that the number of steps taken to match has no correlation with the performance of the match. PCRE engines optimize away most of the "unnecessary steps", as I've mentioned. Atomic groups are still the most efficient, despite taking more steps on an engine with optimizations disabled.
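To see how the patterns behave on your own engine rather than relying on step counts, you can time them directly. The sketch below assumes Python 3.11 or newer for the atomic and possessive syntax and skips any pattern the interpreter rejects. The helper name time_pattern and the repeat count are arbitrary choices, and the timings it prints come from Python's re module, so they may not reproduce the 179% figure from the question.

```python
import re
import timeit

# Test string copied from the question.
subject = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac"


def time_pattern(pattern, text, number=20000):
    """Return seconds spent on `number` searches, or None if unsupported."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        # Python older than 3.11 rejects (?>...) and possessive quantifiers.
        return None
    return timeit.timeit(lambda: compiled.search(text), number=number)


results = {p: time_pattern(p, subject) for p in (r"a*b", r"(?>a*)b", r"a*+b")}

for pattern, seconds in results.items():
    label = "unsupported" if seconds is None else f"{seconds:.4f}s"
    print(f"{pattern:10} {label}")
```

Treat the numbers as a rough comparison only: they depend on the interpreter version, the machine, and the input length.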

