How to do lookahead properly with LPeg


Question

To match a string that starts with dog and is followed by cat (without consuming cat), this works:

local lpeg = require 'lpeg'
local str1 = 'dogcat'
local patt1 = lpeg.C(lpeg.P('dog')) * #lpeg.P('cat')
print(lpeg.match(patt1, str1))

Output: dog

To match a string that starts with dog, continues with any sequence of characters, and is then followed by cat (again without consuming it), like the regex lookahead (dog.+?)(?=cat), I tried this:

local str2 = 'dog and cat'
local patt2 = lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1) * #lpeg.P("cat")
print(lpeg.match(patt2, str2))

I expected the result to be "dog and ", but it returns nil.

If I throw away the lookahead part (i.e., use the pattern lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1)), it matches the whole string successfully. That means the * lpeg.P(1) ^ 1 part matches any character sequence correctly, doesn't it?

How can I fix this?

Recommended answer

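LPeg repetition is greedy and does not backtrack, so lpeg.P(1) ^ 1 consumes everything up to the end of the string, "cat" included, and the lookahead that follows has nothing left to test. You need to exclude "cat" at every position the repetition can match: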

local patt2 = lpeg.C(lpeg.P"dog" * (lpeg.P(1)-lpeg.P"cat") ^ 1) * #lpeg.P"cat"
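A minimal sketch that runs the question's pattern and the corrected one side by side on the same input may make the difference easier to see. The variable names (subject, greedy, until_cat and the two result variables) are made up for this illustration; only lpeg.P, lpeg.C, the # and - operators, and lpeg.match come from LPeg.

```lua
local lpeg = require 'lpeg'
local P, C = lpeg.P, lpeg.C

local subject = 'dog and cat'

-- Pattern from the question: P(1)^1 takes every remaining character and
-- never gives any back, so #P'cat' is tested at the end of the input.
local greedy = C(P'dog' * P(1)^1) * #P'cat'

-- Corrected pattern: a character is consumed only if "cat" does not
-- start at that position, so the repetition stops right before "cat".
local until_cat = C(P'dog' * (P(1) - P'cat')^1) * #P'cat'

local greedy_result = lpeg.match(greedy, subject)
local fixed_result = lpeg.match(until_cat, subject)

print(greedy_result)                           -- nil
print(string.format('%q', fixed_result))       -- "dog and "
```

Note that the capture keeps the space before "cat", just as the lazy regex (dog.+?)(?=cat) would.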

I think it's appropriate to plug the debugger I've been working on (pegdebug), as it helps in cases like this. Here is the output it generates for the original LPeg expression:

+   Exp 1   "d"
 +  Dog 1   "d"
 =  Dog 1-3 "dog"
 +  Separator   4   " "
 =  Separator   4-11    " and cat"
 +  Cat 12  ""
 -  Cat 12
-   Exp 1

You can see that the Separator expression "eats" all the remaining characters, including "cat", so there is nothing left to match against P"cat".
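The same behaviour can be checked without the debugger. The sketch below relies on two documented properties of lpeg.match: with no captures it returns the index of the first character after the match, and its optional third argument sets the starting position. The names subject, end_pos, at_end and at_cat are hypothetical, chosen only for this example.

```lua
local lpeg = require 'lpeg'
local P = lpeg.P

local subject = 'dog and cat'   -- 11 characters

-- No captures: the result is the position just past the match.
local end_pos = lpeg.match(P'dog' * P(1)^1, subject)
print(end_pos)   -- 12, i.e. the whole string was consumed

-- The lookahead tried from that position fails: nothing is left.
local at_end = lpeg.match(#P'cat', subject, end_pos)
print(at_end)    -- nil

-- Tried at position 9, where "cat" starts, it succeeds and consumes
-- nothing, so the returned position is still 9.
local at_cat = lpeg.match(#P'cat', subject, 9)
print(at_cat)    -- 9
```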

The output for the modified expression looks like this:

+   Exp 1   "d"
 +  Dog 1   "d"
 =  Dog 1-3 "dog"
 +  Separator   4   " "
 =  Separator   4-8 " and "
 +  Cat 9   "c"
 =  Cat 9-11    "cat"
=   Exp 1-8 "dog and "
/   Dog 1   0   
/   Separator   4   0   
/   Exp 1   1   "dog and "

Here is the complete script:

local lpeg = require 'lpeg'  -- bind locally; newer LPeg/Lua versions do not set a global
local peg = require 'pegdebug'
local str2 = 'dog and cat'
local patt2 = lpeg.P(peg.trace { "Exp";
  Exp = lpeg.C(lpeg.V"Dog" * lpeg.V"Separator") * #lpeg.V"Cat";
  Cat = lpeg.P("cat");
  Dog = lpeg.P("dog");
  Separator = (lpeg.P(1) - lpeg.P("cat"))^1;
})
print(lpeg.match(patt2, str2))
