How to do lookahead properly with LPeg
Question
To match a string starting with dog, followed by cat (but not consuming cat), this works:
local lpeg = require 'lpeg'
local str1 = 'dogcat'
local patt1 = lpeg.C(lpeg.P('dog')) * #lpeg.P('cat')
print(lpeg.match(patt1, str1))
Output: dog
To match a string starting with dog, followed by any sequence of characters, then followed by cat (but not consuming it), like the regex lookahead (dog.+?)(?=cat), I tried this:
local str2 = 'dog and cat'
local patt2 = lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1) * #lpeg.P("cat")
print(lpeg.match(patt2, str2))
My expected result is "dog and ", but it returns nil.
If I throw away the lookahead part (i.e., use the pattern lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1)), it matches the whole string successfully. This means the * lpeg.P(1) ^ 1 part matches any character sequence correctly, doesn't it?
How can I fix this?
Recommended answer
You need to exclude "cat" at each position the repetition can match, so that it stops just before "cat":
local patt2 = lpeg.C(lpeg.P"dog" * (lpeg.P(1)-lpeg.P"cat") ^ 1) * #lpeg.P"cat"
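As a quick check, the corrected pattern can be run against a few inputs. This is only a minimal sketch: the shorthand names P, C, cat, and patt are chosen here for illustration, and the sample strings other than 'dog and cat' are not from the question.

```lua
local lpeg = require 'lpeg'
local P, C = lpeg.P, lpeg.C

local cat = P'cat'
-- "dog", then one or more characters that do not begin a "cat",
-- then a lookahead that requires "cat" without consuming it.
local patt = C(P'dog' * (P(1) - cat)^1) * #cat

-- %q makes the trailing space in the capture visible.
print(string.format('%q', patt:match('dog and cat')))
print(patt:match('dog and bird')) -- no "cat" anywhere, so the lookahead fails
print(patt:match('dogcat'))       -- ^1 needs at least one character before "cat"
```

If an empty separator should also be accepted (so that 'dogcat' matches), ^0 can be used in place of ^1.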
I think it's appropriate to plug the debugger I've been working on (pegdebug), as it helps in cases like this. Here is the output it generates for the original LPeg expression:
+ Exp 1 "d"
+ Dog 1 "d"
= Dog 1-3 "dog"
+ Separator 4 " "
= Separator 4-11 " and cat"
+ Cat 12 ""
- Cat 12
- Exp 1
You can see that the Separator expression "eats" all the characters, including "cat", so there is nothing left to match against P"cat".
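The behaviour in the trace above comes from LPeg repetitions being greedy with no backtracking: once lpeg.P(1)^1 has consumed a character, it never gives it back to let a later pattern succeed. A small sketch of this, using the tail of the subject string (the names P and rest are illustrative only):

```lua
local lpeg = require 'lpeg'
local P = lpeg.P

local rest = ' and cat'

-- P(1)^1 runs to the end of the subject; match returns the position
-- just past the last character it consumed.
print(lpeg.match(P(1)^1, rest))

-- The lookahead is then tried at the end of the string, where no "cat"
-- remains, and LPeg does not retry with a shorter repetition.
print(lpeg.match(P(1)^1 * #P'cat', rest))
```

This is the main difference from the regex .+?, which is allowed to grow or shrink until the rest of the expression matches.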
The output for the modified expression looks like this:
+ Exp 1 "d"
+ Dog 1 "d"
= Dog 1-3 "dog"
+ Separator 4 " "
= Separator 4-8 " and "
+ Cat 9 "c"
= Cat 9-11 "cat"
= Exp 1-8 "dog and "
/ Dog 1 0
/ Separator 4 0
/ Exp 1 1 "dog and "
Here is the full script:
local lpeg = require 'lpeg' -- bind locally; newer LPeg versions do not set a global
local peg = require 'pegdebug'
local str2 = 'dog and cat'
local patt2 = lpeg.P(peg.trace { "Exp";
Exp = lpeg.C(lpeg.V"Dog" * lpeg.V"Separator") * #lpeg.V"Cat";
Cat = lpeg.P("cat");
Dog = lpeg.P("dog");
Separator = (lpeg.P(1) - lpeg.P("cat"))^1;
})
print(lpeg.match(patt2, str2))
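For comparison, the same pattern can also be written in the regex-like syntax of the re module that is distributed with LPeg. This is a sketch that assumes re is installed alongside lpeg; it is not part of the original answer.

```lua
local re = require 're'

-- { ... } is a simple capture, !'cat' is a negative lookahead,
-- . matches any single character, and &'cat' is a positive lookahead.
local patt = re.compile("{ 'dog' (!'cat' .)+ } &'cat'")

-- %q makes the trailing space in the capture visible.
print(string.format('%q', patt:match('dog and cat')))
```

Here (!'cat' .)+ plays the role of (lpeg.P(1) - lpeg.P("cat"))^1: each step first checks that "cat" does not start at the current position and only then consumes one character.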