JavaScript regex library with incremental testing
Question
I'm looking for a JavaScript library (ideally a node.js package) that can check if a string matches a regular expression incrementally (i.e. one character at a time), and return indeterminate results. For example, say I have the following regex:
j.*s.*
And I want to test the string "javascript". I would like an API similar to the following:
var iregex = new IncrementalRegex('j.*s.*');
var matcher = iregex.createMatcher();
matcher.append('j');
matcher.test(); //returns "possible match"
matcher.append('a');
matcher.test(); //returns "possible match"
matcher.append('v'); matcher.append('a'); matcher.append('s');
matcher.test(); //returns "match found"
matcher.append('ript');
matcher.test(); //returns "match found"
Whereas if I tested the string "foo", I would expect something like this:
var matcher2 = iregex.createMatcher();
matcher2.append('f');
matcher2.test(); //returns "no match possible"
//At this point I wouldn't bother appending "oo" because I know that no match is possible.
Edit:
To be clear, append is building up the string being tested. A new matcher starts out testing against the empty string, and after a matcher.append('foo') it matches against foo. appendToString or buildUpString might have been better names to use.
Also, I have one idea of how this could potentially be done, but I haven't fully thought it through yet. Perhaps it is possible to build a "Potential match" regex from the original regex that will match strings if and only if they are the beginning of a string the original regex matches.
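For the example regex, such a "potential match" pattern can be written by hand. The sketch below assumes the original pattern is anchored to the whole string and is a plain concatenation, so each remaining part can be wrapped in a nested optional group. The names fullPattern and prefixPattern are illustrative only, and this hand derivation is not claimed to generalize to arbitrary regexes.

```javascript
// Original pattern, anchored so it must match the whole string.
const fullPattern = /^j.*s.*$/;

// Hand-derived prefix pattern (an assumption for this one regex): every
// remaining part of the concatenation is made optional, from the inside out,
// so the match may stop after any complete part.
const prefixPattern = /^(?:j(?:.*(?:s.*)?)?)?$/;

for (const s of ['', 'j', 'jav', 'javas', 'f', 'foo']) {
  console.log(JSON.stringify(s), 'prefix:', prefixPattern.test(s), 'full:', fullPattern.test(s));
}
```

Because of the `.*`, every string that starts with j (and contains no line terminator) is a prefix of some match, so for this regex the prefix pattern rejects only strings such as "f".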
Recommended answer
If your parser rules only use proper formal-language regular expressions (i.e. no backreferences, lookaheads or lookbehinds), you could translate them to NFAs (using Thompson's construction or the like) and then push each character through the standard two-stack NFA simulation algorithm: if there's no transition on the character, you've got "no"; if there is one and you've got a final state in your current state set, you've got "yes"; otherwise you've got "maybe".
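The state-set simulation described above can be sketched roughly as follows. This is a minimal illustration, not a full Thompson construction: it assumes a tiny regex subset (literal characters, `.`, and a postfix `*` on a single character, with no groups, alternation, classes, or escapes), it matches against the whole string, and its `.` matches any character. The class and method names are borrowed from the hypothetical API in the question.

```javascript
// Hypothetical sketch of incremental matching by NFA state-set simulation.
class IncrementalRegex {
  constructor(pattern) {
    // Each token is one NFA edge: { ch, star }. State i sits before token i;
    // state tokens.length is the accepting (final) state.
    this.tokens = [];
    for (const ch of pattern) {
      if (ch === '*') {
        if (this.tokens.length === 0) throw new SyntaxError('nothing to repeat');
        this.tokens[this.tokens.length - 1].star = true;
      } else {
        this.tokens.push({ ch, star: false });
      }
    }
  }

  // Epsilon closure: a starred token may be skipped without consuming input.
  closure(states) {
    const result = new Set();
    for (let s of states) {
      result.add(s);
      while (s < this.tokens.length && this.tokens[s].star) {
        s += 1;
        result.add(s);
      }
    }
    return result;
  }

  createMatcher() {
    const regex = this;
    let current = regex.closure([0]); // state set for the empty string
    return {
      // Feed more characters, advancing every live state in parallel.
      append(text) {
        for (const c of text) {
          const next = [];
          for (const s of current) {
            const t = regex.tokens[s];
            if (t && (t.ch === '.' || t.ch === c)) next.push(t.star ? s : s + 1);
          }
          current = regex.closure(next);
        }
      },
      // Empty set: "no"; set contains the final state: "yes"; otherwise "maybe".
      test() {
        if (current.size === 0) return 'no match possible';
        return current.has(regex.tokens.length) ? 'match found' : 'possible match';
      },
    };
  }
}

const matcher = new IncrementalRegex('j.*s.*').createMatcher();
matcher.append('j');
console.log(matcher.test()); // "possible match"
matcher.append('avas');
console.log(matcher.test()); // "match found"
```

Once the state set becomes empty it stays empty, so a caller can stop appending as soon as test() reports "no match possible". Supporting groups and alternation would need a real parser and a general NFA in place of the linear token list used here.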