Lua pattern matching vs. regular expressions

Question

I'm currently learning Lua. Regarding pattern matching in Lua, I found the following sentence in the Lua documentation on lua.org:

Nevertheless, pattern matching in Lua is a powerful tool and includes some features that are difficult to match with standard POSIX implementations.

As I'm familiar with POSIX regular expressions, I would like to know whether there are any common examples where Lua pattern matching is "better" than regular expressions, or did I misinterpret the sentence? And if there are such examples, why is one of the two better suited than the other?

Recommended answer

Are there any common samples where Lua pattern matching is "better" compared to regular expressions?

It is not so much particular examples as that Lua patterns have a higher signal-to-noise ratio than POSIX regular expressions. It is the overall design that is often preferable, not particular examples.

Here are some factors that contribute to the good design:

  • Very lightweight syntax for matching common character types including uppercase letters (%u), decimal digits (%d), space characters (%s) and so on. Any character type can be complemented by using the corresponding capital letter, so pattern %S matches any nonspace character.

  • Quoting is extremely simple and regular. The quoting character is %, so it is always distinct from the string-quoting character \, which makes Lua patterns much easier to read than POSIX regular expressions (when quoting is necessary). It is always safe to quote symbols, and it is never necessary to quote letters, so you can just go by that rule of thumb instead of memorizing which symbols are special metacharacters.

  • Lua offers "captures" and can return multiple captures as the result of a match call. This interface is much, much better than capturing substrings through side effects or having some hidden state that has to be interrogated to find captures. Capture syntax is simple: just use parentheses.

  • Lua has a "shortest match" modifier, -, to go along with the "longest match" operator, *. So for example s:find '%s(%S-)%.' finds the shortest sequence of nonspace characters that is preceded by a space and followed by a dot.
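
The four points above can be seen together in a short sketch. The sample strings and variable names are made up for illustration; only the standard string.match and string.find functions are used.

```lua
local s = "Order 42 shipped to Berlin."

-- Character classes: %d is a decimal digit, %u an uppercase letter,
-- %s a space character; the capital form (%S) is the complement.
local digits = s:match("%d+")   -- first run of digits      --> "42"
local word   = s:match("%S+")   -- first run of nonspace    --> "Order"
local upper  = s:match("%u")    -- first uppercase letter   --> "O"

-- Quoting: % escapes any symbol, so %. is a literal dot
-- and %- is a literal hyphen.
local pi_text = ("pi is 3.14"):match("%d+%.%d+")   --> "3.14"

-- Captures: each pair of parentheses becomes one return value of match.
local year, month, day = ("2024-05-17"):match("(%d+)%-(%d+)%-(%d+)")

-- Shortest vs. longest match: - takes as little as possible, * as much.
local _, _, last_word = s:find("%s(%S-)%.")         --> "Berlin"
local lazy   = (" a.b.c."):match("%s(%S-)%.")       --> "a"
local greedy = (" a.b.c."):match("%s(%S*)%.")       --> "a.b.c"

print(digits, word, upper)
print(pi_text)
print(year, month, day)
print(last_word, lazy, greedy)
```

Note that the captures come back as ordinary return values, so they can be assigned directly to local variables with no global match state involved.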

The expressive power of Lua patterns is comparable to POSIX "basic" regular expressions, without the alternation operator |. What you are giving up is "extended" regular expressions with |. If you need that much expressive power, I recommend going all the way to LPEG, which gives you essentially the power of context-free grammars at quite reasonable cost.
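
For simple cases, the missing | can be worked around in plain Lua by trying several patterns in turn. The sketch below is one possible approach, not something from the answer above; match_any is a hypothetical helper, not part of the Lua standard library.

```lua
-- Hypothetical helper (not a standard library function): emulate a
-- simple alternation by trying each pattern in order and returning
-- the first match together with the pattern that produced it.
local function match_any(s, patterns)
  for _, p in ipairs(patterns) do
    local first, last = s:find(p)
    if first then
      return s:sub(first, last), p
    end
  end
  return nil
end

local hit, used = match_any("image.jpeg", { "%.png$", "%.jpe?g$" })
print(hit, used)

local miss = match_any("notes.txt", { "%.png$", "%.jpe?g$" })
print(miss)
```

This only approximates alternation: it prefers the first pattern in the list that matches anywhere, rather than the leftmost match across all alternatives, and it cannot express alternation nested inside a larger pattern. That limitation is where a tool such as LPEG becomes the better fit.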
