How to match a regex with backreference in Go?


Problem description

I need to match a regex that uses backreferences (e.g. \1) in my Go code.

That's not so easy because in Go, the official regexp package uses the RE2 engine, which deliberately does not support backreferences (or some other lesser-known features) so that it can guarantee linear-time execution and thereby avoid regex denial-of-service attacks. Enabling backreference support is not an option with RE2.

In my code, there is no risk of malicious exploitation by attackers, and I need backreferences.

What should I do?
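The limitation described above can be reproduced with a few lines. This is a minimal sketch; the helper name compiles is made up for illustration and is not part of the regexp package.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper that reports whether Go's regexp
// package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Plain capture groups are supported.
	fmt.Println(compiles(`(\w+) (\w+)`))
	// A backreference such as \1 is rejected when the pattern is compiled.
	fmt.Println(compiles(`(\w+) \1`))
}
```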

Solution

Regular expressions are great for working with regular grammars, but if your grammar isn't regular (i.e. it requires backreferences and the like), you should probably switch to a better tool. There are a lot of good tools available for parsing context-free grammars, including yacc, which shipped with the Go distribution as "go tool yacc" until Go 1.8 and is now available as goyacc in golang.org/x/tools. Alternatively, you can write your own parser; recursive descent parsers, for example, are easy to write by hand.
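For simple cases, the backreference check can be done in ordinary Go code on top of a backreference-free regex. The sketch below is one possible approach, not the only one: it roughly emulates the pattern \b(\w+) \1\b (a word repeated after a single space) by letting regexp find the words and comparing neighbours by hand. The names wordRE and findDoubledWord are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordRE finds runs of word characters; it contains no backreference,
// so the standard regexp package accepts it.
var wordRE = regexp.MustCompile(`\w+`)

// findDoubledWord returns the first word that is immediately followed by
// exactly one space and the same word again. The equality test that a
// backreference would perform is done here with a plain string comparison.
func findDoubledWord(s string) (string, bool) {
	locs := wordRE.FindAllStringIndex(s, -1)
	for i := 0; i+1 < len(locs); i++ {
		a, b := locs[i], locs[i+1]
		first, second := s[a[0]:a[1]], s[b[0]:b[1]]
		separator := s[a[1]:b[0]]
		if separator == " " && first == second {
			return first, true
		}
	}
	return "", false
}

func main() {
	word, ok := findDoubledWord("this is is a test")
	fmt.Println(word, ok)
}
```

The same idea carries over to other patterns: capture the candidate pieces with separate groups, then compare them in Go instead of asking the regex engine to do it.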

I think regular expressions are overused in scripting languages (like Perl, Python, Ruby, ...) because their C/ASM-powered implementations are usually more optimized than code written in those languages themselves, but Go isn't such a language. Regular expressions are usually quite slow and are often not suited to the problem at all.
