可以使用正则表达式匹配嵌套模式吗? [英] Can regular expressions be used to match nested patterns?

查看:39
本文介绍了可以使用正则表达式匹配嵌套模式吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否可以编写一个正则表达式来匹配出现次数未知的嵌套模式?例如,当外大括号内嵌套了未知数量的左大括号时,正则表达式是否可以匹配左大括号和右大括号?

Is it possible to write a regular expression that matches a nested pattern that occurs an unknown number of times? For example, can a regular expression match an opening and closing brace when there are an unknown number of open/close braces nested within the outer braces?

例如:

public MyMethod()
{
  if (test)
  {
    // More { }
  }

  // More { }
} // End

应该匹配:

{
  if (test)
  {
    // More { }
  }

  // More { }
}

推荐答案

没有.就这么简单.有限自动机(它是正则表达式底层的数据结构)除了它所处的状态之外没有内存,如果你有任意深度的嵌套,你需要一个任意大的自动机,这与 的概念相冲突有限自动机.

No. It's that easy. A finite automaton (which is the data structure underlying a regular expression) does not have memory apart from the state it's in, and if you have arbitrarily deep nesting, you need an arbitrarily large automaton, which collides with the notion of a finite automaton.

您可以将嵌套/配对元素匹配到固定深度,其中深度仅受您的内存限制,因为自动机变得非常大.但是,在实践中,您应该使用下推自动机,即上下文无关文法的解析器,例如 LL(自顶向下)或 LR(自底向上).你必须考虑更糟糕的运行时行为:O(n^3) vs. O(n),n = length(input).

You can match nested/paired elements up to a fixed depth, where the depth is only limited by your memory, because the automaton gets very large. In practice, however, you should use a push-down automaton, i.e a parser for a context-free grammar, for instance LL (top-down) or LR (bottom-up). You have to take the worse runtime behavior into account: O(n^3) vs. O(n), with n = length(input).

有许多可用的解析器生成器,例如用于 Java 的 ANTLR.为 Java(或 C)寻找现有的语法也不难.
有关更多背景信息:自动机理论 维基百科

There are many parser generators avialable, for instance ANTLR for Java. Finding an existing grammar for Java (or C) is also not difficult.
For more background: Automata Theory at Wikipedia

这篇关于可以使用正则表达式匹配嵌套模式吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆