可以使用正则表达式来匹配嵌套模式吗? [英] Can regular expressions be used to match nested patterns?

查看:184
本文介绍了可以使用正则表达式来匹配嵌套模式吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否可以编写与嵌套模式(发生次数未知)匹配的正则表达式?例如,当外括号内嵌套的打开/关闭括号的数量未知时,正则表达式可以匹配打开和关闭括号吗?

Is it possible to write a regular expression that matches a nested pattern that occurs an unknown number of times? For example, can a regular expression match an opening and closing brace when there are an unknown number of open/close braces nested within the outer braces?

例如:

public MyMethod()
{
  if (test)
  {
    // More { }
  }

  // More { }
} // End

应该匹配:

{
  if (test)
  {
    // More { }
  }

  // More { }
}

推荐答案

否.就这么简单.有限自动机(即正则表达式基础的数据结构)除了处于其所在的状态外没有其他内存,如果您具有任意深度的嵌套,则需要一个任意大的自动机,该自动机会与的概念相冲突有限的自动机.

No. It's that easy. A finite automaton (which is the data structure underlying a regular expression) does not have memory apart from the state it's in, and if you have arbitrarily deep nesting, you need an arbitrarily large automaton, which collides with the notion of a finite automaton.

您可以将嵌套/配对的元素匹配到固定深度,该深度仅受内存限制,因为自动机非常大.但是实际上,您应该使用下推式自动机,即用于上下文无关文法的解析器,例如LL(自上而下)或LR(自下而上).您必须考虑更糟糕的运行时行为:O(n ^ 3)与O(n),其中n = length(input).

You can match nested/paired elements up to a fixed depth, where the depth is only limited by your memory, because the automaton gets very large. In practice, however, you should use a push-down automaton, i.e a parser for a context-free grammar, for instance LL (top-down) or LR (bottom-up). You have to take the worse runtime behavior into account: O(n^3) vs. O(n), with n = length(input).

有许多可用的解析器生成器,例如Java的 ANTLR .查找Java(或C)的现有语法也不难.
有关更多背景信息:Wikipedia中的自动机理论

There are many parser generators avialable, for instance ANTLR for Java. Finding an existing grammar for Java (or C) is also not difficult.
For more background: Automata Theory at Wikipedia

这篇关于可以使用正则表达式来匹配嵌套模式吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆