Get all nested curly braces

Question

Is it possible to get all of the content in nested curly braces from a string? For example:

The {quick} brown fox {jumps {over the} lazy} dog

So I need:

  • quick
  • over the
  • jumps {over the} lazy

Preferably in this order, starting from the most deeply nested.

Recommended answer

Solution

The regex below captures the content of every pair of nested curly braces. Note that it assumes the nested curly braces are balanced; otherwise, it is hard to define what the answer should be.

(?=\{((?:[^{}]++|\{(?1)\})++)\})

The result will be in capturing group 1.

Demo

Note, however, that the order is not the one specified in the question. Matches are reported in the order in which the opening curly brace { appears, which means that the content of the outermost pair is printed first.

Ignoring the zero-width positive look-ahead (?=pattern) for now, let us focus on the pattern inside it, which is:

\{((?:[^{}]++|\{(?1)\})++)\}

The part between the two literal curly braces, ((?:[^{}]++|\{(?1)\})++), matches one or more instances of either:

  • a non-empty sequence of non-curly-brace characters, [^{}]++, or
  • a recursively matched block enclosed by {}, which may itself contain further non-curly-brace sequences or other blocks.

On its own, that inner pattern can also match text that contains no {} at all, which we don't need. Wrapping it in a pair of literal curly braces ensures that a match is always a block enclosed by {}: \{((?:[^{}]++|\{(?1)\})++)\}.

Since we want the content inside every pair of nested curly braces, we need to prevent the engine from consuming the text. That is where the zero-width positive look-ahead comes into play: it matches without advancing past the block, so the engine can match again at the next opening brace inside it.
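
To make the mechanics concrete, here is a minimal Python sketch that imitates the pattern by hand. Python's standard re module does not support the subroutine recursion (?1), so this is a hand-written stand-in and not the regex itself. The function names match_block and nested_contents are hypothetical. The first function mirrors \{((?:[^{}]++|\{(?1)\})++)\} anchored at one position, and the second plays the role of the look-ahead by trying every position without skipping past a successful match.

```python
def match_block(text, pos):
    # Hand-written imitation of the pattern anchored at `pos`:
    # a '{', then one or more pieces, then a '}'.
    # Returns (content_start, content_end) or None if there is no match.
    if pos >= len(text) or text[pos] != "{":
        return None
    i = pos + 1
    pieces = 0
    while i < len(text) and text[i] != "}":
        if text[i] == "{":
            # Second alternative: a nested block, matched recursively.
            inner = match_block(text, i)
            if inner is None:
                return None
            i = inner[1] + 1  # continue after the inner closing brace
        else:
            # First alternative: a run of non-brace characters.
            while i < len(text) and text[i] not in "{}":
                i += 1
        pieces += 1
    # The pattern needs at least one piece and a closing brace.
    if pieces == 0 or i >= len(text):
        return None
    return pos + 1, i


def nested_contents(text):
    # Imitates the look-ahead: try every position and never consume text,
    # so blocks nested inside an earlier match are still found.
    results = []
    for pos in range(len(text)):
        span = match_block(text, pos)
        if span is not None:
            results.append(text[span[0]:span[1]])
    return results


print(nested_contents("The {quick} brown fox {jumps {over the} lazy} dog"))
# prints ['quick', 'jumps {over the} lazy', 'over the']
```

The results come out in the order of the opening braces, and each nested block is scanned again for every block that encloses it.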

This is not very efficient, since the engine re-matches the inner braces for every enclosing block, but I doubt there is any other general regex solution that handles this efficiently.

Ordinary code can handle everything efficiently in one pass, and is recommended if you are going to extend your requirements in the future.
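
As an illustration of the one-pass approach, the following Python sketch keeps a stack of the positions of open braces and records a block each time a closing brace is seen. The function name brace_contents is hypothetical, and raising ValueError on unbalanced input is an assumption made for this sketch, since the answer leaves that case undefined.

```python
def brace_contents(text):
    stack = []    # indices of the '{' characters that are still open
    results = []  # block contents, in the order their '}' is reached
    for i, ch in enumerate(text):
        if ch == "{":
            stack.append(i)
        elif ch == "}":
            if not stack:
                raise ValueError(f"unmatched '}}' at index {i}")
            start = stack.pop()
            results.append(text[start + 1:i])
    if stack:
        raise ValueError(f"unmatched '{{' at index {stack[-1]}")
    return results


print(brace_contents("The {quick} brown fox {jumps {over the} lazy} dog"))
# prints ['quick', 'over the', 'jumps {over the} lazy']
```

Because a block is recorded when its closing brace is reached, inner blocks always come before the blocks that enclose them, which is the order the question asks for. Unlike the regex, this sketch also reports an empty pair {} as an empty string.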
