Matching Nested Structures With Regular Expressions in Python

Question

I seem to remember that regular expressions in .NET have a special mechanism that allows nested structures to be matched correctly, such as the grouping in "( (a ( ( c ) b ) ) ( d ) e )".

What is the Python equivalent of this feature? Can it be achieved with regular expressions using some workaround? (It does seem to be the sort of problem that current regex implementations aren't designed for.)

Recommended answer

You can't do this in general using Python regular expressions. (.NET regular expressions have been extended with "balancing groups", which is what allows nested matches.)
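To make the limitation concrete, here is a minimal sketch using the standard-library re module. The names INNERMOST and ONE_LEVEL are made up for illustration. A pattern without recursion can pick out the innermost groups, or be written by hand for a fixed nesting depth, but it does not follow arbitrary depth:

```python
import re

data = "( (a ( ( c ) b ) ) ( d ) e )"

# Matches only groups that contain no further parentheses.
INNERMOST = re.compile(r"\([^()]*\)")

# Hand-written for at most one level of nesting inside the outer pair.
ONE_LEVEL = re.compile(r"\((?:[^()]|\([^()]*\))*\)")

innermost = INNERMOST.findall(data)
shallow_ok = ONE_LEVEL.fullmatch("(a (b) c)") is not None
deep_ok = ONE_LEVEL.fullmatch(data) is not None

print(innermost)   # the two innermost groups only
print(shallow_ok)  # one level of nesting is accepted
print(deep_ok)     # the deeper sample input is rejected
```

Each extra level of nesting would need the pattern to be expanded again by hand, which is why a parser is usually the better tool.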

However, PyParsing is a very nice package for this type of thing:

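from pyparsing import nestedExpr

# Sample input with parentheses nested several levels deep.
data = "( (a ( ( c ) b ) ) ( d ) e )"
# nestedExpr() uses "(" and ")" as delimiters by default; asList() returns plain nested lists.
print(nestedExpr().parseString(data).asList())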

The output is:

[[['a', [['c'], 'b']], ['d'], 'e']]
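If adding a dependency is not an option, roughly the same nested list can be built with a regular expression used only for tokenizing, plus an explicit stack to track depth. This is a sketch, not how PyParsing works internally. The function name parse_nested is hypothetical, and it assumes tokens are separated by whitespace or parentheses:

```python
import re


def parse_nested(text):
    """Parse parenthesized text into nested lists (illustrative helper)."""
    # A token is a single parenthesis or a run of non-space, non-paren characters.
    tokens = re.findall(r"[()]|[^\s()]+", text)
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for tok in tokens:
        if tok == "(":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return root


data = "( (a ( ( c ) b ) ) ( d ) e )"
result = parse_nested(data)
print(result)
```

The regular expression only splits the input into tokens here. The nesting itself is tracked by the stack, which is the part a plain regex cannot express.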

More on PyParsing:
