How to get an expression between balanced parentheses


Question

Suppose I am given the following kind of string:

"(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"

and I want to extract the substrings contained within the topmost layer of parentheses, i.e. I want to obtain the strings "this is (haha) a string(()and it's sneaky)" and "lorem".

Is there a nice Pythonic way to do this? It is not obvious that regular expressions are up to this task, but maybe there is a way to get an XML parser to do the job? For my application I can assume the parentheses are well formed, i.e. not something like (()(().

Answer

This is a standard use case for a stack: you read the string character by character, and whenever you encounter an opening parenthesis, you push a symbol onto the stack; when you encounter a closing parenthesis, you pop a symbol from the stack.
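For illustration, here is a minimal sketch of the explicit stack variant. It assumes well-formed input, as stated in the question. The function name `top_level_groups_with_stack` is made up for this example, and the stack holds the index of each opening parenthesis rather than the character itself, so that the matching text can be sliced out.

```python
def top_level_groups_with_stack(text):
    """Return the substrings enclosed by top-level parentheses.

    Assumes the parentheses in `text` are well formed; an unmatched ')'
    would make stack.pop() raise IndexError.
    """
    stack = []   # indices of the '(' characters that are currently open
    groups = []
    for i, ch in enumerate(text):
        if ch == '(':
            stack.append(i)
        elif ch == ')':
            start = stack.pop()  # index of the matching '('
            if not stack:        # stack is empty again: a top-level group just closed
                groups.append(text[start + 1:i])
    return groups


example = "(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"
print(top_level_groups_with_stack(example))
```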

Since you only have a single type of parenthesis, you don't actually need a stack; instead, it's enough to remember how many parentheses are currently open.

In addition, in order to extract the texts, we remember where a part starts when a first-level parenthesis opens, and collect the resulting string when we encounter the matching closing parenthesis.

This could look like this:

string = "(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"

stack = 0
startIndex = None
results = []

for i, c in enumerate(string):
    if c == '(':
        if stack == 0:
            startIndex = i + 1 # string to extract starts one index later

        # push to stack
        stack += 1
    elif c == ')':
        # pop stack
        stack -= 1

        if stack == 0:
            results.append(string[startIndex:i])

print(results)
# ["this is (haha) a string(()and it's sneaky)", 'lorem']
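
If the input cannot be trusted to be well formed, the same counter can also be used to detect unbalanced parentheses. The following is only a sketch under that assumption: the function name `extract_top_level` and the choice to raise `ValueError` are illustrative and not part of the original answer.

```python
def extract_top_level(text):
    """Return the top-level parenthesized substrings of `text`.

    Raises ValueError if the parentheses are not balanced.
    """
    depth = 0      # number of currently open parentheses
    start = None   # index just after the current top-level '('
    results = []

    for i, c in enumerate(text):
        if c == '(':
            if depth == 0:
                start = i + 1
            depth += 1
        elif c == ')':
            if depth == 0:
                raise ValueError(f"unmatched ')' at index {i}")
            depth -= 1
            if depth == 0:
                results.append(text[start:i])

    if depth != 0:
        raise ValueError(f"{depth} unclosed '(' in input")
    return results


print(extract_top_level("(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"))
```

With the malformed example from the question, `(()(()`, this version raises `ValueError` instead of silently returning an incomplete result.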
