Regex to match special list items II


Problem description

I have a weird list of items and nested lists that uses | as the delimiter and {{ }} as brackets. It looks like this:

| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | {{Ulist2 | item6 | item7 }} | item8 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14

I want to use a regex to match the items inside lists named Ulist* (items 4-8) and replace them with Uitem*. The result should look like this:

| item1 | item2 | item3 | {{Ulist1 | Uitem4 | Uitem5 | {{Ulist2 | Uitem6 | Uitem7 }} | Uitem8 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14

Update:

I tried a solution based on this question, but the answer from that question doesn't work when there is a list inside a Ulist. I am using Python 2.7; specifically, my code is:

#!/usr/bin/python
# -*- coding: utf-8  -*-
import regex
def repl(m):
    return "".join([x.replace("item", "Uitem") if x.startswith("{{Ulist") else x for x in regex.split(r'\{{2}(?=(\blist\d*))[^\}]*(?:}(?!})[^\}]*)*}}', m.group(0))])
text = "| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | {{Ulist2 | item6 | item7 }} | item8 | {{list4 | item15 | item16 }} | item17 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14"
rex = r'(\{\{(?=(Ulist\d*))(?>[^}{]|}(?!})|\{(?!\{)|(?1))*}})'
text = regex.sub(rex, repl, text)
print(text)

Recommended answer

Maybe this can get you started:

def parse(data):
    # Split on the delimiter; whitespace around each piece is discarded.
    items = [i.strip() for i in data.split('|')]
    newitems = []
    # Stack of flags: the top is True while the innermost open list is a Ulist.
    nest = [False]
    for item in items:
        if item.startswith('{{'):
            # Opening a list: remember whether it is a Ulist.
            nest.append(item.startswith('{{Ulist'))
            newitems.append(item)
        else:
            if item.startswith('item') and nest[-1]:
                newitems.append('U' + item)
            else:
                newitems.append(item)
        if item.endswith('}}'):
            # Closing the innermost list.
            nest.pop()
    return ' | '.join(newitems)

Basically, it splits the data on the delimiter (|) and makes a single pass over the pieces, converting where appropriate and keeping state in a stack called nest to decide when it should convert. It assumes that whitespace surrounding the delimiters isn't significant.
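
The stack idea described above can be tried end to end with a small self-contained sketch. The function name convert_items is hypothetical, and two details are assumptions that go beyond the answer: a single piece may close several lists at once (for example "item7 }} }}"), and the joined result is stripped so that a leading delimiter does not pick up an extra space.

```python
def convert_items(data):
    """Prefix 'U' to items whose innermost enclosing list is a Ulist.

    Hypothetical helper; a sketch of the stack-based approach, not the
    answer's exact code.
    """
    stack = [False]  # stack[-1] is True while directly inside a Ulist
    out = []
    for raw in data.split('|'):
        token = raw.strip()
        if token.startswith('{{'):
            # Opening a list: record whether it is a Ulist.
            stack.append(token.startswith('{{Ulist'))
        elif token.startswith('item') and stack[-1]:
            token = 'U' + token
        # Assumption: one token may close several lists ("item7 }} }}").
        rest = token
        while rest.endswith('}}'):
            rest = rest[:-2].rstrip()
            if len(stack) > 1:  # never pop the top-level sentinel
                stack.pop()
        out.append(token)
    # A leading '|' yields an empty first token; strip the stray space.
    return ' | '.join(out).strip()


if __name__ == '__main__':
    text = ("| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | "
            "{{Ulist2 | item6 | item7 }} | item8 | {{list4 | item15 | "
            "item16 }} | item17 }} | item9 | {{list3 | item10 | item11 | "
            "item12 }} | item13 | item14")
    print(convert_items(text))
```

With the test string from the question, item15 and item16 stay unchanged because their innermost list is list4, while item17 is converted because closing list4 brings Ulist1 back to the top of the stack.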
