Regex to match special list items II
Question
I have a weird list of items and nested lists, with | as the delimiter and {{ }} as the brackets. It looks like this:
| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | {{Ulist2 | item6 | item7 }} | item8 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14
I want to use a regex to match the items inside lists named Ulist* (items 4-8) and replace them with Uitem*. The result should look like this:
| item1 | item2 | item3 | {{Ulist1 | Uitem4 | Uitem5 | {{Ulist2 | Uitem6 | Uitem7 }} | Uitem8 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14
Update:
I tried a solution based on this question, but the answer from that question doesn't work if there is a list inside a Ulist. This is Python 2.7; specifically, my code is:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import regex

def repl(m):
    return "".join([x.replace("item", "Uitem") if x.startswith("{{Ulist") else x for x in regex.split(r'\{{2}(?=(\blist\d*))[^\}]*(?:}(?!})[^\}]*)*}}', m.group(0))])

text = "| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | {{Ulist2 | item6 | item7 }} | item8 | {{list4 | item15 | item16 }} | item17 }} | item9 | {{list3 | item10 | item11 | item12 }} | item13 | item14"
rex = r'(\{\{(?=(Ulist\d*))(?>[^}{]|}(?!})|\{(?!\{)|(?1))*}})'
text = regex.sub(rex, repl, text)
print(text)
Recommended answer
Maybe this can get you started:
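def parse(data):
    # Split on the delimiter; surrounding whitespace is discarded.
    items = [i.strip() for i in data.split('|')]
    newitems = []
    # Stack of flags: the top entry is True while directly inside a Ulist*.
    nest = [False]
    for item in items:
        if item.startswith('{{'):
            # A new list opens: remember whether it is a Ulist*.
            if item.startswith('{{Ulist'):
                nest.append(True)
            else:
                nest.append(False)
            newitems.append(item)
        else:
            if item.startswith('item') and nest[-1]:
                newitems.append('U' + item)
            else:
                newitems.append(item)
        # A trailing '}}' closes the innermost list.
        if item.endswith('}}'):
            nest.pop()
    return ' | '.join(newitems)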
Basically, it splits the data on the delimiter (|) and makes a single pass over the pieces, converting where appropriate and keeping state in a stack called nest to determine when it should convert. It assumes that whitespace around the delimiters isn't significant.
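If the original spacing does matter, the same stack idea can be applied to a token stream that keeps the separators. The following is only a sketch under a few assumptions: the function name convert_preserving_whitespace is made up for illustration, the standard re module is used instead of regex, and items in a plain list nested inside a Ulist are assumed to stay unchanged (as in the code above).

```python
import re

def convert_preserving_whitespace(data):
    """Prefix 'item' tokens directly inside Ulist* lists with 'U'.

    Hypothetical variant of the stack-based approach; it keeps the
    original whitespace instead of normalising it.
    """
    # The capturing group makes re.split keep the separators, so the
    # result alternates: text, separator, text, separator, ..., text.
    tokens = re.split(r'(\{\{|\}\}|\|)', data)
    nest = [False]    # nest[-1] is True while directly inside a Ulist*
    opening = False   # True right after '{{': the next text is the list name
    out = []
    for tok in tokens:
        if tok == '{{':
            opening = True
        elif tok == '}}':
            if len(nest) > 1:   # ignore an unbalanced closing bracket
                nest.pop()
        elif tok == '|':
            pass
        elif opening:
            # Text right after '{{' names the list; record whether it is a Ulist*.
            nest.append(tok.strip().startswith('Ulist'))
            opening = False
        elif nest[-1]:
            # Insert 'U' after any leading whitespace, only if the token starts with 'item'.
            tok = re.sub(r'^(\s*)item', r'\1Uitem', tok)
        out.append(tok)
    return ''.join(out)

text = ("| item1 | item2 | item3 | {{Ulist1 | item4 | item5 | {{Ulist2 | item6 "
        "| item7 }} | item8 }} | item9 | {{list3 | item10 | item11 | item12 }} "
        "| item13 | item14")
print(convert_preserving_whitespace(text))
```

Because every token is appended back in order, joining them reproduces the input exactly apart from the inserted U prefixes. The len(nest) > 1 guard is a defensive choice for malformed input and is not part of the original answer.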