Multiline Python regex

Views: 74 | Published: 2021/6/4 19:44:05 | Tags: python, regex, multiline, regex-greedy

Question

I have a file structured like this:

A: some text
B: more text
even more text
on several lines
A: and we start again
B: more text
more
multiline text

I'm trying to find a regex that will split my file like this:

>>> re.findall(regex, f.read())
[('some text','more text','even more text\non several lines'),
 ('and we start again','more text', 'more\nmultiline text')]

So far, I've ended up with the following:

>>> re.findall('A:(.*?)\nB:(.*?)\n(.*?)', f.read(), re.DOTALL)
[(' some text', ' more text', ''), (' and we start again', ' more text', '')]

The multiline text is not captured. I guess this is because the lazy quantifier is really lazy and matches nothing, but if I take it out, the regex becomes really greedy:

>>> re.findall('A:(.*?)\nB:(.*?)\n(.*)', f.read(), re.DOTALL)
[(' some text',
' more text',
'even more text\non several lines\nA: and we start again\nB: more text\nmore\nmultiline text')]
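The two behaviours described above can be reproduced without a file. The sketch below assumes the file contents are held in a string named `text` (a stand-in for `f.read()`, not part of the original question) and compares the lazy and greedy versions of the last group.

```python
import re

# Sample text mirroring the file layout from the question
# (assumption: stands in for f.read()).
text = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# A lazy group at the very end of a pattern has nothing after it that
# forces it to grow, so it stops at once and captures the empty string.
lazy = re.findall(r"A:(.*?)\nB:(.*?)\n(.*?)", text, re.DOTALL)

# A greedy group with DOTALL runs to the end of the input and swallows
# the second record as well, leaving a single match.
greedy = re.findall(r"A:(.*?)\nB:(.*?)\n(.*)", text, re.DOTALL)

print(lazy)
print(len(greedy))
```

Neither quantifier is wrong on its own; the pattern simply lacks anything after the last group that says where a record ends.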

Does anyone have an idea? Thanks!

Recommended answer

You could tell the regex to stop matching at the next line that starts with A: (or at the end of the string):

re.findall(r'A:(.*?)\nB:(.*?)\n(.*?)(?=^A:|\Z)', f.read(), re.DOTALL|re.MULTILINE)
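As a self-contained sketch, the same pattern can be run against an in-memory string. The names `text`, `pattern`, and `records` are illustrative and not from the original answer. Note that the raw captures keep the space after each `A:` or `B:` and the trailing newline of the last group, so this version strips each part to get the output requested in the question.

```python
import re

# Sample text mirroring the file layout from the question
# (assumption: stands in for f.read()).
text = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# DOTALL lets "." cross newlines; MULTILINE lets "^" match at the start
# of every line. The lookahead ends the last group just before the next
# line that starts with "A:", or at the end of the string (\Z).
pattern = re.compile(
    r"A:(.*?)\nB:(.*?)\n(.*?)(?=^A:|\Z)",
    re.DOTALL | re.MULTILINE,
)

# Strip the leading space and trailing newline from each captured part.
records = [
    tuple(part.strip() for part in match)
    for match in pattern.findall(text)
]

print(records)
```

Because the lookahead does not consume any characters, the next `A:` line is still available as the start of the following match.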
