Python regex for repeating string

Problem description

I want to verify and then parse this string (in quotes):

string = "start: c12354, c3456, 34526; other stuff that I don't care about"
# Note that some codes begin with 'c'

I would like to verify that the string starts with 'start:' and that the list of codes ends with ';'. Afterward, I would like a regex to parse out the individual codes. I tried the following Python re code:

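import re

# ' ?' allows the space that follows each comma in the sample string
regx = r"start: (c?[0-9]+,? ?)+;"
reg = re.compile(regx)
matched = reg.search(string)
# The repeated group keeps only its last repetition
print('matched.groups()', matched.groups())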

I have tried different variations, but I can only get either the first or the last code, not a list of all three.

Or should I abandon using a regex?

Updated to reflect part of the problem space I neglected and to fix a discrepancy in the string. Thanks for all the suggestions, and in such a short time.

Recommended answer

In Python, this isn't possible with a single regular expression in the standard re module: each capture of a group overwrites the previous capture of that same group. (In .NET, this would actually be possible, since the engine distinguishes between captures and groups.)
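The overwriting behaviour can be seen in a small sketch. The input "1,2,3" and the pattern are made up for illustration and are not taken from the question:

```python
import re

# Illustrative input and pattern (assumptions, not from the question):
# a capturing group repeated by an outer non-capturing group.
m = re.match(r"(?:(\d+),?)+", "1,2,3")

whole = m.group(0)   # everything the pattern consumed
last = m.group(1)    # only the final repetition of the capturing group

print(whole)
print(last)
```

Although the group matched three times, only its final repetition is available through group(1), which is why the attempt in the question never yields all three codes.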

The easiest solution is to first extract the part between start: and ; and then run a regular expression over that part that returns all matches rather than a single one, using re.findall('c?[0-9]+', text).
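A minimal sketch of this two-step approach follows. The helper name parse_codes and the exact validation pattern are assumptions made for illustration; only the re.findall('c?[0-9]+', text) call comes from the answer itself:

```python
import re

def parse_codes(text):
    """Return the codes between 'start:' and ';', or None if the text
    does not have that shape. (Hypothetical helper, not from the answer.)"""
    # Step 1: validate the overall shape and capture the list portion.
    # The pattern is an assumption: codes separated by commas, optional spaces.
    m = re.match(r"start:\s*((?:c?[0-9]+\s*,\s*)*c?[0-9]+)\s*;", text)
    if m is None:
        return None
    # Step 2: pull every code out of the captured portion.
    return re.findall(r"c?[0-9]+", m.group(1))

string = "start: c12354, c3456, 34526; other stuff that I don't care about"
print(parse_codes(string))  # ['c12354', 'c3456', '34526']
```

Keeping validation and extraction as separate steps means the first pattern only has to decide whether the string is well formed, while findall collects the codes.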
