Regex for nested parentheses in Python


Question

I have something like this:

Othername California (2000) (T) (S) (ok) {state (#2.1)}

Is there a regex that would produce:

Othername California ok 2.1

That is, I would like to keep the number inside the round parentheses that are themselves inside {}, and keep the text "ok" that appears inside (). I specifically need the string "ok" to be printed when it is present in a line, but I want to drop any other parenthesized text, e.g. (V), (S) or (2002).

I am aware that regex is probably not the most efficient way to handle such a problem.

Any help would be appreciated.

EDIT:

The string may vary, because information that is unavailable is simply left out of the line. The text itself also varies (e.g. I don't have "state" in every line). So one can have, for example:

Name1 Name2 Name3 (2000) (ok) {edu (#1.1)}
Name1 Name2 (2002) {edu (#1.1)}
Name1 Name2 Name3 (2000) (V) {variation (#4.12)}

Solution

Regex

(.+)\s+\(\d+\).+?(?:\(([^)]{2,})\)\s+(?={))?\{.+\(#(\d+\.\d+)\)\}
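
The pattern is dense, so it may help to read it piece by piece. What follows is only a sketch: the same expression written with re.VERBOSE so each part can carry a comment. The names PATTERN, line and m are illustrative and not part of the original answer, and the literal # is escaped because verbose mode would otherwise treat it as the start of a comment.

```python
import re

# Same pattern as above, split up with re.VERBOSE (PATTERN is an illustrative name).
PATTERN = re.compile(r"""
    (.+)                 # group 1: the name, everything before the year
    \s+\(\d+\)           # the year in parentheses, e.g. (2000); not captured
    .+?                  # lazily skip short tags such as (T) or (V)
    (?:
        \(([^)]{2,})\)   # group 2: parenthesized text of 2+ characters, e.g. (ok)
        \s+(?={)         # ...only when it sits directly before the opening brace
    )?                   # the whole tag is optional
    \{.+                 # opening brace and the label, e.g. {state
    \(\#(\d+\.\d+)\)     # group 3: the number after the hash sign, e.g. 2.1
    \}                   # closing brace
""", re.VERBOSE)

line = "Othername California (2000) (T) (S) (ok) {state (#2.1)}"
m = PATTERN.search(line)
print(m.groups())  # ('Othername California', 'ok', '2.1')
```

Single-character tags such as (T), (S) and (V) are skipped because group 2 requires at least two characters between the parentheses.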

Text used for test

Name1 Name2 Name3 (2000) {Education (#3.2)}
Name1 Name2 Name3 (2000) (ok) {edu (#1.1)}
Name1 Name2 (2002) {edu (#1.1)}
Name1 Name2 Name3 (2000) (V) {variation (#4.12)}
Othername California (2000) (T) (S) (ok) {state (#2.1)}

Test

>>> import re
>>> # string holds the five test lines shown above
>>> regex = re.compile(r"(.+)\s+\(\d+\).+?(?:\(([^)]{2,})\)\s+(?={))?\{.+\(#(\d+\.\d+)\)\}")
>>> r = regex.search(string)
>>> r
<_sre.SRE_Match object at 0x54e2105f36c16a48>
>>> regex.match(string)
<_sre.SRE_Match object at 0x54e2105f36c169e8>

# Run findall
>>> regex.findall(string)
[
   (u'Name1 Name2 Name3'   , u''  , u'3.2'),
   (u'Name1 Name2 Name3'   , u'ok', u'1.1'),
   (u'Name1 Name2'         , u''  , u'1.1'),
   (u'Name1 Name2 Name3'   , u''  , u'4.12'),
   (u'Othername California', u'ok', u'2.1')
]
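
To get from these tuples to the single-line form asked for in the question (for example "Othername California ok 2.1"), the non-empty groups can be joined with spaces. The following is a minimal Python 3 sketch under that assumption; the helper name summarize and the variable text are made up for illustration and do not appear in the original answer.

```python
import re

regex = re.compile(r"(.+)\s+\(\d+\).+?(?:\(([^)]{2,})\)\s+(?={))?\{.+\(#(\d+\.\d+)\)\}")

# The five test lines used above.
text = """\
Name1 Name2 Name3 (2000) {Education (#3.2)}
Name1 Name2 Name3 (2000) (ok) {edu (#1.1)}
Name1 Name2 (2002) {edu (#1.1)}
Name1 Name2 Name3 (2000) (V) {variation (#4.12)}
Othername California (2000) (T) (S) (ok) {state (#2.1)}
"""


def summarize(text):
    """Return one 'name [tag] number' string per matching line (hypothetical helper)."""
    results = []
    for groups in regex.findall(text):
        # findall yields '' for the optional tag group when it did not match;
        # filtering on truthiness drops it.
        results.append(" ".join(g for g in groups if g))
    return results


for row in summarize(text):
    print(row)
# Name1 Name2 Name3 3.2
# Name1 Name2 Name3 ok 1.1
# Name1 Name2 1.1
# Name1 Name2 Name3 4.12
# Othername California ok 2.1
```

Note that the second group captures any parenthesized text of two or more characters that sits directly before the brace, not only "ok". If only "ok" should ever be kept, replacing [^)]{2,} with a literal ok would be one option.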
