Restrict one word as case sensitive and the other as case insensitive in a Python regex | (pipe)


Question


I understand what | (the pipe special character) means in a Python regex: it matches either the first or the second alternative.

For example, a|b matches either a or b.

My question: what if I want to match a case-sensitively and b case-insensitively in the example above?

For example:

import re

s = "Welcome to PuNe, Maharashtra"

result1 = re.search("punnee|MaHaRaShTrA", s)
result2 = re.search("pune|maharashtra", s)
result3 = re.search("PuNe|MaHaRaShTrA", s)
result4 = re.search("P|MaHaRaShTrA", s)

I want to search for Pune exactly as it is written in the string s above, i.e. PuNe, but I have to search for Maharashtra ignoring case. How can I search for one word case-sensitively and the other case-insensitively, so that result1, result2, result3 and result4 all return a match instead of None?

I tried:

result1 = re.search("pune|MaHaRaShTrA", s, re.IGNORECASE)

But this ignores case for both words.

How can I make one word case sensitive and the other case insensitive?

Solution

In Python 3.6 and later, you can use an inline modifier group, which applies a flag only to the part of the pattern inside the group:

>>> import re
>>> s = "Welcome to PuNe, Maharashtra"
>>> print(re.findall(r"PuNe|(?i:MaHaRaShTrA)", s))
['PuNe', 'Maharashtra']
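As a rough check against the four searches from the question, the same group can be dropped into each pattern. This is only a sketch, assuming Python 3.6 or later; the `patterns` and `results` names are introduced here for illustration and are not part of the original answer.

```python
import re

s = "Welcome to PuNe, Maharashtra"

# The first alternative stays case sensitive; only the (?i:...) group
# ignores case. These are the question's four patterns with the second
# alternative wrapped in an inline modifier group.
patterns = [
    r"punnee|(?i:MaHaRaShTrA)",
    r"pune|(?i:maharashtra)",
    r"PuNe|(?i:MaHaRaShTrA)",
    r"P|(?i:MaHaRaShTrA)",
]

results = [re.search(p, s) for p in patterns]
for p, m in zip(patterns, results):
    # Every search finds something, so m is never None here.
    print(p, "->", m.group(0))
```

The first two searches fall through to the case-insensitive alternative and find Maharashtra, while the last two match the case-sensitive text (PuNe and P) because it occurs earlier in the string.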

See the relevant Python re documentation:

(?aiLmsux-imsx:...)
   (Zero or more letters from the set 'a', 'i', 'L', 'm', 's', 'u', 'x', optionally followed by '-' followed by one or more letters from the 'i', 'm', 's', 'x'.) The letters set or remove the corresponding flags: re.A (ASCII-only matching), re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U (Unicode matching), and re.X (verbose), for the part of the expression. (The flags are described in Module Contents.)

The letters 'a', 'L' and 'u' are mutually exclusive when used as inline flags, so they can’t be combined or follow '-'. Instead, when one of them appears in an inline group, it overrides the matching mode in the enclosing group. In Unicode patterns (?a:...) switches to ASCII-only matching, and (?u:...) switches to Unicode matching (default). In byte pattern (?L:...) switches to locale depending matching, and (?a:...) switches to ASCII-only matching (default). This override is only in effect for the narrow inline group, and the original matching mode is restored outside of the group.

New in version 3.6.

Changed in version 3.7: The letters 'a', 'L' and 'u' also can be used in a group.
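The '-' form described in the quoted documentation works in the opposite direction: the whole pattern can be compiled with re.IGNORECASE and the flag switched off for one part. A minimal sketch, assuming Python 3.6 or later; the longer sample string is made up here to show both outcomes.

```python
import re

# Hypothetical sample text containing both spellings of each word.
s = "Welcome to PuNe, Maharashtra and pune, MAHARASHTRA"

# IGNORECASE applies to the whole pattern, except inside (?-i:...),
# where case sensitivity is restored for the word PuNe.
pattern = re.compile(r"(?-i:PuNe)|MaHaRaShTrA", re.IGNORECASE)

found = pattern.findall(s)
print(found)  # ['PuNe', 'Maharashtra', 'MAHARASHTRA']
```

Which form reads better is a matter of taste: (?i:...) suits a mostly case-sensitive pattern, (?-i:...) a mostly case-insensitive one.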

Unfortunately, the re module in Python versions before 3.6 does not support these groups, nor does it support switching inline modifiers on and off partway through a pattern.
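One workaround that does not depend on inline groups, and is not part of the original answer, is to spell out the case-insensitive word as a sequence of character classes such as [mM][aA]. The helper name `case_insensitive_literal` below is hypothetical, and the sketch assumes the word consists of plain ASCII letters.

```python
import re

def case_insensitive_literal(word):
    """Hypothetical helper: turn 'Ab' into '[aA][bB]'.

    Assumes ASCII letters; characters without case are escaped as-is.
    """
    parts = []
    for ch in word:
        if ch.lower() != ch.upper():
            parts.append("[" + ch.lower() + ch.upper() + "]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

s = "Welcome to PuNe, Maharashtra"

# PuNe is matched literally; Maharashtra is matched in any letter case.
pattern = "PuNe|" + case_insensitive_literal("MaHaRaShTrA")
matches = re.findall(pattern, s)
print(matches)  # ['PuNe', 'Maharashtra']
```

The generated pattern is harder to read than (?i:...), so this is mainly of interest when an older interpreter cannot be avoided.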

If you can use the PyPI regex module, you can use the same (?i:...) construct there:

import regex
s = "Welcome to PuNe, Maharashtra"
print(regex.findall(r"PuNe|(?i:MaHaRaShTrA)",s))

See the online Python demo.
