Exclude characters from a character class

Question

Is there a simple way to match all characters in a class except a certain set of them? For example, in a language where I can use \w to match the set of all Unicode word characters, is there a way to exclude just a character like an underscore "_" from that match?

The only idea that came to mind was to use a negative lookahead/lookbehind around each character, but that seems more complex than necessary when I effectively just want to match a character against a positive match AND a negative match. For example, if & were an AND operator, I could do this:

^(\w&[^_])+$

Recommended answer

It really depends on your regex flavor.

.NET

... provides only one simple character class set operation: subtraction. This is enough for your example, so you can simply use

[\w-[_]]

If a - is followed by a nested character class, it's subtracted. Simple as that...

Java

... provides a much richer set of character class set operations. In particular you can get the intersection of two sets like [[abc]&&[cde]] (which would give c in this case). Intersection and negation together give you subtraction:

[\w&&[^_]]

Perl

... supports set operations on extended character classes as an experimental feature (available since Perl 5.18). In particular, you can directly subtract arbitrary character classes:

(?[ \w - [_] ])

All other flavors

... (that support lookaheads) allow you to mimic the subtraction by using a negative lookahead:

(?!_)\w

This first checks that the next character is not a _ and then matches any \w (which can't be _ due to the negative lookahead).
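As a rough illustration of the lookahead approach, here is a minimal Python sketch. It assumes Python's standard `re` module, which supports lookaheads and treats \w as Unicode-aware for `str` patterns. The names `WORD_NO_UNDERSCORE` and `is_word_without_underscore` are made up for this example and do not come from the answer.

```python
import re

# Emulate "\w minus _": the lookahead (?!_) is checked before \w consumes
# each character, so an underscore can never be matched.
WORD_NO_UNDERSCORE = re.compile(r"(?:(?!_)\w)+")


def is_word_without_underscore(text: str) -> bool:
    """Return True if text is non-empty and made only of word characters other than '_'."""
    return WORD_NO_UNDERSCORE.fullmatch(text) is not None


print(is_word_without_underscore("héllo123"))     # True
print(is_word_without_underscore("snake_case"))   # False
print(WORD_NO_UNDERSCORE.findall("foo_bar baz"))  # ['foo', 'bar', 'baz']
```

Using `fullmatch` here plays the role of the ^...$ anchors in the pattern from the question.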

Note that each of these approaches is completely general, in that you can subtract one arbitrarily complex character class from another.
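To make the generality concrete for the lookahead variant, the sketch below wraps the idea in a small helper. The `subtract` function is hypothetical (not part of any library), and it assumes both arguments are patterns that match exactly one character, such as a bracketed class or a shorthand class.

```python
import re


def subtract(base: str, excluded: str) -> str:
    """Build a pattern matching one character that is in `base` but not in `excluded`.

    Both arguments are assumed to be single-character patterns,
    such as a bracketed character class or a shorthand class.
    """
    return f"(?:(?!{excluded}){base})"


# Lowercase ASCII letters minus vowels.
consonant = subtract(r"[a-z]", r"[aeiou]")
print(re.findall(consonant + "+", "education"))  # ['d', 'c', 't', 'n']

# Word characters minus digits and the underscore.
letter = subtract(r"\w", r"[\d_]")
print(re.findall(letter + "+", "r2d2_unit7"))    # ['r', 'd', 'unit']
```

Building the pattern as a string keeps the helper independent of any particular class, at the cost of trusting the caller to pass single-character patterns.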
