RegEx: \w - "_" + "-" in UTF-8

Question

I need a regular expression that matches UTF-8 letters and digits and the dash sign (-), but does not match underscores (_). I tried these silly attempts without success:

  • ([\w-^_])+
  • ([\w^_]-?)+
  • (\w[^_]-?)+

The \w is shorthand for [A-Za-z0-9_], but it also matches UTF-8 characters if I have the u modifier set.

Can anyone help me?

Recommended answer

Try this:

(?:[\w\-](?<!_))+

It does a simple match on anything that \w (or a dash) matches, and then applies a zero-width lookbehind that ensures the character that was just matched is not an underscore.
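The lookbehind approach can be tried out with a minimal sketch. It is written in Python rather than PHP, which is an assumption made purely for illustration: Python's re module treats \w as Unicode-aware for str patterns, roughly like PCRE with the u modifier. The names PATTERN and find_runs are hypothetical.

```python
import re

# Assumption: Python's re stands in for PHP's PCRE here.
# [\w\-] consumes one word character or dash; the lookbehind (?<!_)
# then rejects that character if it turned out to be an underscore.
PATTERN = re.compile(r"(?:[\w\-](?<!_))+")


def find_runs(text):
    """Return every maximal run of letters, digits, and dashes in text."""
    return PATTERN.findall(text)


print(find_runs("héllo-wörld_foo bar42"))
# ['héllo-wörld', 'foo', 'bar42']
```

The underscore ends the first run, and the space ends the second, so the input splits into three runs.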

Otherwise you could pick this one:

(?:[^_\W]|-)+

which is a more set-based approach (note the uppercase W): [^_\W] matches any character that is neither an underscore nor a non-word character, i.e. \w minus the underscore.
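A short sketch of the set-based pattern used as a whole-string validator. It again assumes Python's re module rather than PHP's PCRE; SET_BASED and is_valid_token are hypothetical names.

```python
import re

# Assumption: Python's re stands in for PHP's PCRE here.
# The negated class excludes "_" and everything \W matches,
# which leaves exactly the word characters other than "_".
SET_BASED = re.compile(r"(?:[^_\W]|-)+")


def is_valid_token(text):
    """True if text consists only of letters, digits, and dashes."""
    return SET_BASED.fullmatch(text) is not None


for sample in ("naïve-café-2", "snake_case", "日本語-1"):
    print(sample, is_valid_token(sample))
```

Because no lookaround is involved, this form should also carry over to regex engines that lack lookbehind support.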

OK, I had a lot of fun with Unicode in PHP's flavor of PCRE :D Peekaboo says there is a simple solution available:

[\p{L}\p{N}\-]+

\p{L} matches any Unicode character that qualifies as a Letter (note: not a word character, thus no underscores), while \p{N} matches anything that counts as a number (including Roman numerals and more exotic things).
\- is just an escaped dash. Although not strictly necessary, I tend to make it a point to escape dashes in character classes. Note that there are dozens of different dashes in Unicode, which gives rise to the following version:

[\p{L}\p{N}\p{Pd}]+

Where "Pd" is Dash Punctuation, which includes, but is not limited to, our minus-dash-thingy. (Note, again, no underscore here.)
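Python's built-in re module does not accept \p{...} property classes, so the sketch below emulates [\p{L}\p{N}\p{Pd}]+ with the standard unicodedata module instead. It only illustrates which characters the three categories cover and is not the PHP code from the answer; both function names are hypothetical.

```python
import unicodedata


def is_letter_number_or_dash(ch):
    """Emulate [\\p{L}\\p{N}\\p{Pd}] for a single character."""
    category = unicodedata.category(ch)  # two-letter code such as "Lu", "Nd", "Pd"
    # \p{L} and \p{N} cover every category starting with L or N;
    # \p{Pd} is the single category "Pd".
    return category[0] in ("L", "N") or category == "Pd"


def matches_property_class(text):
    """True if text is non-empty and every character passes the check."""
    return len(text) > 0 and all(is_letter_number_or_dash(ch) for ch in text)


print(matches_property_class("abc-123"))    # ASCII hyphen-minus is in Pd
print(matches_property_class("a\u2013b"))   # EN DASH is in Pd
print(matches_property_class("\u2167"))     # ROMAN NUMERAL EIGHT is in Nl
print(matches_property_class("a_b"))        # underscore is in Pc, so rejected
```

One consequence worth knowing: U+2212 MINUS SIGN belongs to the math-symbol category Sm, not Pd, so it is rejected even though it looks like a dash.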
