Regex replace non-word except dash
Problem description
I have a regex pattern (\W|_)[^-]
that doesn't work for h_e.l_l.o - w_o.r_d
(the replacement string is " ").
It returns something like this:
h w
I hope to see at least something like this:
h e l l o - w o r d
How can I replace all non-word characters and _, excluding the - symbol?
Recommended answer
To match any non-word char except a dash (or hyphen), you may use
[^\w-]
However, this regular expression does not match _, because the underscore is itself a word character.
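To see the underscore problem concretely, here is a minimal sketch. It assumes Python's re module, which the question does not specify; the sample string is the one from the question.

```python
import re

text = "h_e.l_l.o - w_o.r_d"

# [^\w-] matches anything that is neither a word character nor a hyphen.
# The underscore counts as a word character, so it is left untouched.
result = re.sub(r"[^\w-]", " ", text)
print(result)  # h_e l_l o - w_o r_d
```

The dots are replaced, but every underscore survives, which is why a different character class is needed.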
You need a negated character class that matches any char other than letters, digits and hyphens:
/[^-a-zA-Z0-9]+/
or (with a case-insensitive modifier):
/[^-a-z0-9]+/i
Note that the - is placed at the start of the character class, so it does not require escaping.
The plus at the end matches a whole run of unwanted characters at a stretch, so each run is replaced or removed in one go.
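The negated class and the effect of the plus can be sketched as follows. This again assumes Python's re module; the second sample string is a made-up input chosen so that runs of unwanted characters are longer than one character.

```python
import re

text = "h_e.l_l.o - w_o.r_d"

# Everything that is not a hyphen, an ASCII letter, or a digit becomes a space.
single = re.sub(r"[^-a-zA-Z0-9]", " ", text)
print(single)  # h e l l o - w o r d

# Hypothetical input with longer runs of unwanted characters.
runs = "a..b__c - d"

# Without +, every unwanted character is replaced separately.
per_char = re.sub(r"[^-a-zA-Z0-9]", " ", runs)
print(per_char)  # a  b  c - d

# With +, each run is replaced by a single space.
per_run = re.sub(r"[^-a-zA-Z0-9]+", " ", runs)
print(per_run)  # a b c - d

# Same idea with a case-insensitive flag instead of listing A-Z.
ci = re.sub(r"[^-a-z0-9]+", " ", runs, flags=re.IGNORECASE)
print(ci)  # a b c - d
```

On the string from the question the plus makes no visible difference, because every run of unwanted characters there is exactly one character long.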
If you want to make your pattern Unicode aware (in some regex flavors, shorthand character classes such as \w also match their Unicode counterparts, sometimes only when a particular flag is set), you may use
/[^\w-]|_/
Use /(?:[^\w-]|_)+/
to grab the whole chunk of these chars.
Here, [^\w-]
matches any char that is neither a word char (letter, digit, or underscore) nor a hyphen, and the second alternative, _,
matches underscores.
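A short sketch of the difference between the two approaches on non-ASCII input. It assumes Python 3, where \w matches Unicode letters and digits by default for str patterns; other flavors may need a flag or behave differently. The accented sample string is invented for illustration.

```python
import re

# Hypothetical sample: the question's string with two accented letters
# (U+00E9 is e-acute, U+00F6 is o-umlaut).
text = "h_\u00e9.l_l.\u00f6 - w_o.r_d"

# Unicode-aware: \w covers the accented letters, the alternative _ catches underscores.
unicode_aware = re.sub(r"(?:[^\w-]|_)+", " ", text)
print(unicode_aware)  # h é l l ö - w o r d

# ASCII-only class: the accented letters are treated as unwanted and dropped.
ascii_only = re.sub(r"[^-a-zA-Z0-9]+", " ", text)
print(ascii_only)  # h l l - w o r d
```

Which variant is preferable depends on whether non-ASCII letters should be kept in the output.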