RegEx to exclude match if a certain word is present, but not another partial word

Problem description

I have the keyword "cum", which our firewall uses to block adult sites. The problem is that it works a little too well: it also blocks any URL containing the word "document".

The firewall accepts regex strings, and I tried this:

^.*(?!document)cum.*$

But it still matches "document". I have a feeling I should be using a pipe | but I don't get it.

I want a match anywhere *cum* is found in the URL (or domain name), but NOT if the word is "document" or "documents".

Is this possible? As I understand it, a word boundary doesn't work here, because the word "cum" won't necessarily be separated by whitespace when it's in a URL, and definitely not when it's in a domain name.

Here's another way to put it:

Allow "examplesearchdocuments.com"
Allow "examplemydocuments.com"
Allow "documentexample.com"
Allow "example.com/somedocuments"
Don't allow "funnycumsiteexample.com"
Don't allow "cumallovereverythingexample.com"
Don't allow "exampleseemycum.com"

where "cum" is the bad word to match. Sorry if any of these examples are real sites; I don't know how else to convey this.

Recommended answer

Based on the comments, I was wrong.

If you use a lookbehind inside your lookahead, you can match "cum" only when it is not part of the word "document".

cum(?!(?<=docum)ent)
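As a quick check, here is a minimal Python sketch that runs this pattern against the URLs listed in the question. The helper name `is_blocked` and the two lists are illustrative only; the firewall's own regex engine may behave differently, so this assumes an engine that supports lookbehind (as Python's `re` module does for fixed-width lookbehind).

```python
import re

# Pattern from the answer: match "cum" unless it is the "cum" inside "document".
# After "cum" is consumed, the negative lookahead rejects the match only when
# the preceding five characters are "docum" AND the next three are "ent".
BLOCK_PATTERN = re.compile(r"cum(?!(?<=docum)ent)")


def is_blocked(url: str) -> bool:
    """Hypothetical helper: True if the URL contains "cum" outside "document"."""
    return BLOCK_PATTERN.search(url) is not None


# Example URLs taken from the question.
should_allow = [
    "examplesearchdocuments.com",
    "examplemydocuments.com",
    "documentexample.com",
    "example.com/somedocuments",
]
should_block = [
    "funnycumsiteexample.com",
    "cumallovereverythingexample.com",
    "exampleseemycum.com",
]

for url in should_allow + should_block:
    verdict = "blocked" if is_blocked(url) else "allowed"
    print(f"{url}: {verdict}")
```

Note that the pattern is case-sensitive as written; if the firewall sees mixed-case URLs, a case-insensitive flag would probably be needed, depending on what the firewall supports.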

Here is some reading on lookaround: http://www.regular-expressions.info/lookaround.html
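For contrast, the pattern tried in the question fails because its lookahead is checked at the position where "cum" begins. The text at that position starts with "cum", so it can never start with "document", and the negative lookahead always succeeds. The short Python sketch below illustrates this; the variable names are made up for the example.

```python
import re

# Pattern from the question: the lookahead sits right before "cum", so it asks
# "does 'document' start here?" at a position where the text starts with "cum".
# That is never true, so the lookahead never prevents a match.
attempt = re.compile(r"^.*(?!document)cum.*$")

# Pattern from the answer: the check runs after "cum" and looks both ways.
answer = re.compile(r"cum(?!(?<=docum)ent)")

url = "examplemydocuments.com"

attempt_matches = attempt.search(url) is not None
answer_matches = answer.search(url) is not None

print(attempt_matches)  # True: the URL would still be blocked
print(answer_matches)   # False: the URL is allowed
```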

Here it is run against a large number of tests:

http://www.rubular.com/r/b5iZrn6Cjz
