preg_replace words not inside a URL


Question

I am using preg_replace to replace a list of words in a text that may contain URLs. The problem is that I don't want to replace these words when they are part of a URL.

These examples should be ignored:

foo.com

foo.com/foo

foo.com/foo/foo

As a basic example (written in PHP), I tried to ignore strings containing .com plus optional slashes and characters by using a negative lookahead assertion, but without success:

preg_replace("/(\b)foo(\b)(?!(\w+\.\w+)*(\.com)([\.\/]\w+)*)/", "$1bar$2", $text);

This call works, but it only ignores the word directly before .com. Any help would be really appreciated.

Recommended answer

In cases like these, it's much easier to think of the problem inverted. You want to match words that are not in a URL. Instead, think of it as wanting to match the URL and the words. So your expression would look like this: url_match_here|(?:my|words|here). This lets the regex engine consume the URL first and only then try to match those words, so you never have to worry about matching the words inside a URL. If you want to maintain the text structure, you can use preg_replace with the expression (url_match_here)|(?:my|words|here) and replace with \1, which puts a matched URL back unchanged.
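The idea above can be sketched in PHP as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name replace_words_outside_urls and the loose URL pattern are hypothetical and not part of the original answer. It also swaps plain preg_replace for preg_replace_callback, because a replacement of \1 alone turns a matched word into an empty string, while a callback can return the URL unchanged and substitute a different word otherwise.

```php
<?php
// Hypothetical helper: replaces whole words only where they occur outside a URL.
// $replacements maps each lowercase word to its substitute, e.g. ['foo' => 'bar'].
function replace_words_outside_urls(string $text, array $replacements): string
{
    // Deliberately loose URL pattern (an assumption, not a full URL grammar):
    // optional scheme, dot-separated host labels, a 2+ letter TLD, optional path.
    $url = '(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}(?:/[^\s]*)?';

    // Escape each word so it is matched literally.
    $words = implode('|', array_map(
        fn (string $w): string => preg_quote($w, '~'),
        array_keys($replacements)
    ));

    // The URL alternative comes first, so the engine consumes a whole URL
    // before it gets a chance to try the word alternative inside it.
    $pattern = '~(' . $url . ')|\b(?:' . $words . ')\b~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($replacements): string {
            // Group 1 is set and non-empty only when the URL alternative matched.
            if (isset($m[1]) && $m[1] !== '') {
                return $m[1]; // put the URL back unchanged
            }
            // Otherwise a word matched: look up its substitute.
            return $replacements[strtolower($m[0])] ?? $m[0];
        },
        $text
    ) ?? $text; // fall back to the input if the regex engine reports an error
}

echo replace_words_outside_urls('foo and foo.com/foo/foo but foo', ['foo' => 'bar']);
// bar and foo.com/foo/foo but bar
```

The URL pattern here is intentionally simple and treats any dotted token ending in letters as a URL; a stricter pattern can be dropped in without changing the rest of the function.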

I hope this helps.

Good luck.
