Multiple preg_replace RegEx for different URLs
Question
I have a string like this:
Blablabla http://www.soundcloud.com/artist/track
www.facebook.com/page is my page
Try www.youtube.com/watch?v=1234567 for my video
Check http://www.somesite.com/bla.
I would like to replace URLs in a user-generated post with different WordPress shortcodes: automatically swap video and Soundcloud URLs for the matching widgets, and turn all other URLs and emails into regular links, something like this (simplified):
Blablabla [soundcloud]www.soundcloud.com/artist/track[/soundcloud]
[facebook]www.facebook.com/page[/facebook] is my page
Try [youtube]www.youtube.com/watch?v=1234567[/youtube] for my video
Check [url]www.somesite.com/bla[/url].
So I think I need to run several preg_replace actions on the string.
After replacing the Soundcloud, Facebook and Youtube URLs with WordPress shortcodes, I need to run a preg_replace on the remaining URLs such as http://www.somesite.com/bla. But since the Facebook/Soundcloud/Youtube URLs are still present in the string (now inside the shortcodes), they will be replaced again, giving...
[youtube][url]www.youtube.com/watch?v=1234567[/url][/youtube]
I do not want this behaviour. It should be like this:
[url]www.youtube.com/watch?v=1234567[/url]
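The double replacement described above can be reproduced with a small sketch. The two-pass structure and the YouTube-specific pattern are assumptions made for illustration, not the asker's actual code; the second pattern is the generic one from the question with its periods escaped.

```php
<?php
// Hypothetical two-pass approach, shown only to reproduce the nesting problem.
$post = "Try www.youtube.com/watch?v=1234567 for my video";

// Pass 1: wrap YouTube URLs in a [youtube] shortcode (assumed pattern).
$post = preg_replace(
    '~(?:https?://)?www\.youtube\.com/[^\s<]+~',
    '[youtube]$0[/youtube]',
    $post
);

// Pass 2: the generic URL pattern, which knows nothing about pass 1.
$post = preg_replace(
    '~((https?://)(www\.)|(https?://)|(www\.))[^ <]+~',
    '[url]$0[/url]',
    $post
);

echo $post, "\n";
```

In this sketch the URL that is already inside [youtube] gets wrapped a second time, and because `[` is not excluded by the generic pattern, the closing [/youtube] tag is pulled inside the [url] shortcode as well.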
This is my basic RegEx:
((https?://)(www.)|(https?://)|(www.))[^ <]+
I need to replace URLs beginning with http, https and www
Does anyone have a solution?
Regards,
Mat
Recommended answer
I'd recommend you look into the preg_replace_callback function instead.
Rather than trying to match a different subset of URLs for each site, just match them all! Then, in code, check a specific capturing group to determine the base of the URL.
So, in PHP code, if the URL's domain is facebook.com, replace the URL with the facebook shortcode, and so on.
Here's your regex, slightly modified to capture the domain. Remember to escape your literal periods. For the domain, this captures everything up to the first <, /, ? or whitespace character; for the rest of the URL, it continues until the first < or whitespace. You might have to modify this if you find URLs that it doesn't work for.
((https?://)(www\.)|(https?://)|(www\.))([^</\?\s]+)[^<\s]*
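A quick way to see which capture group ends up holding the domain is to run the pattern through preg_match on a few URLs. This is only a sketch: the sample URLs and the `~` delimiter (chosen so the slashes need no escaping) are assumptions for illustration.

```php
<?php
// The pattern above, delimited with ~ so that / can stay unescaped.
$pattern = '~((https?://)(www\.)|(https?://)|(www\.))([^</\?\s]+)[^<\s]*~';

// Sample URLs (assumed for illustration).
$samples = [
    'http://www.soundcloud.com/artist/track',
    'www.facebook.com/page',
    'http://somesite.com/bla',
];

$domains = [];
foreach ($samples as $url) {
    if (preg_match($pattern, $url, $m)) {
        // $m[0] is the whole URL, $m[6] is the domain without scheme or www.
        $domains[] = $m[6];
        echo $m[6], ' <- ', $m[0], "\n";
    }
}
```

Whichever prefix alternative matches (scheme plus www, scheme only, or www only), the domain lands in group 6, so the callback only has to look at that one group.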
And now some PHP code. Recall that $matches[0] will hold the full match, and $matches[6] will hold the 6th capturing group, in this case ([^</\?\s]+), the domain part.
$post = preg_replace_callback(
    '/((https?:\/\/)(www\.)|(https?:\/\/)|(www\.))([^<\/\?\s]+)[^<\s]*/',
    function ($matches) {
        // $matches[6] is the domain; $matches[0] is the full URL.
        switch ($matches[6]) {
            case 'facebook.com':
                return "[facebook]" . $matches[0] . "[/facebook]";
            case 'youtube.com':
                return "[youtube]" . $matches[0] . "[/youtube]";
            case 'soundcloud.com':
                return "[soundcloud]" . $matches[0] . "[/soundcloud]";
            default:
                // Any other URL becomes a regular link.
                return "[url]" . $matches[0] . "[/url]";
        }
    },
    $post
);
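As a usage sketch, the same single-pass idea can be run on the sample text from the question. The helper name `wrap_urls`, the lookup table, and the trailing-punctuation handling are assumptions added for illustration; they are not part of the answer above. The punctuation trimming is there because the pattern's `[^<\s]*` tail would otherwise pull the sentence-ending period of "bla." into the shortcode.

```php
<?php
// Hypothetical helper; the name and the shortcode map are assumptions.
function wrap_urls(string $post): string
{
    // Domain => shortcode name; anything else falls back to [url].
    $shortcodes = [
        'facebook.com'   => 'facebook',
        'youtube.com'    => 'youtube',
        'soundcloud.com' => 'soundcloud',
    ];

    return preg_replace_callback(
        '~((https?://)(www\.)|(https?://)|(www\.))([^</\?\s]+)[^<\s]*~',
        function (array $m) use ($shortcodes): string {
            // Keep trailing sentence punctuation outside the shortcode.
            $url    = rtrim($m[0], '.,;:!?');
            $tail   = substr($m[0], strlen($url));
            $domain = strtolower(rtrim($m[6], '.'));
            $tag    = $shortcodes[$domain] ?? 'url';
            return '[' . $tag . ']' . $url . '[/' . $tag . ']' . $tail;
        },
        $post
    );
}

$post = "Blablabla http://www.soundcloud.com/artist/track\n"
      . "www.facebook.com/page is my page\n"
      . "Try www.youtube.com/watch?v=1234567 for my video\n"
      . "Check http://www.somesite.com/bla.";

echo wrap_urls($post), "\n";
```

Because every URL is matched exactly once and the callback decides the shortcode, nothing gets wrapped twice. Note that the full match, including any http:// prefix, is kept inside the shortcode, and that subdomains such as m.youtube.com would fall through to [url] unless they are added to the table.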