How to find URLs that are not inside any HTML tag and turn them into hyperlinks?


Problem description

My problem is that the same content contains iframes, image tags and so on. Each of these already has a regex pass that converts it into the correct format.

The last thing left is the plain URL. I need a regex that finds all links that are simply links and are not inside an iframe, img or any other tag. The tags in this case are regular HTML tags, not BBCode.

Currently I have this code as the last pass of the content rendering. But it also reacts to everything produced by the earlier passes (the iframe and img rendering), so it swaps out the URLs there as well.

$output = preg_replace(array(
    '%\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s'
), array(
    'test'
), $output);

My content looks something like this:

# dont want these to be touched
<iframe width="640" height="360" src="http://somedomain.com/but-still-its-a-link-to-somewhere/" frameborder="0"></iframe>
<img src="http://someotherdomain.com/here-is-a-img-url.jpg" border="0" />

# and only these converted
http://google.com
http://www.google.com
https://www2.google.com<br />
www.google.com

As you can see, there might also be something directly after the link. After a full day of trying to get regexes to work, that last <br /> has been a nightmare for me.

Recommended answer

Description

This solution matches URLs that are not inside a tag (for example, in an attribute value) and replaces them with something new.

The regular expression matches both the text you want to skip (whole tags) and the text you want to replace (bare URLs). preg_replace_callback then runs a callback for each match, which checks whether capture group 1 is populated (this is the desired text). If it is, the callback returns the replacement; otherwise it returns the matched, unwanted text unchanged.
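The skip-or-replace idea described above can be sketched on a deliberately simplified pattern. This is only an illustration: the pattern, the example.com-style hosts and the square-bracket replacement are assumptions made for this sketch, not the expression used in the answer.

```php
<?php
// Simplified illustration: the first alternative consumes a whole tag,
// the second alternative captures a bare http(s) URL in group 1.
$pattern = '~<[^>]*>|(https?://[^\s<>]+)~i';

$input = '<img src="http://a.example/x.png" /> see http://b.example/page<br />';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // Group 1 is only present when the URL alternative matched.
        if (isset($m[1]) && $m[1] !== '') {
            return '[' . $m[1] . ']';
        }
        // Otherwise a tag was matched: hand it back unchanged.
        return $m[0];
    },
    $input
);

echo $result, "\n";
```

Because the tag alternative swallows the entire img tag in one match, the URL inside its src attribute is never offered to the second alternative.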

I used your URL-matching regex with some minor modifications, such as converting the unused capture groups (...) into non-capture groups (?:...). This avoids storing captures that are never read and makes the expression easier to modify.
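As a small illustration of how the two kinds of group differ, the sketch below runs two toy patterns (assumed for this example, not taken from the answer) against the same input and prints the resulting match arrays.

```php
<?php
$subject = 'www.example.com';

// Capturing prefix group: the prefix is stored at index 1.
preg_match('~(www[.]|https?://)\S+~', $subject, $withCapture);

// Non-capturing prefix group: only the outer group gets a number,
// so index 1 holds the whole URL.
preg_match('~((?:www[.]|https?://)\S+)~', $subject, $withoutCapture);

print_r($withCapture);
print_r($withoutCapture);
```

With the non-capturing form, the only numbered group is the one the callback actually reads.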

Raw expression: <(?:[^'">=]*|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>|((?:[\w-]+:\/\/?|www[.])[^\s()<>]+(?:\([\w\d]+\)|(?:[^[:punct:]\s]|\/)))
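For readers who find the raw expression hard to scan, here is the same pattern laid out with PCRE's x (extended) flag, which ignores whitespace and treats # as the start of a comment. The ~ delimiter and the nowdoc layout are choices made for this sketch; the comments are my reading of each part, not the original author's notes.

```php
<?php
// "~" is used as the delimiter so "/" needs no escaping inside the pattern.
$regex = <<<'REGEX'
~
  <                            # alternative 1: a whole tag, to be skipped
  (?:
      [^'">=]*                 #   tag and attribute names, whitespace
    | ='[^']*'                 #   single-quoted attribute value
    | ="[^"]*"                 #   double-quoted attribute value
    | =[^'"][^\s>]*            #   unquoted attribute value
  )*
  >
|
  (                            # alternative 2: a bare URL, captured as group 1
    (?:[\w-]+://?|www[.])      #   scheme such as http:// or a www. prefix
    [^\s()<>]+                 #   body of the URL
    (?:
        \([\w\d]+\)            #   may end with a parenthesised word
      | (?:[^[:punct:]\s]|/)   #   or with a non-punctuation character or slash
    )
  )
~xi
REGEX;

// Collect only the matches where group 1 (the bare URL) took part.
preg_match_all($regex, 'x <img src="http://a.example/i.jpg" /> www.google.com<br />', $m);
print_r(array_values(array_filter($m[1])));
```

The last alternative is what keeps trailing punctuation, or a tag such as <br />, out of the captured URL: the match has to end on a slash or a non-punctuation character.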

Code

<?php

$string = '# dont want these to be touched
<iframe width="640" height="360" src="http://somedomain.com/but-still-its-a-link-to-somewhere/" frameborder="0"></iframe>
<img src="http://someotherdomain.com/here-is-a-img-url.jpg" border="0" />

# and only these converted
http://google.com
http://www.google.com
https://www2.google.com<br />
www.google.com';


$regex = '/<(?:[^\'">=]*|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*>|((?:[\w-]+:\/\/?|www[.])[^\s()<>]+(?:\([\w\d]+\)|(?:[^[:punct:]\s]|\/)))/ims';

$output = preg_replace_callback(
    $regex,
    function ($matches) {
        // Group 1 is only set when the URL alternative matched.
        if (array_key_exists(1, $matches)) {
            return '<a href="' . $matches[1] . '">' . $matches[1] . '</a>';
        }
        // Otherwise a whole tag was matched: return it unchanged.
        return $matches[0];
    },
    $string
);
echo $output;

Output

# dont want these to be touched
<iframe width="640" height="360" src="http://somedomain.com/but-still-its-a-link-to-somewhere/" frameborder="0"></iframe>
<img src="http://someotherdomain.com/here-is-a-img-url.jpg" border="0" />

# and only these converted
<a href="http://google.com">http://google.com</a>
<a href="http://www.google.com">http://www.google.com</a>
<a href="https://www2.google.com">https://www2.google.com</a><br />
<a href="www.google.com">www.google.com</a>
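Note that the last link in the output has href="www.google.com", which a browser would treat as a relative path. One possible refinement, offered only as a sketch, wraps the same regex in a helper that prepends a scheme to www. links and escapes the URL before output. The function name linkify and the choice of http:// as the default scheme are assumptions of this example, not part of the original answer.

```php
<?php
/**
 * Hypothetical helper (not part of the original answer): wraps bare URLs in
 * anchors, adding a scheme to "www." links and escaping the output.
 */
function linkify(string $html): string
{
    $regex = '/<(?:[^\'">=]*|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*>|((?:[\w-]+:\/\/?|www[.])[^\s()<>]+(?:\([\w\d]+\)|(?:[^[:punct:]\s]|\/)))/i';

    $result = preg_replace_callback(
        $regex,
        function (array $m): string {
            if (!isset($m[1])) {
                return $m[0]; // a tag: leave it untouched
            }
            $url = $m[1];
            // Assumption: scheme-less "www." links should default to http://.
            $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;

            return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
                . htmlspecialchars($url, ENT_QUOTES) . '</a>';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

echo linkify("www.google.com\nhttps://www2.google.com<br />"), "\n";
```

Like the answer's code, this sketch only skips the tags themselves, so a URL that appears as the text of an existing a element would still be wrapped a second time.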
