Regex to match all valid links

Question

Regarding http://stackoverflow.uservoice.com/pages/general/suggestions/103227-parser-does-not-match-all-valid-urls: is this regex adequate, or does it need to be refined? If it does, how?

\b(?P<link>(?:.*?://)[\w\-\_\.\@\:\/\?\#\=]*)\b

Recommended answer

Even though the question is vague, I'll attempt to respond with possible solutions.

Possible intention 1: to match any URLs in a given file (for replacement):

/^([^:]+):\/\/([-\w._]+)(\/[-\w._\/]*)?(?:\?(.+))?$/ig

The above should match most common URLs of the form protocol://hostname/path?query_string. A hostname carrying a port or credentials (e.g. host:8080) will not match. The captured groups are:

0 => entire match
1 => protocol (e.g. http, ftp, git, ...)
2 => hostname (e.g. www.stackoverflow.com)
3 => requested_file_path (e.g. /images/prod/1/4/success.gif)
4 => query_string (e.g. param=1&param2=2&param3=3)
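To make the group numbering concrete, here is a minimal Python sketch of the same idea. It is an adaptation, not the answer's exact pattern: the path group is assumed to accept any run of word characters, dots, dashes, and slashes, and the helper name split_url is made up for illustration.

```python
import re

# Adapted from the pattern above (assumption: the path group accepts
# any run of word characters, dots, dashes, and slashes).
URL_RE = re.compile(
    r"^([^:]+)://([-\w.]+)(/[-\w./]*)?(?:\?(.+))?$",
    re.IGNORECASE,
)


def split_url(url):
    """Return the captured groups as a dict, or None if the URL does not match."""
    m = URL_RE.match(url)
    if m is None:
        return None
    protocol, hostname, path, query_string = m.groups()
    return {
        "protocol": protocol,
        "hostname": hostname,
        "requested_file_path": path,   # None when the URL has no path
        "query_string": query_string,  # None when the URL has no "?"
    }


parts = split_url(
    "http://www.stackoverflow.com/images/prod/1/4/success.gif"
    "?param=1&param2=2&param3=3"
)
print(parts)
```

Optional groups that do not take part in the match come back as None, so a caller should check for that before using the path or the query string.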

Possible intention 2: to get details about the current request URL:

To get details about the URL, such as the protocol, hostname, requested file path, and query string, you're better off using the methods your language or framework already provides. In PHP you can get all of the above from the $_SERVER superglobal:

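$protocol = $_SERVER['SERVER_PROTOCOL']; // protocol name and version, e.g. HTTP/1.0
$host = $_SERVER['HTTP_HOST']; // e.g. www.stackoverflow.com
$path_to_file = dirname($_SERVER['SCRIPT_NAME']); // directory part of the script path
$file = basename($_SERVER['SCRIPT_NAME']); // file name of the script
$query_string = $_SERVER['QUERY_STRING']; // everything after the "?"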
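The same "let the language do it" approach works outside PHP. As a rough Python analogue (an illustration only: it parses a URL string rather than reading the current request, and the sample URL is assembled from the examples above), the standard library's urllib.parse splits a URL without any hand-written regex:

```python
from urllib.parse import urlsplit, parse_qs

# Sample URL assembled from the examples above (illustrative only).
url = (
    "http://www.stackoverflow.com/images/prod/1/4/success.gif"
    "?param=1&param2=2&param3=3"
)

parts = urlsplit(url)

scheme = parts.scheme        # protocol part, before "://"
host = parts.hostname        # hostname, lowercased, without any port
path = parts.path            # requested file path
query_string = parts.query   # raw query string, without the leading "?"

# parse_qs turns the query string into a dict of lists of values.
params = parse_qs(query_string)

print(scheme, host, path, query_string, params)
```

Unlike the regex, urlsplit also exposes the port, credentials, and fragment as separate attributes, which is one reason to prefer a parser when the goal is to inspect a URL, not to find one in free text.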

Hope this helps.