JavaScript: extract URLs from string (inc. querystring) and return array

Views: 22 Published: 2021/12/13 0:08:13 Tags: javascript jquery parsing url extract


Problem description

I know this has been asked a thousand times before (apologies), but after searching SO, Google, etc. I have yet to find a conclusive answer.

Basically, I need a JS function which, when passed a string, identifies and extracts all URLs based on a regex and returns an array of everything it found, e.g.:

function findUrls(searchText){
    var regex = ???; // the pattern this question is asking for
    var result = searchText.match(regex); // declared locally to avoid an implicit global
    return result ? result : false;
}

The function should be able to detect and return any potential URLs. I am aware of the inherent difficulties and issues with this (closing parentheses etc.), so I have a feeling the process needs to be:

Split the string (searchText) into distinct sections, each bounded on either side by nothing (the start or end of the string), a space, or a carriage return, resulting in distinct content chunks (i.e. do a split).

For each content chunk that results from the split, see whether it fits the logic for a URL of any construction, namely: does it contain a period immediately followed by text (the one constant rule for qualifying a potential URL)?

The regex should check whether the period is immediately followed by other text of the type allowable for a TLD, directory structure and query string, and preceded by text of the type allowable for a URL.
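The three steps above could be sketched roughly as follows. This is only an illustration of the split-then-test idea, not a complete URL grammar: the function name `findUrlCandidates`, the punctuation it strips, and the pattern itself are all assumptions introduced here, and the pattern is deliberately permissive, so false positives are expected.

```javascript
// Hypothetical helper: the name and the pattern are illustrative assumptions,
// not something taken from the question.
function findUrlCandidates(searchText) {
    // Optional scheme, then at least two dot-separated labels, an optional port,
    // and an optional path / query string / fragment running to the end of the chunk.
    var candidate = /^(?:[a-z][a-z0-9+.-]*:\/\/)?[a-z0-9-]+(?:\.[a-z0-9-]+)+(?::\d+)?(?:[\/?#]\S*)?$/i;

    return searchText
        // Step 1: split into chunks on spaces, tabs and line breaks.
        .split(/\s+/)
        // Strip wrapping punctuation such as "(", ")", quotes and sentence-ending marks.
        .map(function (chunk) {
            return chunk
                .replace(/^[("'<\[]+/, "")
                .replace(/[)"'>\].,;:!?]+$/, "");
        })
        // Steps 2 and 3: keep chunks that contain a period followed by allowable text.
        .filter(function (chunk) {
            return candidate.test(chunk);
        });
}

var found = findUrlCandidates(
    "Visit http://www.google.com, google.com or (www.google.com/search?q=js&hl=en). will.i.am said hi."
);
console.log(found);
```

Because trailing punctuation is stripped before testing, a URL that genuinely ends in one of those characters would be truncated; that trade-off is a choice made for this sketch.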

I am aware that false positives may result; however, any returned values will then be checked with a call to the URL itself, so this can be ignored. The other functions I have found often don't return the URL's query string, if present.

From a block of text, the function should thus be able to return any type of URL, even if it means identifying will.i.am as a valid one!

e.g. http://www.google.com, google.com, www.google.com, http://google.com, ftp.google.com, https:// etc., and any derivation thereof with a query string, should be returned.

Many thanks, and apologies again if this exists elsewhere on SO; my searches haven't returned it.

Recommended answer

I just use URI.js; it makes this easy.

var source = "Hello www.example.com,\n"
    + "http://google.com is a search engine, like http://www.bing.com\n"
    + "http://exämple.org/foo.html?baz=la#bumm is an IDN URL,\n"
    + "http://123.123.123.123/foo.html is IPv4 and "
    + "http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html is IPv6.\n"
    + "links can also be in parens (http://example.org) "
    + "or quotes »http://example.org«.";

var result = URI.withinString(source, function(url) {
    return "<a>" + url + "</a>";
});

/* result is:
Hello <a>www.example.com</a>,
<a>http://google.com</a> is a search engine, like <a>http://www.bing.com</a>
<a>http://exämple.org/foo.html?baz=la#bumm</a> is an IDN URL,
<a>http://123.123.123.123/foo.html</a> is IPv4 and <a>http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html</a> is IPv6.
links can also be in parens (<a>http://example.org</a>) or quotes »<a>http://example.org</a>«.
*/

https://github.com/medialize/URI.js
http://medialize.github.io/URI.js/
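The snippet above returns a rewritten string, whereas the question asks for an array. One possible way to bridge that gap is to push each match into an array from inside the callback. The sketch below assumes a `withinString`-style function is passed in (so it can run without the library loaded); the name `collectUrls` and the `withinString` parameter are hypothetical and introduced only for illustration.

```javascript
// Hypothetical helper: collects every URL reported by a withinString-style
// function (called as withinString(text, callback)) into an array.
function collectUrls(searchText, withinString) {
    var found = [];
    withinString(searchText, function (url) {
        found.push(url);
        return url; // return the match unchanged so the text is not rewritten
    });
    return found;
}
```

With URI.js loaded, this could presumably be called as `collectUrls(source, function (text, cb) { return URI.withinString(text, cb); })`, which for the `source` string above should yield the same URLs that were wrapped in `<a>` tags.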
