Detect URLs in text with JavaScript

Question

Does anyone have suggestions for detecting URLs in a set of strings?

arrayOfStrings.forEach(function(string){
  // detect URLs in strings and do something swell,
  // like creating elements with links.
});

Update: I wound up using this regex for link detection, apparently several years later.

kLINK_DETECTION_REGEX = /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)(\s+|$)/gi

The full helper (with optional Handlebars support) is at gist #1654670.

Recommended answer

First you need a good regex that matches URLs. This is hard to do. See here, here and here:

...almost anything is a valid URL. There are some punctuation rules for splitting it up. Absent any punctuation, you still have a valid URL.

Check the RFC carefully and see if you can construct an "invalid" URL. The rules are very flexible.

For example ::::: is a valid URL. The path is ":::::". A pretty stupid filename, but a valid filename.

Also, ///// is a valid URL. The netloc ("hostname") is "". The path is "///". Again, stupid. Also valid. This URL normalizes to "///" which is the equivalent.

Something like "bad://///worse/////" is perfectly valid. Dumb but valid.
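
To get a feel for how permissive URL parsing is, one option is to hand candidate strings to the built-in URL constructor, available in modern browsers and Node.js. This is only a sketch: isAbsoluteUrl is a hypothetical helper name, and the constructor follows the WHATWG URL Standard rather than the RFC grammar the quote refers to. Without a base URL it accepts only absolute URLs, so a relative reference such as ":::::" is rejected here, while "bad://///worse/////" is still accepted.

```javascript
// Hypothetical helper (not part of the original answer): returns true when
// the WHATWG URL parser accepts the string as an absolute URL.
// No base URL is supplied, so relative references are rejected.
function isAbsoluteUrl(candidate) {
  try {
    new URL(candidate);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isAbsoluteUrl('http://www.example.com')); // true
console.log(isAbsoluteUrl('bad://///worse/////'));    // true
console.log(isAbsoluteUrl(':::::'));                  // false
```

A check like this can weed out some regex matches, but it does not decide whether a string in running text was meant as a link.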

Anyway, this answer is not meant to give you the best regex, but rather to show how to do the string wrapping inside the text with JavaScript.

OK, so let's just use this one: /(https?:\/\/[^\s]+)/g

Again, this is a bad regex. It will have many false positives. However, it's good enough for this example.

function urlify(text) {
  var urlRegex = /(https?:\/\/[^\s]+)/g;
  return text.replace(urlRegex, function(url) {
    return '<a href="' + url + '">' + url + '</a>';
  })
  // or alternatively
  // return text.replace(urlRegex, '<a href="$1">$1</a>')
}

var text = 'Find me at http://www.example.com and also at http://stackoverflow.com';
var html = urlify(text);

console.log(html)

// html now looks like:
// "Find me at <a href="http://www.example.com">http://www.example.com</a> and also at <a href="http://stackoverflow.com">http://stackoverflow.com</a>"
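
One common false positive with this regex is trailing punctuation: in "See http://www.example.com." the final period is matched as part of the URL. The input is also inserted into HTML unescaped. The sketch below is one possible way to handle both, not part of the original answer. The names escapeHtml and urlifySafe are hypothetical, and the set of characters treated as trailing punctuation is an assumption that will be wrong for some real URLs.

```javascript
// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical variant of urlify: escapes all text and keeps trailing
// punctuation outside the generated link.
function urlifySafe(text) {
  var urlRegex = /https?:\/\/[^\s]+/g;
  var out = '';
  var last = 0;
  var m;
  while ((m = urlRegex.exec(text)) !== null) {
    // Assumption: punctuation at the very end belongs to the sentence.
    var url = m[0].replace(/[.,;:!?)\]'"]+$/, '');
    var safe = escapeHtml(url);
    out += escapeHtml(text.slice(last, m.index));
    out += '<a href="' + safe + '">' + safe + '</a>';
    // Anything trimmed off the match is emitted as plain text later.
    last = m.index + url.length;
  }
  return out + escapeHtml(text.slice(last));
}

console.log(urlifySafe('See http://www.example.com.'));
// See <a href="http://www.example.com">http://www.example.com</a>.
```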

So basically, try:

$$('#pad dl dd').each(function(element) {
    element.innerHTML = urlify(element.innerHTML);
});
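
Note that running urlify over innerHTML re-scans existing markup, so a URL that already sits inside an href attribute would be matched and mangled. If the goal is to create elements, as in the question, an alternative is to split plain text into text and URL segments and let the caller build nodes from them (for example with document.createTextNode and document.createElement). The following is a minimal sketch under that assumption. splitByUrls is a hypothetical name, and it reuses the same deliberately naive regex.

```javascript
// Hypothetical helper: splits plain text into an ordered list of segments,
// each tagged as ordinary text or as a URL match.
function splitByUrls(text) {
  var urlRegex = /https?:\/\/[^\s]+/g;
  var parts = [];
  var last = 0;
  var m;
  while ((m = urlRegex.exec(text)) !== null) {
    if (m.index > last) {
      parts.push({ type: 'text', value: text.slice(last, m.index) });
    }
    parts.push({ type: 'url', value: m[0] });
    last = urlRegex.lastIndex;
  }
  if (last < text.length) {
    parts.push({ type: 'text', value: text.slice(last) });
  }
  return parts;
}

// Sample data standing in for the question's arrayOfStrings.
var arrayOfStrings = ['Find me at http://www.example.com today', 'no links here'];
arrayOfStrings.forEach(function (string) {
  console.log(splitByUrls(string));
});
```

Because the segments never pass through an HTML string, no escaping step is needed when they are turned into DOM nodes.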
