Regex to conditionally replace Twitter hashtags with hyperlinks

Problem description

I'm writing a small PHP script to grab the latest half dozen Twitter status updates from a user feed and format them for display on a webpage. As part of this I need a regex replace to rewrite hashtags as hyperlinks to search.twitter.com. Initially I tried to use:

<?php
$strTweet = preg_replace('/(^|\s)#(\w+)/', '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>', $strTweet);
?>

(taken from https://gist.github.com/445729)

In the course of testing I discovered that #test is converted into a link on the Twitter website, but #123 is not. After a bit of checking on the internet and playing around with various tags, I came to the conclusion that a hashtag must contain at least one alphabetic character or underscore to become a link; tags made up only of digits are ignored (presumably to stop things like "Good presentation Bob, slide #3 was my favourite!" from being linked). This makes the above code incorrect, as it will happily convert #123 into a link.

I've not done much regex in a while, so in my rustiness I came up with the following PHP solution:

<?php
$test = 'This is a test tweet to see if #123 and #4 are not encoded but #test, #l33t and #8oo8s are.';

// Get all hashtags out into an array
if (preg_match_all('/(^|\s)(#\w+)/', $test, $arrHashtags) > 0) {
  foreach ($arrHashtags[2] as $strHashtag) {
    // Check each tag to see if there are letters or an underscore in there somewhere
    if (preg_match('/#\d*[a-z_]+/i', $strHashtag)) {
      $test = str_replace($strHashtag, '<a href="http://search.twitter.com/search?q=%23'.substr($strHashtag, 1).'">'.$strHashtag.'</a>', $test);
    }
  }
}

echo $test;
?>

It works, but it seems fairly long-winded for what it does. My question is: is there a single preg_replace, similar to the one I got from gist.github, that will rewrite hashtags into hyperlinks only if they do not consist solely of digits?

Solution

(^|\s)#(\w*[a-zA-Z_]+\w*)

PHP

$strTweet = preg_replace('/(^|\s)#(\w*[a-zA-Z_]+\w*)/', '\1#<a href="http://twitter.com/search?q=%23\2">\2</a>', $strTweet);

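This regular expression matches a # followed by zero or more word characters ([a-zA-Z0-9_]), then one or more letters or underscores, then zero or more word characters. The tag must therefore contain at least one letter or underscore, so purely numeric tags such as #123 are left alone.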
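As a quick check, the pattern can be run against the test tweet from the question. This is only a minimal sketch: the function name linkHashtags is a hypothetical wrapper added for illustration, while the pattern and replacement string are the ones shown above.

```php
<?php
// linkHashtags() is a hypothetical helper name; the pattern and the
// replacement string are taken unchanged from the answer above.
function linkHashtags(string $text): string
{
    $result = preg_replace(
        '/(^|\s)#(\w*[a-zA-Z_]+\w*)/',
        '\1#<a href="http://twitter.com/search?q=%23\2">\2</a>',
        $text
    );

    // preg_replace() returns null on error; fall back to the input in that case.
    return $result ?? $text;
}

// Test string from the question.
$test = 'This is a test tweet to see if #123 and #4 are not encoded but #test, #l33t and #8oo8s are.';
echo linkHashtags($test), PHP_EOL;
```

With this input, #123 and #4 should come through untouched, while #test, #l33t and #8oo8s are wrapped in links. Note that the # itself stays outside the anchor, as in the original gist.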

http://rubular.com/r/opNX6qC4sG <- test it here.
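The same rule can also be written with a lookahead, which some readers find easier to follow. This is an alternative sketch, not part of the answer above: the name linkHashtagsLookahead is hypothetical, and the lookahead (?=\d*[a-zA-Z_]) is assumed to be equivalent to the original condition because the first non-digit word character in a tag is necessarily a letter or underscore.

```php
<?php
// Hypothetical alternative to the pattern above, shown for illustration only.
function linkHashtagsLookahead(string $text): string
{
    $result = preg_replace(
        // (?=\d*[a-zA-Z_]) asserts that, after any leading digits, the tag
        // contains a letter or underscore; (\w+) then captures the whole tag.
        '/(^|\s)#(?=\d*[a-zA-Z_])(\w+)/',
        '\1#<a href="http://twitter.com/search?q=%23\2">\2</a>',
        $text
    );

    // preg_replace() returns null on error; fall back to the input in that case.
    return $result ?? $text;
}

echo linkHashtagsLookahead('Slide #3 was great, see #l33t and #_draft'), PHP_EOL;
```

Here #3 is left as plain text, while #l33t and #_draft are linked. The lookahead keeps the "must contain a letter or underscore" condition separate from the capture group, at the cost of a slightly less familiar syntax.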
