Regex in preg_replace to detect url format and extract elements


Problem description

I need to replace certain user-entered URLs with embedded Flash objects, and I'm having trouble with the regex I'm using to match the URL. I think this is mainly because the URLs are SEO-friendly and therefore a bit more difficult to parse.

URL structure: http://www.site.com/item/item_title_that_can_include_1('_etc-32CHARACTERALPHANUMERICGUID

I need to both detect a URL in that format and capture the 32CHARACTERALPHANUMERICGUID, which is always placed after the - in the URL.

Like so:

$ret = preg_replace('#http://www\.site\.com/item/([^-])-([a-zA-Z0-9]+)#','<embed>itemid=$2</embed>', $ret);

For some reason, the above does not find a match for a URL in the specified format. I'm new to regexes, so I think I'm missing something fairly obvious.

Recommended answer

You should check out parse_url().

Examine the results: it was made for parsing URLs. You'll be able to extract the data you require from the components it returns.
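As a rough illustration of the parse_url() route, here is a minimal sketch. The helper name extractItemGuid and the GUID value used with it are made up for this example, and it assumes the GUID is whatever follows the last hyphen in the path, as the question describes.

```php
<?php
// Hypothetical helper (not from the original answer): returns the trailing
// 32-character GUID of an item URL, or null if the URL has a different shape.
function extractItemGuid(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        return null;
    }
    // Only accept URLs on the expected host and under /item/.
    if ($parts['host'] !== 'www.site.com') {
        return null;
    }
    if (strpos($parts['path'], '/item/') !== 0) {
        return null;
    }
    // Assumption: the GUID is everything after the last hyphen in the path.
    $pos = strrpos($parts['path'], '-');
    if ($pos === false) {
        return null;
    }
    $guid = substr($parts['path'], $pos + 1);
    // Accept only exactly 32 alphanumeric characters.
    return (strlen($guid) === 32 && ctype_alnum($guid)) ? $guid : null;
}
```

Calling extractItemGuid() on a URL in the format above should give back the GUID, and null for anything else. Splitting on the last hyphen means the title part never has to be parsed at all.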

If you are regex crazy, try this...

/^http:\/\/www\.site\.com\/item\/[^-]*\-([a-zA-Z0-9]{32})$/

Your example is almost there, but...

  • When you use a negated character class, i.e. [^-], you still need a quantifier; without one it matches exactly one character. I added *, meaning zero or more.
  • You don't seem to use the item title, so we won't bother capturing it.
  • You should use beginning (^) and end ($) anchors if the string is always exactly like that.
  • You say the GUID is 32 chars, so we may as well explicitly state that with the {32} quantifier.
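The points above can be tried out with a short sketch. The GUID and sample URL are made up. The unanchored $inText pattern is not from the answer: it is an assumed variant for the case where the URL sits inside longer user text, where the ^ and $ anchors would prevent a match.

```php
<?php
// Anchored pattern from the answer: the whole string must be one item URL.
$anchored = '/^http:\/\/www\.site\.com\/item\/[^-]*\-([a-zA-Z0-9]{32})$/';

// Assumed unanchored variant for URLs embedded in longer text.
// [^-\s]* stops the title at whitespace; \b ends the GUID at a word boundary.
$inText = '#http://www\.site\.com/item/[^-\s]*-([a-zA-Z0-9]{32})\b#';

$guid = '0123456789abcdef0123456789ABCDEF'; // made-up 32-character GUID
$url  = "http://www.site.com/item/item_title_1('_etc-" . $guid;

// Capture the GUID when the input is exactly one URL.
$captured = preg_match($anchored, $url, $m) === 1 ? $m[1] : null;

// Replace the URL inside surrounding text; $1 is the captured GUID.
$text     = "Look at $url now";
$replaced = preg_replace($inText, '<embed>itemid=$1</embed>', $text);

// The question's pattern: [^-] without a quantifier matches a single
// character, so a title longer than one character cannot match.
$original        = '#http://www\.site\.com/item/([^-])-([a-zA-Z0-9]+)#';
$originalMatches = preg_match($original, $url) === 1;
```

Note that the replacement refers to $1 rather than the question's $2, because the title is no longer captured. Both patterns, like the answer's, rely on the title containing no hyphen.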
