PHP XPath: get all href values that contain needle


Question

I'm working with PHP XPath, trying to quickly pull certain links out of an HTML page.

The following will find all href links on mypage.html: $nodes = $x->query("//a[@href]");

Whereas the following will find all href links where the description matches my needle: $nodes = $x->query("//a[contains(@href,'click me')]");

What I am trying to achieve is matching on the href itself, more specifically finding URLs that contain certain parameters. Is that possible within an XPath query, or should I just start manipulating the output of the first XPath query?

Answer

I'm not sure I understand the question correctly, but the second XPath expression already does what you are describing. It does not match against the text node of the A element; it matches against the href attribute:

$html = <<< HTML
<ul>
    <li>
        <a href="http://example.com/page?foo=bar">Description</a>
    </li>
    <li>
        <a href="http://example.com/page?lang=de">Description</a>
    </li>
</ul>
HTML;

// Parse the markup and select every A element whose href contains "foo".
$xml  = simplexml_load_string($html);
$list = $xml->xpath("//a[contains(@href,'foo')]");
var_dump($list);

Output:

array(1) {
  [0]=>
  object(SimpleXMLElement)#2 (2) {
    ["@attributes"]=>
    array(1) {
      ["href"]=>
      string(31) "http://example.com/page?foo=bar"
    }
    [0]=>
    string(11) "Description"
  }
}

As you can see, the returned array contains only the A element whose href contains foo (which I understand is what you are looking for). It contains the entire element, because the XPath translates to "fetch all A elements with an href attribute containing foo". You would then access the attribute with

echo $list[0]['href']; // gives "http://example.com/page?foo=bar"
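
To collect the href of every match rather than only the first, the result array can be looped over. The following is a minimal sketch using a slightly extended version of the sample markup; the variable name $hrefs is only illustrative, and the (string) cast is there because each attribute returned by SimpleXML is itself a SimpleXMLElement:

```php
<?php
// Sample markup with a third link added so that more than one href matches.
$html = <<<HTML
<ul>
    <li><a href="http://example.com/page?foo=bar">Description</a></li>
    <li><a href="http://example.com/page?lang=de">Description</a></li>
    <li><a href="http://example.com/other?foo=baz">Description</a></li>
</ul>
HTML;

$xml = simplexml_load_string($html);

// Cast each href attribute to a plain string while collecting it.
$hrefs = [];
foreach ($xml->xpath("//a[contains(@href,'foo')]") as $a) {
    $hrefs[] = (string) $a['href'];
}

print_r($hrefs);
```
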

If you only want to return the attribute itself, you'd have to use

//a[contains(@href,'foo')]/@href

Note that in SimpleXML this still returns a SimpleXMLElement rather than a plain string:

array(1) {
  [0]=>
  object(SimpleXMLElement)#3 (1) {
    ["@attributes"]=>
    array(1) {
      ["href"]=>
      string(31) "http://example.com/page?foo=bar"
    }
  }
}

You can then output the URL with

echo $list[0]; // gives "http://example.com/page?foo=bar"
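
Since the question uses $x->query(), which suggests DOMXPath rather than SimpleXML, the same expression can be sketched with DOMDocument as well. Note that contains(@href,'foo') is a plain substring test, so it also matches URLs that merely have "foo" somewhere in the path. If only URLs with an actual foo query parameter are wanted, one option (an assumption about the goal, not something the question states) is to post-filter with parse_url() and parse_str(). The sample markup and the names $candidates and $withParam are illustrative:

```php
<?php
// Sample markup; the second link has "foo" in its path, not as a parameter.
$html = <<<HTML
<ul>
    <li><a href="http://example.com/page?foo=bar">Description</a></li>
    <li><a href="http://example.com/foo/page?lang=de">Description</a></li>
    <li><a href="http://example.com/page?lang=de">Description</a></li>
</ul>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html); // for a file on disk, loadHTMLFile() is the analogue

$x = new DOMXPath($dom);
// Select the href attribute nodes directly instead of the A elements.
$nodes = $x->query("//a[contains(@href,'foo')]/@href");

$candidates = []; // every href containing the substring "foo"
$withParam  = []; // only hrefs that carry a "foo" query parameter
foreach ($nodes as $attr) {
    $url = $attr->nodeValue;
    $candidates[] = $url;

    // Split the query string into an array and check for the parameter name.
    parse_str((string) parse_url($url, PHP_URL_QUERY), $params);
    if (array_key_exists('foo', $params)) {
        $withParam[] = $url;
    }
}

print_r($withParam);
```

Here the XPath query narrows the candidates inside the document, and the PHP loop applies the stricter per-parameter check that XPath 1.0 has no built-in URL parsing for.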
