A PHP HTML parser that lets me do class select and get parent nodes

Views: 105 Published: 2018/6/25 17:50:14 php html screen-scraping

Problem description

I'm scraping a website with PHP and I need to be able to get a node based on its CSS class. I need to get a ul tag that doesn't have an id attribute but does have a CSS class. I then need to get only the li tags inside it that contain specific anchor tags, not all the li tags.

I've looked through DOMDocument and Zend_Dom, and neither meets both requirements: class selection and DOM traversal (specifically ascending to parents).

Solution

You could use QueryPath; something like this might work:

  htmlqp($html)->find("ul.class")->not("#id")
      ->find('li a[href*="specific"]')->parent()
  // then foreach over it or use ->writeHTML() for extraction

See http://api.querypath.org/docs/class_query_path.html for the API.

(Traversing is much easier if you don't use the fiddly DOMDocument.)
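
For comparison, the same kind of selection can also be sketched with PHP's built-in DOMDocument paired with DOMXPath, which covers both class matching and ascending to a parent. This is only a minimal sketch under stated assumptions: the sample markup, the class name `menu`, and the href fragment `specific` are hypothetical placeholders, not taken from the question.

```php
<?php
// Hypothetical sample markup; the class name "menu" and the href fragment
// "specific" are placeholders chosen for illustration only.
$html = <<<HTML
<html><body>
<ul id="nav" class="menu"><li><a href="/specific/skip">Skipped</a></li></ul>
<ul class="menu">
  <li><a href="/specific/one">One</a></li>
  <li><a href="/other">Other</a></li>
  <li><a href="/a/specific/two">Two</a></li>
</ul>
</body></html>
HTML;

// Collect libxml warnings internally, since real-world HTML is often malformed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match <a> elements whose href contains "specific", directly inside an <li>
// of a <ul> that has the class "menu" and no id attribute.
$query = '//ul[not(@id)]'
       . '[contains(concat(" ", normalize-space(@class), " "), " menu ")]'
       . '/li/a[contains(@href, "specific")]';

$items = [];
foreach ($xpath->query($query) as $anchor) {
    // Ascend from the anchor to its parent node, here the enclosing <li>.
    $li = $anchor->parentNode;
    $items[] = trim($li->textContent);
}

print_r($items);
```

The `concat(" ", normalize-space(@class), " ")` idiom is the usual XPath 1.0 way to test for one class among several, since XPath 1.0 has no direct equivalent of the CSS `.class` selector.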
