How do I extract a keyword from a webpage using PHP DOM?
Question
Here is a sample of the code I have extracted from a webpage:
<div class="user-details-narrow">
<div class="profileheadtitle">
<span class=" headline txtBlue size15">
Profession
</span>
</div>
<div class="profileheadcontent-narrow">
<span class="txtGrey size15">
administration
</span>
</div>
</div>
When displayed on the webpage it shows as "Profession administration". What I want to do is extract the profession, in this case "administration". However, it's not as simple as it might seem, because this block of markup is repeated many times for various other fields, such as:
<div class="user-details-narrow">
<div class="profileheadtitle">
<span class=" headline txtBlue size15">
Industry
</span>
</div>
<div class="profileheadcontent-narrow">
<span class="txtGrey size15">
banking
</span>
</div>
</div>
Any ideas on a good solution?
Recommended answer
Please do not use regular expressions to get node values from a page.
PHP has a very nice class named DOMDocument. You can load the page into a DOMDocument and query it with DOMXPath:
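$dom = new DOMDocument;
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
// DOMDocument has no loadURL(); loadHTMLFile() accepts a path or URL.
$dom->loadHTMLFile("http://test.de/page.html");
$finder = new DOMXPath($dom);
// Matches every element whose class contains 'size15', in document order.
$spaner = $finder->query("//*[contains(@class, 'size15')]");
echo trim($spaner->item(0)->nodeValue) . "/" . trim($spaner->item(1)->nodeValue);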
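The query above matches every element whose class contains size15, so picking values by position becomes fragile once the same block repeats for Profession, Industry, and so on. One possible alternative is to select the content block by the label that precedes it. The following is only a minimal sketch: the function name extractProfileField and the inline sample markup are hypothetical, it assumes the title and content divs are siblings as in the question, and it assumes the label contains no single quote (which would break the XPath string literal).

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text
// shown next to a given label, or null when the label is not found.
function extractProfileField(string $html, string $label): ?string
{
    $dom = new DOMDocument();
    // Collect parser warnings about imperfect HTML instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Find the title div whose normalized text equals the label, then take
    // the first following sibling div that holds the content.
    $query = sprintf(
        "//div[contains(@class, 'profileheadtitle')][normalize-space(.) = '%s']"
        . "/following-sibling::div[contains(@class, 'profileheadcontent-narrow')][1]",
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup copied from the question.
$sampleHtml = <<<HTML
<div class="user-details-narrow">
<div class="profileheadtitle">
<span class=" headline txtBlue size15">
Profession
</span>
</div>
<div class="profileheadcontent-narrow">
<span class="txtGrey size15">
administration
</span>
</div>
</div>
<div class="user-details-narrow">
<div class="profileheadtitle">
<span class=" headline txtBlue size15">
Industry
</span>
</div>
<div class="profileheadcontent-narrow">
<span class="txtGrey size15">
banking
</span>
</div>
</div>
HTML;

echo extractProfileField($sampleHtml, 'Profession'), "\n"; // prints "administration"
echo extractProfileField($sampleHtml, 'Industry'), "\n";   // prints "banking"
```

For a live page, the HTML string could be fetched first (for example with file_get_contents) and passed to the same function. Matching on the label rather than on position means the result does not depend on how many blocks appear before the one you want.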