Using XPath to get text of paragraph with links inside
Question
I'm parsing an HTML page with XPath and want to grab the whole text of a specific paragraph, including the text of any links inside it.
For example, I have the following paragraph:
<p class="main-content">
This is sample paragraph with <a href="http://google.com">link</a> inside.
</p>
I need to get the following text as the result: "This is sample paragraph with link inside". However, applying "//p[@class='main-content']/text()" gives me only "This is sample paragraph with inside".
Could you please help? Thanks.
Recommended answer
To get the whole text content of a node, including the text of its descendants, use the string() function:
string(//p[@class="main-content"])
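As a rough illustration of what the string value contains, here is a minimal Python sketch. It relies on assumptions not in the original answer: the standard-library xml.etree.ElementTree module only supports a subset of XPath and cannot evaluate string() itself, so the sketch locates the paragraph with a supported path and joins itertext() to get the same concatenated text. The <body> wrapper and the helper name paragraph_string_value are made up for the example.

```python
import xml.etree.ElementTree as ET

# The sample paragraph from the question, wrapped in a <body> element
# (an assumption, so that the paragraph is a descendant of the root).
SAMPLE = """<body>
<p class="main-content">
This is sample paragraph with <a href="http://google.com">link</a> inside.
</p>
</body>"""


def paragraph_string_value(markup: str) -> str:
    """Return the concatenated text of the paragraph and all its descendants.

    This mirrors what string(//p[@class="main-content"]) evaluates to;
    ElementTree cannot run string() directly, so itertext() stands in.
    """
    root = ET.fromstring(markup)
    paragraph = root.find(".//p[@class='main-content']")
    return "".join(paragraph.itertext())


if __name__ == "__main__":
    value = paragraph_string_value(SAMPLE)
    # The raw value keeps the line breaks from the markup; collapse them for display.
    print(" ".join(value.split()))
```

The raw string keeps the line breaks that surround the text in the markup. In XPath itself, wrapping the expression in normalize-space() is one way to trim and collapse that whitespace.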
Note that this returns a string value, not nodes. If you want the text nodes themselves (as returned by text()), you need to search at all depths with // rather than only among the direct children:
//p[@class="main-content"]//text()
This returns three text nodes: "This is sample paragraph with ", "link" and " inside.".