Using XPath to get text of paragraph with links inside

Problem description

I'm parsing an HTML page with XPath and want to grab the whole text of a specific paragraph, including the text of any links inside it.

For example, I have the following paragraph:

<p class="main-content">
    This is sample paragraph with <a href="http://google.com">link</a> inside.
</p>

I need to get the following text as the result: "This is sample paragraph with link inside". However, applying "//p[@class='main-content']/text()" gives me only "This is sample paragraph with inside".

Could you please help? Thanks.

Recommended answer

To get the whole text content of a node, use the string function:

string(//p[@class="main-content"])

Note that this returns a single string value, not nodes. If you want the text nodes themselves (as returned by text()), you can get those too, but you need to search at all depths, because /text() selects only the direct children of the paragraph:

//p[@class="main-content"]//text()

This returns three text nodes (surrounding whitespace aside): "This is sample paragraph with", "link" and "inside.".
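The difference between the three expressions above can be sketched in Python. This is only an illustration under some assumptions: it uses the standard library's xml.etree.ElementTree, whose XPath subset does not support text() or string(), so the helper functions below (direct_text_nodes, all_text_nodes, string_value and squash, all hypothetical names) emulate those expressions rather than evaluate them. The sample paragraph is wrapped in a <body> element and parsed as XML, which works here only because the snippet is well-formed.

```python
import xml.etree.ElementTree as ET

# The paragraph from the question, wrapped in <body> so it can be searched for.
# Assumption: the markup is well-formed, so an XML parser accepts it.
PAGE = """<body>
<p class="main-content">
    This is sample paragraph with <a href="http://google.com">link</a> inside.
</p>
</body>"""

doc = ET.fromstring(PAGE)
p = doc.find(".//p[@class='main-content']")


def direct_text_nodes(elem):
    """Emulate elem/text(): text that is a direct child of elem."""
    nodes = [elem.text] + [child.tail for child in elem]
    return [t for t in nodes if t]


def all_text_nodes(elem):
    """Emulate elem//text(): text at every depth, in document order."""
    return list(elem.itertext())


def string_value(elem):
    """Emulate string(elem): all descendant text joined into one string."""
    return "".join(elem.itertext())


def squash(s):
    """Collapse runs of whitespace, roughly like XPath's normalize-space()."""
    return " ".join(s.split())


direct = [squash(t) for t in direct_text_nodes(p)]
nested = [squash(t) for t in all_text_nodes(p)]
whole = squash(string_value(p))

print(direct)  # ['This is sample paragraph with', 'inside.']
print(nested)  # ['This is sample paragraph with', 'link', 'inside.']
print(whole)   # This is sample paragraph with link inside.
```

The raw text nodes keep the newline and indentation from the markup, which is why the sketch collapses whitespace before printing. In an engine with full XPath 1.0 support, normalize-space(//p[@class="main-content"]) should give the same trimmed string directly.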
