How to match a text node then follow parent nodes using XPath

Views: 153 Published: 2018/6/15 10:19:59 Tags: python html xpath lxml


Problem description


I'm trying to parse some HTML with XPath. Following the simplified XML example below, I want to match the string 'Text 1', then grab the contents of the relevant content node.

    <doc>
      <block>
        <title>Text 1</title>
        <content>Stuff I want</content>
      </block>
      <block>
        <title>Text 2</title>
        <content>Stuff I don't want</content>
      </block>
    </doc>

My Python code throws a wobbly:

    >>> from lxml import etree
    >>>
    >>> tree = etree.XML("<doc><block><title>Text 1</title><content>Stuff I want</content></block><block><title>Text 2</title><content>Stuff I don't want</content></block></doc>")
    >>>
    >>> # get all titles
    ... tree.xpath('//title/text()')
    ['Text 1', 'Text 2']
    >>>
    >>> # match 'Text 1'
    ... tree.xpath('//title/text()="Text 1"')
    True
    >>>
    >>> # Follow parent from selected nodes
    ... tree.xpath('//title/text()/../..//text()')
    ['Text 1', 'Stuff I want', 'Text 2', "Stuff I don't want"]
    >>>
    >>> # Follow parent from selected node
    ... tree.xpath('//title/text()="Text 1"/../..//text()')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lxml.etree.pyx", line 1330, in lxml.etree._Element.xpath (src/lxml/lxml.etree.c:14542)
      File "xpath.pxi", line 287, in lxml.etree.XPathElementEvaluator.__call__ (src/lxml/lxml.etree.c:90093)
      File "xpath.pxi", line 209, in lxml.etree._XPathEvaluatorBase._handle_result (src/lxml/lxml.etree.c:89446)
      File "xpath.pxi", line 194, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src/lxml/lxml.etree.c:89281)
    lxml.etree.XPathEvalError: Invalid type

Is this possible in XPath? Do I need to express what I want to do in a different way?

Solution

Is this what you want?

    //title[text()='Text 1']/../content/text()
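
For context, a likely reason the original attempt fails: in XPath 1.0 the `=` comparison produces a boolean, and path steps such as `/..` can only continue from a node-set, which is probably what the "Invalid type" error reflects. Moving the comparison into a predicate keeps the result a node-set. The following is a minimal sketch using the question's sample document; the variable names `wanted` and `same` and the second expression are illustrative additions, not part of the original answer.

```python
from lxml import etree

# Sample document from the question.
tree = etree.XML(
    "<doc>"
    "<block><title>Text 1</title><content>Stuff I want</content></block>"
    "<block><title>Text 2</title><content>Stuff I don't want</content></block>"
    "</doc>"
)

# The predicate [text()='Text 1'] filters the <title> elements and leaves a
# node-set, so the path can go on to the parent <block> and its <content>.
wanted = tree.xpath("//title[text()='Text 1']/../content/text()")
print(wanted)

# Illustrative alternative (an assumption, not from the answer): select the
# <block> by the string value of its <title> child, avoiding the /.. step.
same = tree.xpath("//block[title='Text 1']/content/text()")
print(same)
```

Both expressions should select only the content of the block whose title is 'Text 1'. The second form may read more naturally when the parent element's name is known in advance.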
