Finding last line within node with XPath


Problem description

I was wondering if there was a way to always select the content of a node above a certain element?

I have the following code that I want to extract from:

<div id="someDiv">
   <h3>Name</h3>
   Some content1
   <br/>
   <br/>
   Address 12345
   <br/>
   09876 City, Country
   <br/>
   <span id="tel_number">12345</span>
</div>

Here is the XPath that finds the content of everything above the span:

//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::node()

Now, what I need is an XPath that always selects the content right above the span and nothing else (a single line). It should also work if, for some reason, the <br/> above the span is missing.

Hope that somebody can help with that!

Solution

I found that the best way to retrieve the postcode is as follows:

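# Collect the text of every node that precedes the span, trimming whitespace.
data = page.search('//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::node()').map { |node| node.text.strip }
# Drop empty entries (<br/> elements and whitespace-only text nodes).
data.delete("")
# The last remaining entry is the line right above the span.
postcode = data.last.match(/\d{5}/).to_s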

From there it's easy to retrieve everything after or before the selection.
