Nokogiri XPath to retrieve text after <BR> within <TD> and <SPAN>


Question

I have the following HTML and would like to know how to use XPath to retrieve all of the info:
- Name (first, last)
- Nickname
- Email
- Shipping address...

Primarily, I need to retrieve the text after each <BR>. Many thanks in advance.

<table>
<tr>
<td valign="top" width="50%" align="left">
<span>Buyer</span><br/>FirstName LastName<br/>NickName<br/>First.Last@SomeCompany.com</td>

<tr><td valign="top" width="40%" align="left">
<span><span>Shipping address - </span><span>confirmed</span></span><br/>FirstName LastName<br/>Attn: FirstName<br/>1234 Main St.<br/>TheCity, TheState, 12345<br/>United States<br/></td>
</tr></table>

After I posted the above question, I learned that I can do the following, but it does not look clean:

buyer = html.xpath("//span/text()[contains(., 'Buyer')]").first.parent 
buyer_name = buyer.next.next 
puts "Buyer's Full name: #{buyer_name.text}" 
buyer_nick = buyer_name.next.next 
puts "Buyer's Nick name: #{buyer_nick.text}" 
buyer_email = buyer_nick.next.next 
puts "Buyer's email: #{buyer_email.text}" 

My question now is why html.xpath("//span/text()[contains(., 'Buyer')]") returns the text node itself instead of the element. Again, thanks!!

Recommended answer

Here's a concise way:

name, nick, email, *addr = doc.search('//td/text()[preceding-sibling::br]')

puts name, nick, email, "--", addr

The XPath does exactly what you stated: it selects every text node that is a child of a td and has a br among its preceding siblings. The address lines are slurped into one array (addr), but you can get the components separately if you want.

Output:

FirstName LastName
NickName
First.Last@SomeCompany.com
--
FirstName LastName
Attn: FirstName
1234 Main St.
TheCity, TheState, 12345
United States
