How to retrieve data from HTML between <span> and </span>
Problem description
I want to get the rating, from 1 to 5, in Amazon customer reviews. I checked the page source and found that the relevant part looks like this:
<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;"><span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" ><span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;"><b>Works great right out of the box with Surface Pro</b>, <nobr>October 5, 2013</nobr></span>
</div>
I want to extract the text "5.0 out of 5 stars" from
<span>5.0 out of 5 stars</span></span> </span>
How can I use xpathSApply to get it?
Thank you!
I would recommend using the selectr package, which lets you write CSS selectors instead of XPath expressions.
library(XML)
doc <- htmlParse('
<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;">
<span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" >
<span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;">
<b>Works great right out of the box with Surface Pro</b>,
<nobr>October 5, 2013</nobr></span>
</div>', asText = TRUE
)
library(selectr)
xmlValue(querySelector(doc, 'div > span > span > span'))
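A page of reviews usually contains more than one rating, and querySelector returns only the first match. A minimal sketch of collecting every rating with querySelectorAll follows. The two-review HTML fragment is hypothetical, modeled on the markup in the question, and the class-based selector `span.swSprite > span` is an assumption about what identifies the star element on the real page.

```r
library(XML)
library(selectr)

# Hypothetical fragment with two reviews, modeled on the markup in the question.
html <- '
<div><span><span class="swSprite s_star_5_0 " title="5.0 out of 5 stars"><span>5.0 out of 5 stars</span></span> </span></div>
<div><span><span class="swSprite s_star_4_0 " title="4.0 out of 5 stars"><span>4.0 out of 5 stars</span></span> </span></div>'
doc <- htmlParse(html, asText = TRUE)

# querySelectorAll returns all matching nodes, not just the first one.
nodes <- querySelectorAll(doc, "span.swSprite > span")

# Pull the text out of each node, then keep only the leading number.
ratings <- sapply(nodes, xmlValue)
scores <- as.numeric(sub(" out of.*$", "", ratings))
```

Selecting by class rather than by nesting depth tends to be less fragile if the surrounding markup changes, though it still depends on the class name staying the same.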
UPDATE: If you would rather use XPath, you can use the css_to_xpath function in selectr to work out the appropriate XPath expression, which in this case turns out to be
"descendant-or-self::div/span/span/span"
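Since the question asks about xpathSApply specifically, here is a minimal sketch of applying an XPath expression directly with the XML package. It uses the `//div/span/span/span` form, which selects the same node in this snippet as the expression produced by css_to_xpath. The second query, which reads the `title` attribute of the `swSprite` span, is an alternative added for illustration and assumes that class is present on the real page.

```r
library(XML)

# The HTML fragment from the question.
html <- '<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;"><span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" ><span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;"><b>Works great right out of the box with Surface Pro</b>, <nobr>October 5, 2013</nobr></span>
</div>'
doc <- htmlParse(html, asText = TRUE)

# Apply xmlValue to every node matched by the XPath expression.
rating_text <- xpathSApply(doc, "//div/span/span/span", xmlValue)

# Alternative (assumption): read the title attribute of the star sprite span.
rating_title <- xpathSApply(doc, "//span[contains(@class, 'swSprite')]",
                            xmlGetAttr, "title")

# Keep only the leading number if a numeric score is needed.
rating_num <- as.numeric(sub(" out of.*$", "", rating_text))
```

xpathSApply passes any extra arguments (here `"title"`) on to the function it applies, which is why xmlGetAttr can be used without a wrapper.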