Trouble reaching a CSS node
Problem description
From this page: http://www.beta.inegi.org.mx/app/buscador/default.html?q=e15a61a
I'm trying to retrieve this URL: http://www.beta.inegi.org.mx/app/biblioteca/ficha.html?upc=702825720599
I've tried to reach it through both the CSS selector and the XPath (copied with right-click in the browser's web developer tab), but I only get {xml_nodeset (0)}:
library(rvest)
url <- "http://www.beta.inegi.org.mx/app/buscador/default.html?q=e15a62b"
# Download and parse the page once, then query the parsed document
page <- read_html(url)
page %>% html_node("#snippet_row-tag_a_0")
page %>% html_node(xpath = '//*[@id="snippet_row-tag_a_0"]')
Recommended answer
The items you want to scrape are rendered with JavaScript, so they are not present in the static HTML that rvest downloads. You can use the hidden API instead:
Try this URL: http://www.beta.inegi.org.mx/app/api/buscador/busquedaTodos/E15A61A_A/RANKING/es
This returns a JSON string, which you can parse into a list in R and then extract the information you want.
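As a rough sketch of the parsing step, the jsonlite package (an assumption; the answer does not name a parser) can turn the JSON string into nested R lists. The response layout is not shown in the answer, so `sample_json` and its `results`, `title` and `url` fields are hypothetical placeholders; inspect the real response before extracting anything.

```r
library(jsonlite)

# Endpoint quoted in the answer above.
api_url <- "http://www.beta.inegi.org.mx/app/api/buscador/busquedaTodos/E15A61A_A/RANKING/es"

# Parse a JSON string into nested R lists, without simplifying to data frames.
parse_results <- function(json_text) {
  fromJSON(json_text, simplifyVector = FALSE)
}

# Hypothetical stand-in for the API response; the real field names may differ.
sample_json <- '{"results": [{"title": "Example", "url": "http://example.com/ficha.html?upc=1"}]}'
parsed <- parse_results(sample_json)

# Walk the list to pull out the field of interest.
first_url <- parsed$results[[1]]$url

# Against the live endpoint (requires network access), something like:
# parsed <- fromJSON(api_url, simplifyVector = FALSE)
# str(parsed, max.level = 2)  # inspect the structure before extracting fields
```

Calling `str()` on the parsed object first is usually the quickest way to find where the link to the `ficha.html` page sits in the response.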