div class scraping

Views: 20  Published: 2021/7/14 18:41:03  Tags: r web-scraping rvest

Question

I am trying to scrape a table from the following website using the code below:

library(rvest)
library(tidyverse)
library(dplyr)

base <- '******************'

links <- read_html(base) %>% html_nodes(".v-data-table__wrapper")

But no luck yet. Can anyone help me with this, please?

Recommended answer

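The original page source contains no table. The page generates the table with JavaScript, and read_html() only fetches the static HTML without running any scripts, so the rendered table is not there for rvest to find.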
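One way to confirm this kind of problem is to count what a selector matches in the static HTML. The sketch below is only an illustration: it parses a small, made-up inline page (not the real site) in which the data sits inside a script and the table wrapper is never written into the markup.

```r
library(rvest)

# Hypothetical stand-in for a JS-rendered page: the data lives in a <script>,
# and the table markup would only be created later, in the browser.
page <- read_html('<html><body>
  <div id="app"></div>
  <script>gon.data = {"regions": []};</script>
</body></html>')

# The wrapper of the rendered table is absent from the static HTML ...
n_wrapper <- length(html_nodes(page, ".v-data-table__wrapper"))

# ... but the script that carries the data is present.
n_script <- length(html_nodes(page, xpath = "//script[contains(., 'gon.data')]"))

print(c(wrapper = n_wrapper, script = n_script))
```

If the first count is zero while the second is not, the data has to be taken from the script rather than from the rendered table.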

The idea is to run the JS code yourself to get the data (you will need the V8 package):

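library(V8)
library(rvest)

# Fetch the static HTML and pull out the text of the <script> that mentions gon.data
js <- read_html("https://www.locate.ai/retail-tracker.html") %>%
  html_node(xpath = "//script[contains(., 'gon.data')]") %>%
  html_text()

# Evaluate that script in a V8 JavaScript context
ct <- V8::new_context()
ct$eval("var window = {}, gon = {};") # need to initialize these variables first
ct$eval(js)

# Copy the populated gon object back into R
data <- ct$get("gon")

# Extract the tables of interest
cities <- data$regions
retailbrands <- data$brands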

Result:

> head(cities)
           region     change
1 Minneapolis, MN -0.7164120
2      Boston, MA -0.6337319
3  Washington, DC -0.6191386
4     Detroit, MI -0.5693641
5     Chicago, IL -0.5101856
6   Charlotte, NC -0.4810490

> head(retailbrands)
            brand     change
1      LA Fitness -0.6168534
2     Wells Fargo -0.5355715
3     Foot Locker -0.5211365
4     Ethan Allen -0.5096331
5     Clean Juice -0.5079978
6 Texas Roadhouse -0.4770344
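The same technique can be tried without network access. The following is a minimal sketch under stated assumptions: the inline page, the variable name demo, and the region values are all invented for illustration, and the script is assumed to assign an array of objects to gon.regions, which is the shape that ct$get() turns into a data frame.

```r
library(V8)
library(rvest)

# Hypothetical inline page; the names and numbers are made up for this example
page <- read_html('<html><body>
  <script>gon.data = 1; gon.regions = [{"region": "A", "change": -0.5},
                                        {"region": "B", "change": 0.25}];</script>
</body></html>')

# Same extraction step as in the answer: take the text of the matching script
js <- page %>%
  html_node(xpath = "//script[contains(., 'gon.data')]") %>%
  html_text()

# Run the script in a fresh JavaScript context
ct <- v8()
ct$eval("var window = {}, gon = {};")  # the script assumes gon already exists
ct$eval(js)

# A JavaScript array of objects comes back to R as a data frame
demo <- ct$get("gon")
print(demo$regions)
```

Declaring gon before evaluating the script matters here: without it the assignment gon.data = 1 would fail because gon is undefined in the new context.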
