Webscraping soccer data returns nothing

Views: 61  Published: 2021/7/14 18:42:10  Tags: r rvest


Problem description

I would like to scrape the match result table from the website https://www.whoscored.com/Regions/247/Tournaments/36/Seasons/5967/Stages/15737/Fixtures/International-FIFA-World-Cup-2018

I'm using the rvest package with the following code:

library(rvest)

url.tournament <- "https://www.whoscored.com/Regions/247/Tournaments/36/Seasons/5967/Stages/15737/Fixtures/International-FIFA-World-Cup-2018"
df.tournament <- read_html(url.tournament) %>%
                  html_nodes(xpath='//*[@id="tournament-fixture-wrapper"]') %>%
                  html_nodes("table") %>%
                  html_table()

However, no element is extracted.

Solution

Looking at the website’s source code you can see that the table doesn’t actually exist in the HTML source — it’s dynamically generated using JavaScript. That’s why your XPath query returns an empty <div>.
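To see the effect without hitting the site, here is a minimal sketch that uses a made-up HTML string in place of the server response. The content of `static_html` is an assumption for illustration; only the `id` is taken from the question.

```r
library(rvest)

# Hypothetical stand-in for what the server sends: the wrapper <div> is
# present, but the table is only added later by JavaScript in the browser.
static_html <- '<html><body><div id="tournament-fixture-wrapper"></div></body></html>'

wrapper <- read_html(static_html) %>%
  html_nodes(xpath = '//*[@id="tournament-fixture-wrapper"]')

# The <div> itself is matched by the XPath query ...
n_wrappers <- length(wrapper)

# ... but there is no <table> inside it for html_table() to convert.
tables <- html_nodes(wrapper, "table")
n_tables <- length(tables)
```

Calling html_table() on that empty node set gives an empty list, which matches the "no element is extracted" symptom.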

You consequently can’t rely on {rvest} in this case; you need to use a dynamic scraper such as {RSelenium}, which can interpret JavaScript.
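A rough sketch of how that could look, assuming a local Firefox installation that {RSelenium} can drive. The function names `extract_fixture_tables` and `fetch_rendered_source`, the port 4545 and the fixed 5-second wait are illustrative choices, not part of the original answer, and the site may need a longer wait or extra steps in practice.

```r
library(rvest)

# Parse the fixture tables out of an HTML string that already contains
# the JavaScript-generated content (same selectors as in the question).
extract_fixture_tables <- function(page_source) {
  read_html(page_source) %>%
    html_nodes(xpath = '//*[@id="tournament-fixture-wrapper"]') %>%
    html_nodes("table") %>%
    html_table()
}

# Drive a real browser so the page's JavaScript runs, then return the
# rendered HTML as a string.
# Assumptions: Firefox, port 4545 and a fixed wait are placeholders.
fetch_rendered_source <- function(url, wait_seconds = 5) {
  driver <- RSelenium::rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
  # Always shut down the browser and the Selenium server, even on error.
  on.exit({
    driver$client$close()
    driver$server$stop()
  })
  driver$client$navigate(url)
  Sys.sleep(wait_seconds)  # crude wait for the table to be generated
  driver$client$getPageSource()[[1]]
}

# Usage (needs a browser and network access, so it is left commented out):
# df.tournament <- extract_fixture_tables(fetch_rendered_source(url.tournament))
```

Keeping the parsing step separate from the browser step means the {rvest} part of the question's code can be reused unchanged once the rendered page source is available.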
