Get a URL table into a `data.frame` R-XML-RCurl

Problem description

I'm trying to read the table at a URL into a data.frame. In other examples I found that the following code worked:

library(XML)
library(RCurl)
theurl <- "https://es.finance.yahoo.com/q/cp?s=BEL20.BR"
tables <- readHTMLTable(theurl)

Here, however, it fails. As the warning says, the content doesn't seem to be XML:

Warning message: XML content does not seem to be XML: 'https://es.finance.yahoo.com/q/cp?s=BEL20.BR'

Alternatively, getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R") works, but I don't know how to extract the table from its result. Any help would be appreciated.

Thanks to @har07: using table <- readHTMLTable(getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R"))$yfncsumtab gives the output, but it still has to be filtered.
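As a rough illustration of picking one table by name and then filtering it, here is a minimal sketch. It runs offline on a small inline HTML string that stands in for the text returned by getURLContent; the markup, the `summary` table, the column names, and the filter rule (drop rows without a numeric price) are all assumptions made for the example, since the question does not say what the real table contains or how it should be filtered.

```r
library(XML)

# Hypothetical stand-in for the page text returned by getURLContent();
# the real page's markup is not shown in the question.
html <- '<html><body>
<table id="summary">
  <tr><th>Index</th><th>Members</th></tr>
  <tr><td>BEL20</td><td>20</td></tr>
  <tr><td>Other</td><td>5</td></tr>
</table>
<table id="yfncsumtab">
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>ABI.BR</td><td>101.5</td></tr>
  <tr><td>KBC.BR</td><td>n/a</td></tr>
  <tr><td>UCB.BR</td><td>72.3</td></tr>
</table>
</body></html>'

# Parse the text explicitly as HTML, then read every table in it.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Tables that carry an id attribute can be selected by that name.
components <- tables[["yfncsumtab"]]

# One possible filter (an assumption): convert Price to numeric and
# drop the rows where no numeric price is available.
components$Price <- suppressWarnings(as.numeric(components$Price))
components <- components[!is.na(components$Price), ]
```

Calling names(tables) first is a quick way to see which tables were found before choosing one.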

Recommended answer

You can get the table if you first fetch the document content with getURL and then pass that content to readHTMLTable. readHTMLTable sometimes has trouble fetching the content itself (the URL here is served over HTTPS, which may be the cause). In those cases, try getURL:

> library(XML)
> library(RCurl)
> URL <- getURL("https://es.finance.yahoo.com/q/cp?s=BEL20.BR")
> rt <- readHTMLTable(URL, header = TRUE)
> rt

You might need to adjust the header argument, and possibly others, but the tables are there.
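To show what adjusting the header argument does, here is a small sketch on an inline table rather than the live page. The `quotes` table and its columns are hypothetical; `header`, `which`, and `stringsAsFactors` are the arguments being illustrated, and the right values for the real page would have to be found by inspecting its output.

```r
library(XML)

# Hypothetical one-table page, used so the example runs offline.
html <- '<html><body>
<table id="quotes">
  <tr><th>Symbol</th><th>Last</th></tr>
  <tr><td>ABI.BR</td><td>101.5</td></tr>
  <tr><td>UCB.BR</td><td>72.3</td></tr>
</table>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# header = TRUE: the first row supplies the column names.
# which = 1 returns the first table directly as a data.frame
# instead of a list of tables.
with_header <- readHTMLTable(doc, header = TRUE, which = 1,
                             stringsAsFactors = FALSE)

# header = FALSE: the first row is not used for column names.
no_header <- readHTMLTable(doc, header = FALSE, which = 1,
                           stringsAsFactors = FALSE)

# Cells are read as text, so numeric columns need an explicit conversion.
with_header$Last <- as.numeric(with_header$Last)
```

Comparing str(with_header) and str(no_header) shows which setting fits a given table.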
