Web scraping of key stats in Yahoo! Finance with R
Question
Is anyone experienced in scraping data from the Yahoo! Finance key statistics page with R? I am familiar with scraping data directly from HTML using read_html(), html_nodes(), and html_text() from the rvest package. However, this web page (MSFT key stats) is a bit more complicated, and I am not sure whether all the stats are kept in XHR, JS, or Doc. I am guessing the data is stored in JSON. If anyone knows a good way to extract and parse the data for this web page with R, kindly answer my question. Many thanks in advance!
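For reference, here is a sketch of what that JSON route might look like, using jsonlite against Yahoo's unofficial quoteSummary endpoint. The URL below is an assumption based on how the page loads its data; it is undocumented, has changed over time, and recent versions may require a session cookie and "crumb" parameter, so treat it as fragile rather than as the page's confirmed API.

library(jsonlite)

symbol <- "MSFT"
# Unofficial endpoint (assumed); the modules parameter selects which stat blocks to return
url <- paste0(
  "https://query1.finance.yahoo.com/v10/finance/quoteSummary/", symbol,
  "?modules=defaultKeyStatistics,financialData"
)
res <- fromJSON(url)   # fetches the URL and parses the JSON into nested lists/data frames
stats <- res$quoteSummary$result$defaultKeyStatistics
str(stats)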
Or, if there is a more convenient way to extract these metrics via quantmod or Quandl, kindly let me know; that would be an extremely good solution!
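On the quantmod side, a minimal sketch: getQuote() can request selected Yahoo quote fields through yahooQF(). Only a subset of the key-statistics page is exposed this way, and which field names actually return data depends on Yahoo's backend, so the four fields below are illustrative assumptions.

library(quantmod)

# yahooQF() builds the field list; call it with no arguments to choose interactively
fields <- yahooQF(c("Market Capitalization", "P/E Ratio",
                    "Earnings/Share", "Dividend Yield"))
getQuote("MSFT", what = fields)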
Answer
I gave up on Excel a long time ago. R is definitely the way to go for things like this.
library(XML)
stocks <- c("AXP","BA","CAT","CSCO")
for (s in stocks) {
url <- paste0("http://finviz.com/quote.ashx?t=", s)
webpage <- readLines(url)
html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
tableNodes <- getNodeSet(html, "//table")
# ASSIGN TO STOCK NAMED DFS
assign(s, readHTMLTable(tableNodes[[9]],
header= c("data1", "data2", "data3", "data4", "data5", "data6",
"data7", "data8", "data9", "data10", "data11", "data12")))
# ADD COLUMN TO IDENTIFY STOCK
df <- get(s)
df['stock'] <- s
assign(s, df)
}
# COMBINE ALL STOCK DATA
stockdatalist <- cbind(mget(stocks))
stockdata <- do.call(rbind, stockdatalist)
# MOVE STOCK ID TO FIRST COLUMN
stockdata <- stockdata[, c(ncol(stockdata), 1:(ncol(stockdata) - 1))]
# SAVE TO CSV
write.table(stockdata, "C:/Users/your_path_here/Desktop/MyData.csv", sep=",",
row.names=FALSE, col.names=FALSE)
# REMOVE TEMP OBJECTS
rm(df, stockdatalist)
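Since the question mentions rvest, here is the same finviz scrape sketched with rvest instead of XML. It assumes the same (fragile) table index as above, and note that the site may reject requests that lack a browser-like user agent.

library(rvest)

s <- "AXP"
page <- read_html(paste0("http://finviz.com/quote.ashx?t=", s))
tables <- html_nodes(page, "table")
snapshot <- html_table(tables[[9]])   # same table position as the XML version; may shift
snapshot$stock <- s                   # tag rows with the ticker
head(snapshot)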