Read data from a PHP website with R

Question

I would like to import data into R from a table like this:

http://www.rout.gr/index.php?name=Rout&file=results&year=2011

I tried using the XML library, as suggested in the thread below, but I couldn't get anything.

Scraping html tables into R data frames using the XML package

Recommended answer

There do seem to be some funky things going on with that site. It seems to return no data unless you fake the user-agent. Even then, readHTMLTable doesn't behave too well: it returns an error if you pass it the whole doc. Reading the page source shows that the relevant table has id table_results_r_1; isolating that node and passing it to readHTMLTable works:

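library(XML)
library(httr)

theurl <- "http://www.rout.gr/index.php?name=Rout&file=results&year=2011"
# The site seems to return no data unless a browser-like user agent is sent
resp <- GET(theurl, user_agent("Mozilla"))
# Parse the response body as text rather than passing the response object itself
doc <- htmlParse(content(resp, as = "text"), asText = TRUE)
# Select only the results table by its id, then convert that single node
results <- getNodeSet(doc, "//table[@id='table_results_r_1']")
results <- readHTMLTable(results[[1]])
rm(doc)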
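
If the site is unreachable, the "isolate the table node first" step can still be tried offline. The following is a minimal sketch, assuming a made-up HTML snippet: only the id table_results_r_1 comes from the page described above, while the second table, the column headers, and the row values are hypothetical placeholders.

```r
library(XML)

# Hypothetical page: a layout table plus the results table we care about.
# Only the id "table_results_r_1" is taken from the real site.
html <- '
<html><body>
  <table id="layout"><tr><td>menu</td></tr></table>
  <table id="table_results_r_1">
    <thead><tr><th>Pos</th><th>Name</th><th>Time</th></tr></thead>
    <tbody>
      <tr><td>1</td><td>Runner A</td><td>1:02:03</td></tr>
      <tr><td>2</td><td>Runner B</td><td>1:05:10</td></tr>
    </tbody>
  </table>
</body></html>'

# Parse the string as HTML instead of fetching a URL
doc <- htmlParse(html, asText = TRUE)

# Pick out the one table by id instead of handing the whole document over
nodes <- getNodeSet(doc, "//table[@id='table_results_r_1']")

# Convert just that node to a data frame
results <- readHTMLTable(nodes[[1]], stringsAsFactors = FALSE)
print(results)
```

Selecting by id keeps readHTMLTable away from the other tables on the page, which is where the whole-document call ran into trouble.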

Now you'll need to tidy up the table column names.
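
One possible way to tidy the names, sketched in base R. The data frame below is a hypothetical stand-in, since the real column headers are not shown here, and tidy_names is an illustrative helper rather than part of any package. It also assumes ASCII headers: the pattern below would strip Greek or other non-ASCII letters, so adapt it if the scraped headers are not in English.

```r
# Hypothetical stand-in for the scraped table; the real headers will differ
results <- data.frame(
  " Pos. " = c("1", "2"),
  "Athlete Name" = c("Runner A", "Runner B"),
  "Time (h:m:s)" = c("1:02:03", "1:05:10"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Illustrative helper: turn messy headers into lowercase snake_case names
tidy_names <- function(x) {
  x <- trimws(x)                    # drop leading/trailing whitespace
  x <- tolower(x)                   # lowercase everything
  x <- gsub("[^a-z0-9]+", "_", x)   # any run of other characters becomes "_"
  x <- gsub("^_+|_+$", "", x)       # strip underscores left at either end
  make.unique(x, sep = "_")         # disambiguate any duplicate names
}

names(results) <- tidy_names(names(results))
print(names(results))
```

Here the three headers become pos, athlete_name and time_h_m_s.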
