R download from aspx in https getting website instead of CSV

Problem description

Warning: newbie here. I would appreciate some guidance. I am investing the time to learn how to use R for automating downloads.

What I need: To download data on shale gas wells from this website for all counties and reporting periods: https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCounty.aspx (Note that you may be asked to accept an agreement when entering; not a big deal.)

I can get to the page that lists all the CSV files I want to download. Unfortunately, that page has the same address as above. (You can choose a county and a reporting period and see for yourself.)

However, once on that page, the links that trigger the CSV downloads are listed. Each of them looks something like this: https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY
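Since the export link differs only in its PERIOD and COUNTY query parameters, the full list of links could be generated rather than copied by hand. The following is a minimal sketch under that assumption; `build_export_url` is a hypothetical helper, and the only period and county codes shown are the ones from the link above (any others would have to be read off the site).

```r
# Base address of the export page, taken from the link above.
base_url <- "https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx"

# Hypothetical helper: builds one export URL from a period code and a county name.
build_export_url <- function(period, county) {
  paste0(
    base_url,
    "?UNCONVENTIONAL_ONLY=false",
    "&INC_HOME_USE_WELLS=true",
    "&INC_NON_PRODUCING_WELLS=true",
    "&PERIOD=", utils::URLencode(period, reserved = TRUE),
    "&COUNTY=", utils::URLencode(county, reserved = TRUE)
  )
}

# Only this pair appears in the post; extend both vectors with real codes.
periods  <- c("15AUGU")
counties <- c("ALLEGHENY")

# One row per (period, county) combination, then one URL per row.
grid <- expand.grid(period = periods, county = counties, stringsAsFactors = FALSE)
urls <- mapply(build_export_url, grid$period, grid$county, USE.NAMES = FALSE)
```

Generating the URLs does not by itself make them downloadable; each one still has to be requested from a session the site accepts.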

What I have tried:

library(downloader)

download ("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY",
          destfile="Prod_AUG15_Allegheny.csv")

I followed what another person did here: Download documents from aspx web page in R (http://stackoverflow.com/questions/32132344/download-documents-from-aspx-web-page-in-r)

The problem: This command saves the web page instead of the CSV file.

trying URL 'https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY'
Content type 'text/html; charset=utf-8' length 11592 bytes (11 Kb)
opened URL
downloaded 11 Kb
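The `Content type 'text/html'` line in this output is the sign that the server returned a page, not a CSV. When downloading many files, a quick check on each saved file can catch this early. This is only a sketch: `looks_like_html` is a hypothetical helper, and inspecting the first 20 lines is an arbitrary choice.

```r
# Hypothetical helper: TRUE when the start of a file looks like an HTML page
# rather than CSV data.
looks_like_html <- function(path, n = 20) {
  head_lines <- readLines(path, n = n, warn = FALSE)
  any(grepl("<!DOCTYPE html|<html", head_lines, ignore.case = TRUE))
}

# Possible use after a download (file name taken from the attempt above):
# if (looks_like_html("Prod_AUG15_Allegheny.csv")) {
#   warning("The server returned a web page, not a CSV file")
# }
```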

The question: Is it related to my page being https instead of http? Any guidance on how to solve it, or pointers to other relevant posts? (I could find some posts on aspx downloads, but nothing helpful.)

Thanks in advance

Recommended answer

@hrbrmstr It worked! Not the way I wanted at the beginning, but with RSelenium I could click the button to accept the agreement and actually open the download link.

Here is the code (it is simple, but it took me all day to figure out, what a shame):

# Using RSelenium to save the file
## Install the package if needed
install.packages("RSelenium")
## Load the package
library("RSelenium")
## checkForServer() and startServer() come from older RSelenium releases
checkForServer()  # downloads the Selenium server binary if it is missing
startServer()     # I had to start the server manually!
## Connect to the running Selenium server and open a browser session
remDr <- remoteDriver()
remDr
remDr$open()
# Open the website and accept the conditions
remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Welcome/Agreement.aspx")
AgreeButton <- remDr$findElement(using = "id", value = "MainContent_AgreeButton")
AgreeButton$highlightElement()
AgreeButton$clickElement()

# With the agreement accepted, open the CSV export link in the same session
remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY")

However, I am not able to save the CSV file :-(. I know I need a command for "Save link as...", but I am asking about that in another topic related to RSelenium.
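One possible way around the missing "Save link as..." step, untested against this site, is to copy the cookies from the Selenium session (after the agreement has been accepted) into an ordinary HTTP request and let httr write the response to disk. This is a sketch, not a confirmed fix: it assumes the site recognizes the accepted agreement through cookies alone, and `cookie_vector` and `save_with_session` are hypothetical helpers. The RSelenium call `getAllCookies()` and the httr functions used here do exist.

```r
# Hypothetical helper: turns the list returned by remDr$getAllCookies()
# into the named character vector that httr::set_cookies() accepts.
cookie_vector <- function(cookies) {
  values <- vapply(cookies, function(ck) as.character(ck$value), character(1))
  names(values) <- vapply(cookies, function(ck) as.character(ck$name), character(1))
  values
}

# Hypothetical helper: requests `url` with the browser session's cookies
# and writes the response body to `destfile`.
save_with_session <- function(remDr, url, destfile) {
  cookies <- cookie_vector(remDr$getAllCookies())
  resp <- httr::GET(
    url,
    httr::set_cookies(.cookies = cookies),
    httr::write_disk(destfile, overwrite = TRUE)
  )
  httr::stop_for_status(resp)
  # An HTML content type means the agreement page came back again.
  if (grepl("html", httr::http_type(resp), fixed = TRUE)) {
    warning("Server returned HTML, not CSV; the session was probably not accepted")
  }
  invisible(resp)
}

# Possible use, once AgreeButton$clickElement() has run in the session above:
# save_with_session(remDr, export_url, "Prod_AUG15_Allegheny.csv")
```

Here `export_url` stands for the export link opened with `remDr$navigate()` above. If the site ties the agreement to something other than cookies, this approach will not work and the browser would have to perform the download itself.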

I will edit the answer when I find out!
