Downloading a file after login using an HTTPS URL

Question

I am trying to download an Excel file that I have the link to, but I have to log in to the page before I can download it. I have successfully got past the login page with rvest, RCurl and httr, but I am having an extremely difficult time downloading the file after I have logged in.

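library(rvest)

# user and pass hold the login credentials (defined elsewhere)
url <- "https://website.com/console/login.do"
download_url <- "https://website.com/file.xls"

# Open a session on the login page and take its first form
session <- html_session(url)
form <- html_form(session)[[1]]

filled_form <- set_values(form,
                          userid = user,
                          password = pass)

## Save main page url
main_page <- submit_form(session, filled_form)

# This is the step that produces the corrupted file
download.file(download_url, "./file.xls", method = "curl")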

When I run the download.file command, a file appears in my working directory, but it is not the file I am trying to download. It is just a corrupted .XLS file with no data.

For reference, if I log in to the website via Chrome and paste the download link into the browser window after logging in, the file automatically starts downloading. If I do the same in IE, the file download dialog box pops up and asks me whether I want to save the file.

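Possibly relevant information:

  • This is on my work computer, where cookies are disabled, so I cannot use the cookies from my browser
  • I have tried various approaches with httr and RCurl based on numerous posts on SO, to no avail

Thanks in advance for your time!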

Recommended answer

Someone on /r/rstats actually found the answer to this question. The solution to my problem was as follows:

#after login and submit_form do this:
download <- jump_to(main_page, download_url)

# write file to current working directory
writeBin(download$response$content, basename(download_url))
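
The approach above presumably works because jump_to reuses the logged-in session, whereas download.file makes a fresh request that carries none of that session's state. Below is a minimal sketch of the same flow, assuming rvest 1.0.0 or later, where the session helpers were renamed (html_session to session, set_values to html_form_set, submit_form to session_submit, jump_to to session_jump_to). The helper names looks_like_xls and download_after_login are hypothetical, and the byte check assumes the server returns a legacy binary .xls file; drop the check if the file is really in another format.

```r
# Sketch only: assumes rvest >= 1.0.0 is installed.
# looks_like_xls() and download_after_login() are hypothetical helper names.

# Legacy .xls files are OLE2 compound documents and start with these 8 bytes.
xls_magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))

# TRUE if the raw vector starts with the OLE2 signature
looks_like_xls <- function(bytes) {
  is.raw(bytes) && length(bytes) >= 8 && identical(bytes[1:8], xls_magic)
}

download_after_login <- function(login_url, download_url, user, pass,
                                 dest = basename(download_url)) {
  # Log in through the first form on the login page
  session <- rvest::session(login_url)
  form <- rvest::html_form(session)[[1]]
  # userid and password are the form field names used in the question
  filled_form <- rvest::html_form_set(form, userid = user, password = pass)
  main_page <- rvest::session_submit(session, filled_form)

  # Request the file from within the logged-in session
  download <- rvest::session_jump_to(main_page, download_url)
  bytes <- download$response$content

  # A saved login page instead of a spreadsheet is a likely cause of a
  # "corrupted" file, so refuse to write anything that is not OLE2 data
  if (!looks_like_xls(bytes)) {
    stop("Response does not look like an .xls file; the login may have failed.")
  }
  writeBin(bytes, dest)
  invisible(dest)
}
```

With the variables from the question, a call would look like download_after_login(url, download_url, user, pass), which writes file.xls to the current working directory.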
