How do I use cookies with RCurl?

Question

I am trying to write an R package that accesses some data via a REST API. The API, however, doesn't use HTTP authentication; it relies on cookies to keep credentials with the session.

Essentially, I'd like to replace the following two lines from a bash script with two R functions: one to perform the login and store the session cookie, and a second to GET the data.

curl -X POST -c cookies.txt -d"username=xxx&password=yyy" http://api.my.url/login
curl         -b cookies.txt                               http://api.my.url/data

I'm clearly not understanding how RCurl works with curl options. My script as it stands has:

library(RCurl)
curl <- getCurlHandle()
curlSetOpt(cookiejar='cookies.txt', curl=curl)
postForm("http://api.my.url/login", username='xxx', password='yyy', curl=curl)
getURL("http://api.my.url/data", curl=curl)

The final getURL() fails with a "Not logged in." message from the server, and after the postForm() call no cookies.txt file exists.

Recommended answer

In general you don't need to create a cookie file, unless you want to study the cookies.

That said, in the real world web servers often depend on user-agent data, redirects and hidden POST data, but this should help:

library(RCurl)

#Set your browsing links 
loginurl = "http://api.my.url/login"
dataurl  = "http://api.my.url/data"

#Set user account data and agent
pars=list(
     username="xxx",
     password="yyy"
)
agent="Mozilla/5.0" #or whatever 

#Set RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookies.txt",  useragent = agent, followlocation = TRUE, curl=curl)
#Alternatively, if you do not need to read the cookies afterwards:
#curlSetOpt(  cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)

#Post login form
html=postForm(loginurl, .params = pars, curl=curl)

#Go wherever you want
html=getURL(dataurl, curl=curl)

#Start parsing your page
matchref=gregexpr("... my regexp ...", html)

#... .... ...

#Clean up. This will also write the cookie file
rm(curl)
gc()
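
Since the question asks for two R functions, the steps above could be wrapped roughly as follows. This is only a sketch: the names api_login() and api_get_data() are made up for illustration, the URLs are the placeholders from the question, and the choice of cookiefile = "" (which enables libcurl's in-memory cookie engine without any file) and style = "POST" (form-urlencoded, closer to what curl -d sends) are assumptions about what the server expects.

```r
# Sketch only: api_login() and api_get_data() are hypothetical names,
# and the URLs are the placeholders used in the question.
api_login <- function(username, password,
                      loginurl = "http://api.my.url/login") {
  # cookiefile = "" turns on libcurl's in-memory cookie engine,
  # so no cookies.txt is needed on disk
  handle <- RCurl::getCurlHandle(cookiefile = "", followlocation = TRUE)
  # style = "POST" sends application/x-www-form-urlencoded, like `curl -d`
  RCurl::postForm(loginurl,
                  .params = list(username = username, password = password),
                  style = "POST", curl = handle)
  # Return the handle: it now carries the session cookie
  handle
}

api_get_data <- function(handle, dataurl = "http://api.my.url/data") {
  # Reusing the same handle sends the cookie set during login
  RCurl::getURL(dataurl, curl = handle)
}

# Usage (needs network access to a real server):
# session <- api_login("xxx", "yyy")
# data    <- api_get_data(session)
```

The key point is that the same curl handle is passed to both calls; the cookie lives in the handle, not in a file.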

Important

There is often hidden POST data beyond the username and password. To capture it, you can use, for example, Chrome's Developer Tools (Ctrl+Shift+I) -> Network tab, which shows the POST field names and values.
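
If the hidden fields change on every visit (a per-session token, say), they can also be read from the login page itself and merged into the POST parameters. The sketch below works on a made-up HTML string; the field name csrf_token, the helper extract_hidden_fields() and the page markup are all hypothetical, and the regular expression assumes double-quoted attributes, so a proper HTML parser would be more robust on real pages.

```r
# Hypothetical login page HTML; a real page would be fetched with
# getURL(loginurl, curl = curl) and will look different.
login_html <- paste0(
  '<form action="/login" method="post">',
  '<input type="hidden" name="csrf_token" value="abc123">',
  '<input type="text" name="username">',
  '<input type="password" name="password">',
  '</form>'
)

extract_hidden_fields <- function(html) {
  # Find every <input ...> tag that declares type="hidden"
  tags <- regmatches(html, gregexpr('<input[^>]*type="hidden"[^>]*>', html))[[1]]
  # Pull a single attribute value out of one tag, or NA if it is absent
  get_attr <- function(tag, attr) {
    m <- regmatches(tag, regexec(paste0(attr, '="([^"]*)"'), tag))[[1]]
    if (length(m) == 2) m[2] else NA_character_
  }
  values <- vapply(tags, get_attr, character(1), attr = "value", USE.NAMES = FALSE)
  names(values) <- vapply(tags, get_attr, character(1), attr = "name", USE.NAMES = FALSE)
  as.list(values)
}

# Merge the hidden fields with the credentials before calling postForm()
hidden <- extract_hidden_fields(login_html)
pars <- c(list(username = "xxx", password = "yyy"), hidden)
```

The resulting pars list can be passed to postForm() through .params in the same way as in the answer's code.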
