How do I use cookies with RCurl?
Question
I am trying to write an R package that accesses some data via a REST API. The API, however, doesn't use HTTP authentication, but rather relies on cookies to keep credentials with the session.
Essentially, I'd like to replace the following two lines from a bash script with two R functions: one to perform the login and store the session cookie, and a second to GET the data.
curl -X POST -c cookies.txt -d"username=xxx&password=yyy" http://api.my.url/login
curl -b cookies.txt http://api.my.url/data
I'm clearly not understanding how RCurl works with curl options. My script as it stands has:
library(RCurl)
curl <- getCurlHandle()
curlSetOpt(cookiejar='cookies.txt', curl=curl)
postForm("http://api.my.url/login", username='xxx', password='yyy', curl=curl)
getURL('http://api.my.url/data', curl=curl)
The final getURL() fails with a "Not logged in." message from the server, and after the postForm() no cookies.txt file exists.
Recommended answer
In general you don't need to create a cookie file, unless you want to study the cookies.
That said, in the real world, web servers often also rely on user-agent data, redirects, and hidden POST data, but this should help:
library(RCurl)
#Set your browsing links
loginurl = "http://api.my.url/login"
dataurl = "http://api.my.url/data"
#Set user account data and agent
pars=list(
username="xxx",
password="yyy"
)
agent="Mozilla/5.0" #or whatever
#Set RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookies.txt", useragent = agent, followlocation = TRUE, curl=curl)
#Also if you do not need to read the cookies.
#curlSetOpt( cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)
#Post login form
html=postForm(loginurl, .params = pars, curl=curl)
#Go wherever you want
html=getURL(dataurl, curl=curl)
#Start parsing your page
matchref=gregexpr("... my regexp ...", html)
#... .... ...
#Clean up. Releasing the handle also writes the cookie file to disk
rm(curl)
gc()
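Since the question asks for two R functions, the script above could be wrapped roughly as follows. This is only a sketch: the function names `api_login` and `api_get_data` are made up for illustration, and it assumes the server accepts a URL-encoded form, as `curl -d` sends. Note that `postForm()` defaults to a multipart upload (`style = "HTTPPOST"`), so `style = "POST"` is set here to stay closer to the bash version.

```r
library(RCurl)

# Hypothetical helper: log in and return a handle that carries the session cookie.
api_login <- function(username, password,
                      loginurl = "http://api.my.url/login") {
  # cookiefile = "" turns on libcurl's cookie engine without reading a file,
  # so cookies live in memory for as long as the handle does
  curl <- getCurlHandle(cookiefile = "", followlocation = TRUE)
  postForm(loginurl,
           .params = list(username = username, password = password),
           style = "POST",        # URL-encoded body, like `curl -d`
           .checkParams = FALSE,  # skip RCurl's check for field names that look like curl options
           curl = curl)
  curl
}

# Hypothetical helper: fetch the data with the logged-in handle.
api_get_data <- function(curl, dataurl = "http://api.my.url/data") {
  getURL(dataurl, curl = curl)
}
```

Calling `h <- api_login("xxx", "yyy")` and then `api_get_data(h)` would mirror the two curl commands; the key point is that the same handle has to be passed to both requests.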
Important
There can often be hidden POST data beyond username and password. To capture it, you may want to use, e.g. in Chrome, Developer Tools (Ctrl+Shift+I) -> Network tab, which shows the POST field names and values.
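If the hidden fields change on every visit (a session token, for example), they could also be read programmatically from the login page. The sketch below uses the XML package for this; the helper name `hidden_fields` and the sample page are invented for illustration, and a real login form may well be structured differently.

```r
library(XML)

# Hypothetical helper: return the hidden <input> fields of a page as a named list.
hidden_fields <- function(html) {
  doc <- htmlParse(html, asText = TRUE)
  nodes <- getNodeSet(doc, "//input[@type='hidden']")
  vals <- lapply(nodes, xmlGetAttr, "value", "")
  names(vals) <- vapply(nodes, xmlGetAttr, character(1), "name", "")
  vals
}

# Made-up login page, for illustration only
page <- '<form><input type="hidden" name="token" value="abc123"/><input type="text" name="username"/></form>'

extra <- hidden_fields(page)
# Merge the hidden fields into the parameters that will be posted
pars <- c(list(username = "xxx", password = "yyy"), extra)
```

In practice the page would come from `getURL(loginurl, curl = curl)` on the same handle that is later used for `postForm()`, so that any cookie set alongside the token is kept.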