How do I use cookies with RCurl?


Question

I am trying to write an R package that accesses some data via a REST API. The API, however, doesn't use HTTP authentication, but rather relies on cookies to keep credentials with the session.

Essentially, I'd like to replace the following two lines from a bash script with two R functions: one to perform the login and store the session cookie, and a second to GET the data.

curl -X POST -c cookies.txt -d"username=xxx&password=yyy" http://api.my.url/login
curl         -b cookies.txt                               http://api.my.url/data



I'm clearly not understanding how RCurl works with curl options. My script as it stands has:

library(RCurl)
curl <- getCurlHandle()
curlSetOpt(cookiejar='cookies.txt', curl=curl)
postForm("http://api.my.url/login", username='xxx', password='yyy', curl=curl)
getURL("http://api.my.url/data", curl=curl)

The final getURL() fails with a "Not logged in." message from the server, and after the postForm() no cookies.txt file exists.

Recommended answer

In general you don't need to create a cookie file, unless you want to study the cookies.

That said, in the real world, web servers often also depend on user-agent data, redirects, and hidden POST data, but this should help:

library(RCurl)

#Set your browsing links 
loginurl = "http://api.my.url/login"
dataurl  = "http://api.my.url/data"

#Set user account data and agent
pars=list(
     username="xxx",
     password="yyy"
)
agent="Mozilla/5.0" #or whatever 

#Set RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookies.txt",  useragent = agent, followlocation = TRUE, curl=curl)
#Alternatively, if you do not need to read the cookies afterwards:
#curlSetOpt(  cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)

#Post login form
html=postForm(loginurl, .params = pars, curl=curl)

#Go wherever you want
html=getURL(dataurl, curl=curl)

#Start parsing your page
matchref=gregexpr("... my regexp ...", html)

#... .... ...

#Clean up. Releasing the handle also writes the cookie file
rm(curl)
gc()
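
Since the question asked for two R functions, the steps above could be wrapped roughly as follows. This is only a sketch: api_login() and api_get() are hypothetical names, not part of RCurl. It assumes that cookiefile = "" is enough to enable in-memory cookies on the handle, and that the server expects a URL-encoded form body like the one `curl -d` sends (hence style = "POST"). The URLs are the placeholders from the question.

```r
# Sketch only: api_login() and api_get() are hypothetical helper names.
# RCurl must be installed; calls are namespaced, so library() is not needed.

# Log in and return a curl handle that carries the session cookies.
api_login <- function(username, password,
                      loginurl = "http://api.my.url/login",
                      agent = "Mozilla/5.0") {
  # Assumption: cookiefile = "" switches on the in-memory cookie engine
  # without reading or writing a cookie file.
  curl <- RCurl::getCurlHandle(cookiefile = "",
                               useragent = agent,
                               followlocation = TRUE)
  # style = "POST" sends application/x-www-form-urlencoded, like `curl -d`.
  RCurl::postForm(loginurl,
                  .params = list(username = username, password = password),
                  style = "POST",
                  curl = curl)
  curl
}

# Fetch a URL, reusing the cookies stored on the handle.
api_get <- function(curl, dataurl = "http://api.my.url/data") {
  RCurl::getURL(dataurl, curl = curl)
}

# Usage (not run here, since it needs the real API):
# session <- api_login("xxx", "yyy")
# html    <- api_get(session)
```

Returning the handle from the login function keeps the cookies in memory for as long as the caller holds on to it, so no cookies.txt is needed between the two calls.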



Important

There can often be hidden POST data beyond the username and password. To capture it you may want to use, e.g. in Chrome, the Developer tools (Ctrl+Shift+I) -> Network tab, which shows the POST field names and values.
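
If the hidden fields change on every visit (a session token, for instance), copying them by hand from the browser will not be enough. As a rough illustration, they could be pulled out of the login page with a regexp before posting. The HTML, the csrf_token field name, and the extract_hidden_fields() helper below are all made up for the example. The pattern assumes the type, name and value attributes appear in that order with double quotes, so a real page may well need a proper HTML parser instead.

```r
# Hypothetical login page; in practice this would come from getURL(loginurl, curl=curl).
login_html <- '<form action="/login" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>'

# Return the hidden <input> fields of an HTML string as a named list.
# Fragile by design: expects type, name, value in that order, double-quoted.
extract_hidden_fields <- function(html) {
  pattern <- '<input[^>]*type="hidden"[^>]*name="([^"]*)"[^>]*value="([^"]*)"[^>]*>'
  tags <- regmatches(html, gregexpr(pattern, html, perl = TRUE))[[1]]
  field_names  <- sub(pattern, "\\1", tags, perl = TRUE)
  field_values <- sub(pattern, "\\2", tags, perl = TRUE)
  as.list(setNames(field_values, field_names))
}

# Merge the hidden fields into the parameters passed to postForm(.params = pars).
hidden <- extract_hidden_fields(login_html)
pars   <- c(list(username = "xxx", password = "yyy"), hidden)
```

The login page should be fetched with the same curl handle that later posts the form, so that any cookie set alongside the token is sent back with it.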
