Jsoup Cookies for HTTPS scraping


Question

I am experimenting with this site to learn Jsoup and Android: the goal is to scrape my username from the welcome page. I am using the following code:

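import org.jsoup.Connection;
import org.jsoup.Connection.Method;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Log in by POSTing the form fields, then read the auth cookie from the response.
Connection.Response res = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
    .data("ctl00$ContentPlaceHolder1$ctl00$Login1$UserName", "username", "ctl00$ContentPlaceHolder1$ctl00$Login1$Password", "password")
    .method(Method.POST)
    .execute();
String sessionId = res.cookie(".ASPXAUTH");

// Request the welcome page with the auth cookie attached.
Document doc2 = Jsoup.connect("http://www.mikeportnoy.com/forum/default.aspx")
    .cookie(".ASPXAUTH", sessionId)
    .get();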

My cookie (.ASPXAUTH) always ends up null. If I delete this cookie in a web browser, I lose my connection, so I am sure it is the correct cookie. In addition, if I change the code to hard-code the cookie value:

.cookie(".ASPXAUTH", "jkaldfjjfasldjf")  // using the correct value, of course

I am able to scrape my login name from this page. This also makes me think I have the correct cookie. So why does my cookie come back null? Are my username and password field names incorrect? Is it something else?

Thanks.

Recommended answer

I know I'm about 10 months late here, but a good option with Jsoup is this simple piece of code:

import java.util.Map;
import org.jsoup.Connection;
import org.jsoup.Connection.Method;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// This will get you the response.
Connection.Response res = Jsoup
    .connect("url")
    .data("loginField", "login@login.com", "passField", "pass1234")
    .method(Method.POST)
    .execute();

// This will get you the cookies.
Map<String, String> cookies = res.cookies();

// And this is the easiest way I've found to remain in session.
Document doc = Jsoup.connect("url").cookies(cookies).get();
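
To make the "remain in session" step a bit more concrete, here is a rough, JDK-only sketch of the idea behind collecting all response cookies and sending them back. It is not Jsoup's implementation: the class `CookieJarSketch`, its two helpers, and the sample cookie values are all hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Jsoup: illustrates the concept only.
class CookieJarSketch {

    // Extract name=value pairs from raw Set-Cookie header values,
    // ignoring attributes such as Path, HttpOnly, or Expires.
    static Map<String, String> parseSetCookie(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            String pair = header.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq > 0) {
                cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return cookies;
    }

    // Build the value of the Cookie request header for the next request.
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Made-up header values in the shape a login response might return.
        Map<String, String> cookies = parseSetCookie(Arrays.asList(
            "ASP.NET_SessionId=abc123; path=/; HttpOnly",
            ".ASPXAUTH=DEADBEEF; path=/; HttpOnly"));
        System.out.println(toCookieHeader(cookies));
    }
}
```

The point of passing the whole map along, rather than a single named cookie, is that a site may rely on more than one cookie (for example a session ID alongside the auth ticket), and a missing one can be enough to drop the session.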

Though I'm still having trouble connecting to some websites, I connect to a whole lot of them with the same basic piece of code. Oh, and before I forget: I figure my remaining problem is SSL certificates. You have to manage them properly, in a way I still haven't quite figured out.
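
On the SSL point, one possible direction (an assumption, not something verified against the sites in question) is to build an `SSLSocketFactory` from a trust store that contains the site's certificate chain and hand it to the HTTP client. The sketch below uses only JDK classes; `TrustStoreSketch` and its methods are hypothetical names. Passing `null` falls back to the JVM's default trust store, and a `KeyStore` loaded with the extra certificates would be the variation for sites the default store does not trust.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper, not part of Jsoup.
class TrustStoreSketch {

    // Build a socket factory that trusts the certificates in trustStore.
    // A null trustStore means "use the JVM's default trust store".
    static SSLSocketFactory socketFactoryFor(KeyStore trustStore) throws GeneralSecurityException {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, tmf.getTrustManagers(), null);
        return context.getSocketFactory();
    }

    // Convenience wrapper that avoids checked exceptions at the call site.
    static SSLSocketFactory defaultSocketFactory() {
        try {
            return socketFactoryFor(null);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not initialise TLS context", e);
        }
    }

    public static void main(String[] args) {
        SSLSocketFactory factory = defaultSocketFactory();
        System.out.println("Socket factory ready: " + (factory != null));
        // In Jsoup versions that provide Connection.sslSocketFactory(...), the
        // factory could then be attached with something like:
        //   Jsoup.connect("url").sslSocketFactory(factory).cookies(cookies).get();
    }
}
```

This keeps certificate validation switched on; it only changes which certificates are trusted, which is generally a safer choice than disabling validation altogether.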
