Jsoup Cookies for HTTPS scraping


Question

I am experimenting with this site to learn Jsoup and Android: I want to scrape my username from the welcome page. I am using the following code:

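// POST the login form with the ASP.NET field names from the page.
Connection.Response res = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
    .data("ctl00$ContentPlaceHolder1$ctl00$Login1$UserName", "username", "ctl00$ContentPlaceHolder1$ctl00$Login1$Password", "password")
    .method(Method.POST)
    .execute();
// Read the authentication cookie from the login response (this is what comes back null).
String sessionId = res.cookie(".ASPXAUTH");

// Request the welcome page, sending the authentication cookie along.
Document doc2 = Jsoup.connect("http://www.mikeportnoy.com/forum/default.aspx")
    .cookie(".ASPXAUTH", sessionId)
    .get();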

My cookie (.ASPXAUTH) always ends up null. If I delete this cookie in a web browser, I am logged out, so I am sure it is the correct cookie. In addition, if I change the code to

.cookie(".ASPXAUTH", "jkaldfjjfasldjf") // using the real cookie value, of course

I am able to scrape my login name from this page, which also makes me think I have the correct cookie. So why does my cookie come up null? Are my username and password field names incorrect? Or is it something else?

Thanks.

Recommended answer

I know I'm about 10 months late here, but a good option with Jsoup is this simple piece of code:

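import java.util.Map;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// POST the login form; "url" and the field names are placeholders.
Connection.Response res = Jsoup
    .connect("url")
    .data("loginField", "login@login.com", "passField", "pass1234")
    .method(Connection.Method.POST)
    .execute();

// Collect every cookie the server set in the response.
Map<String, String> cookies = res.cookies();

// This is the easiest way I've found to stay in the session: send them all back.
Document doc = Jsoup.connect("url").cookies(cookies).get();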

Though I'm still having trouble connecting to some websites, I connect to a whole lot of them with this same basic piece of code. Oh, and before I forget: as far as I can tell, my remaining problem is SSL certificates. They have to be managed properly, in a way I still haven't quite figured out.
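
For what it's worth, here is a rough sketch of why passing the whole cookie map tends to be safer than picking out one name. It uses only the JDK (java.net.HttpCookie), not Jsoup, and the class name CookieSketch, the helper names, and the header values are all made up for illustration. It gathers every cookie from some raw Set-Cookie header values, shows that asking for a cookie the server never sent simply yields null, and builds the Cookie header that would carry all of them on the next request.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class; not part of Jsoup.
class CookieSketch {

    // Collects name/value pairs from raw Set-Cookie header values.
    static Map<String, String> collect(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                cookies.put(cookie.getName(), cookie.getValue());
            }
        }
        return cookies;
    }

    // Builds the value of a Cookie request header from the collected pairs.
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        cookies.forEach((name, value) -> joiner.add(name + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Assumed example headers: a login response that set a session
        // cookie and a language cookie, but no .ASPXAUTH cookie.
        List<String> headers = Arrays.asList(
            "ASP.NET_SessionId=abc123; path=/; HttpOnly",
            "lang=en; path=/");

        Map<String, String> cookies = collect(headers);

        // A cookie the server did not send is simply absent from the map.
        System.out.println(cookies.get(".ASPXAUTH"));

        // Everything the server did send goes back in one header.
        System.out.println(toCookieHeader(cookies));
    }
}
```

Printing the whole map after the login request is a quick way to check whether the cookie you expect was set by that response at all.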
