How to connect via HTTPS using Jsoup?

Question

Jsoup works fine over HTTP, but when I try to use an HTTPS source, it throws the following exception:

10-12 13:22:11.169: WARN/System.err(332): javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:477)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:328)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.http.HttpConnection.setupSecureSocket(HttpConnection.java:185)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl$HttpsEngine.makeSslConnection(HttpsURLConnectionImpl.java:433)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl$HttpsEngine.makeConnection(HttpsURLConnectionImpl.java:378)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:205)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:152)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:377)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)

Here's the relevant code:

try {
    doc = Jsoup.connect("https url here").get();
} catch (IOException e) {
    Log.e("sys", "couldn't get the html");
    e.printStackTrace();
}

Solution

If you want to do it the right way, and/or you need to deal with only one site, then you basically need to grab the SSL certificate of the website in question and import it into a Java key store. This produces a JKS file, which you then set as the SSL trust store before using Jsoup (or java.net.URLConnection).

You can grab the certificate from your web browser's store. Let's assume that you're using Firefox.

  1. Go to the website in question using Firefox, which in your case is https://web2.uconn.edu/driver/old/timepoints.php?stopid=10
  2. At the left of the address bar you'll see "uconn.edu" in blue (this indicates a valid SSL certificate).
  3. Click on it for details and then click the More information button.
  4. In the security dialog which appears, click the View Certificate button.
  5. In the certificate panel which appears, go to the Details tab.
  6. Click the deepest item of the certificate hierarchy, which in this case is "web2.uconn.edu", and finally click the Export button.

You now have a web2.uconn.edu.crt file.

Next, open the command prompt and import it into a Java key store using the keytool command (it's part of the JRE):

keytool -import -v -file /path/to/web2.uconn.edu.crt -keystore /path/to/web2.uconn.edu.jks -storepass drowssap

-file must point to the location of the .crt file which you just exported. -keystore must point to the location of the generated .jks file (which you will then set as the SSL trust store). -storepass is required; you can choose any password you want, as long as it's at least 6 characters long.
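
If you'd rather do the import from code, the same step can be approximated with the standard java.security API. This is only a rough sketch, not what keytool does internally: the class name CertificateImporter, the method name, and the alias parameter are made up for illustration, and it always creates a fresh store of the platform's default type instead of adding to an existing one.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;

/** Hypothetical helper: writes a new key store file containing one trusted certificate. */
final class CertificateImporter {

    private CertificateImporter() {}

    static void importCertificate(Path certFile, Path keyStoreFile, String alias, char[] password)
            throws Exception {
        // Parse the exported certificate (DER or PEM encoded X.509).
        Certificate certificate;
        try (InputStream in = Files.newInputStream(certFile)) {
            certificate = CertificateFactory.getInstance("X.509").generateCertificate(in);
        }

        // Start from an empty in-memory store; load(null, ...) initializes it.
        // Assumption: the platform default store type is acceptable here.
        KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
        keyStore.load(null, password);

        // Register the certificate as a trusted entry under the given alias.
        keyStore.setCertificateEntry(alias, certificate);

        // Write the store to disk, protected by the password.
        // Any existing file at this path is overwritten.
        try (OutputStream out = Files.newOutputStream(keyStoreFile)) {
            keyStore.store(out, password);
        }
    }
}
```

A call such as CertificateImporter.importCertificate(Paths.get("/path/to/web2.uconn.edu.crt"), Paths.get("/path/to/web2.uconn.edu.jks"), "web2.uconn.edu", "drowssap".toCharArray()) would then stand in for the keytool command above.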

You now have a web2.uconn.edu.jks file. Finally, set it as the SSL trust store before connecting, as follows:

System.setProperty("javax.net.ssl.trustStore", "/path/to/web2.uconn.edu.jks");
Document document = Jsoup.connect("https://web2.uconn.edu/driver/old/timepoints.php?stopid=10").get();
// ...
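
The system property above applies to the whole JVM. If you'd rather keep the trust store scoped to specific connections, one possible variation is to build an SSLSocketFactory from the same file. The sketch below is an assumption-laden illustration, not part of the approach described above: the class and method names are invented, and it loads the file using the platform's default key store type.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

/** Hypothetical helper: builds a socket factory that trusts only one trust store file. */
final class TrustStoreSockets {

    private TrustStoreSockets() {}

    static SSLSocketFactory socketFactoryFor(Path trustStoreFile, char[] password) throws Exception {
        // Load the trust store from disk.
        KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(trustStoreFile)) {
            trustStore.load(in, password);
        }

        // Derive trust managers that accept only what the store contains.
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        // No client key material (first argument) and the default random source (third).
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, tmf.getTrustManagers(), null);
        return context.getSocketFactory();
    }
}
```

Newer jsoup releases have a Connection.sslSocketFactory(SSLSocketFactory) method that should accept the result, as in Jsoup.connect(url).sslSocketFactory(factory).get(); check whether the jsoup version you use provides it.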


As a completely different alternative, particularly when you need to deal with multiple sites (e.g. you're building a web crawler), you can also instruct Jsoup (basically, java.net.URLConnection) to blindly trust all SSL certificates. See also the section "Dealing with untrusted or misconfigured HTTPS sites" at the very bottom of this answer: Using java.net.URLConnection to fire and handle HTTP requests? (http://stackoverflow.com/questions/2793150/using-java-net-urlconnection-to-fire-and-handle-http-requests/2793153#2793153)
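
For orientation, a trust-everything setup along those lines might look roughly like the sketch below. The class and member names are hypothetical, and the code assumes Jsoup goes through HttpsURLConnection and so picks up its defaults. It disables certificate and host name validation for every HTTPS connection in the JVM, so it is only suitable where that risk is acceptable.

```java
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

/** Hypothetical helper: disables TLS certificate and host name checks. */
final class TrustAllCertificates {

    /** Trust manager that performs no validation at all. */
    static final X509TrustManager TRUST_ALL = new X509TrustManager() {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: every client certificate is accepted.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: every server certificate is accepted.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    };

    private TrustAllCertificates() {}

    /** Creates a TLS context backed only by the trust-all manager. */
    static SSLContext newContext() throws Exception {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { TRUST_ALL }, new SecureRandom());
        return context;
    }

    /** Installs the trust-all settings as JVM-wide defaults for HttpsURLConnection. */
    static void installAsDefault() throws Exception {
        HttpsURLConnection.setDefaultSSLSocketFactory(newContext().getSocketFactory());
        // Also accept any host name, since misconfigured sites often fail this check too.
        HttpsURLConnection.setDefaultHostnameVerifier((hostname, session) -> true);
    }
}
```

Calling TrustAllCertificates.installAsDefault() once, before the first Jsoup.connect(...), would be the intended usage under these assumptions.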
