JSoup .followRedirects(true) does not work

Question

jsoup does not follow the redirect (or at least does not get the entire page content). How can I solve that?

I presume there are no client-side redirects ...

    <meta http-equiv ...

stackoverflow http-equiv

in what I get back from this:

       Document doc1 = Jsoup.connect("http://e-uprava.gov.si/e-uprava/oglasnadeska.htm")
       .header("Accept-Encoding", "gzip, deflate")
       .userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
       .ignoreContentType(true)               
       .ignoreHttpErrors(true)
       .followRedirects(true)
       .timeout(600000)
       .maxBodySize(0)/*unlimited body size*/
       .get();


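import org.jsoup.Connection;
import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

String url = "http://e-uprava.gov.si/e-uprava/oglasnadeska.htm";
final Connection connection = Jsoup.connect(url).timeout(10000); // timeout in milliseconds
final Response response = connection.execute();                  // sends the request
final int status = response.statusCode();
System.out.println(status);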

status = 200

That is,

div class="subpage-container ...

is not filled with the content that I see in the browser. I checked for meta and JavaScript redirects, but found nothing usable.

Recommended answer

Explanation:

Redirects are not the problem; jsoup loads the page correctly.

The problem is that the page uses JavaScript to dynamically load the content you're looking for. jsoup is only an HTML parser, so you cannot expect it to execute JavaScript and fetch that data.
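To illustrate the point, here is a minimal sketch that needs jsoup on the classpath. The markup is a made-up stand-in for the real page, not its actual source: a container that is empty in the served HTML and a script that would fill it in a browser. The class and method names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class StaticHtmlDemo {
    // Hypothetical markup: the container is empty as served,
    // and the script would populate it only inside a real browser.
    static final String HTML =
          "<html><body>"
        + "<div class=\"subpage-container\"></div>"
        + "<script>"
        + "document.querySelector('.subpage-container').innerHTML = '<p>Loaded</p>';"
        + "</script>"
        + "</body></html>";

    // Returns the text jsoup sees inside the container, or null if it is missing.
    static String containerText(String html) {
        Document doc = Jsoup.parse(html);
        Element container = doc.selectFirst("div.subpage-container");
        return container == null ? null : container.text();
    }

    public static void main(String[] args) {
        // jsoup keeps the script as inert data and never runs it,
        // so the container stays empty.
        System.out.println("container text: '" + containerText(HTML) + "'");
    }
}
```

The container comes back empty because the script body is stored as data and never evaluated, which matches what the question describes.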

Solution:

If you open this page in a browser and look in the developer tools at all the requests the page makes, you'll certainly find the one that contains all the data you want.
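Once that request has been identified in the developer tools, it can be called directly. The sketch below is one way to do that, written against the JDK's built-in java.net.http client (Java 11+) rather than jsoup so that it runs without extra dependencies. DATA_URL is a placeholder and not the page's real endpoint, the class and method names are hypothetical, and the X-Requested-With header is an assumption: some endpoints check it, many do not.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class DataEndpointFetch {
    // Placeholder only: substitute the URL shown in the browser's Network tab.
    static final String DATA_URL = "http://example.com/data-endpoint";

    // Builds a GET request that resembles the one the page's script sends.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("User-Agent", "Mozilla/5.0")
                .header("X-Requested-With", "XMLHttpRequest") // assumption, see above
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body (HTML fragment, JSON, ...).
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String url = args.length > 0 ? args[0] : DATA_URL;
        System.out.println(fetch(url));
    }
}
```

If the endpoint returns an HTML fragment, the body can be handed to jsoup for parsing. If it returns JSON, a JSON library is the better fit.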

This solution is not ideal, since any change to the page can break it. It would be much better to use a browser emulator such as Selenium or HtmlUnit.
