How to resolve relative URLs with Jsoup?
Question
Hi, I have a problem with Jsoup.
I scrape a page and get a lot of URLs. Some of them are relative URLs, such as "../index.php", "../admin", and "../details.php".
I use attr("abs:href") to get the absolute URL, but these links are rendered like www.domain.com/../admin.php
I would like to know if this is a bug.
Is there a way to get the real absolute path with Jsoup? How can I solve this?
I have also tried absUrl("href"), but it did not work either.
Recommended answer
Another good option is to use the abs:href or abs:src attribute keys:
String relHref = link.attr("href"); // == "/"
String absHref = link.attr("abs:href"); // "http://jsoup.org/"
This is also described here: http://jsoup.org/cookbook/extracting-data/working-with-urls
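If the resolved link still contains a leading "/../" (as in www.domain.com/../admin.php), the relative href climbs above the site root, and the extra segments may need to be stripped after resolution. The following is a minimal sketch of that clean-up using only java.net.URI. It is not Jsoup's own implementation. The class name RelativeUrlResolver and its resolve helper are hypothetical, and the sketch assumes the base is an absolute http(s) URL that includes a path (at least a trailing slash).

```java
import java.net.URI;

// Hypothetical helper, not part of Jsoup: resolves an href against a base URL
// and drops any "/.." segments that would climb above the site root.
final class RelativeUrlResolver {

    static String resolve(String baseUrl, String relHref) {
        // URI.create throws IllegalArgumentException on malformed input (e.g. raw spaces).
        URI resolved = URI.create(baseUrl).resolve(relHref).normalize();
        String path = resolved.getRawPath();

        // Leave non-hierarchical results (mailto:, scheme-less bases, ...) untouched.
        if (resolved.isOpaque() || resolved.getScheme() == null
                || resolved.getRawAuthority() == null || path == null) {
            return resolved.toString();
        }

        // Remove leading "/.." or "/." segments that normalize() could not collapse.
        String cleaned = path.replaceFirst("^(/\\.\\.?)+(?=/|$)", "");
        if (cleaned.isEmpty()) {
            cleaned = "/";
        }

        // Rebuild from the raw (still percent-encoded) components.
        StringBuilder sb = new StringBuilder();
        sb.append(resolved.getScheme()).append("://")
          .append(resolved.getRawAuthority())
          .append(cleaned);
        if (resolved.getRawQuery() != null) {
            sb.append('?').append(resolved.getRawQuery());
        }
        if (resolved.getRawFragment() != null) {
            sb.append('#').append(resolved.getRawFragment());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "../admin" relative to a page one level deep
        System.out.println(resolve("http://www.domain.com/pages/list.php", "../admin"));
        // "../admin.php" relative to a page already at the root
        System.out.println(resolve("http://www.domain.com/index.php", "../admin.php"));
    }
}
```

With Jsoup itself, abs:href and absUrl("href") can only produce an absolute URL when the document has a base URI, for example one passed as the second argument to Jsoup.parse(html, baseUri). A helper like the one above could then be applied to the returned string if stray "/../" segments remain.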