In Java, is it possible to determine the size of a web page before downloading it?

Question

I want to determine the size of a web page so that, if it is larger than some limit (e.g. 5 MB), I can decide whether or not to download it. Can I get this information?

Recommended answer

You can get a decent approximation with:

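import java.net.HttpURLConnection;
import java.net.URL;
HttpURLConnection content = (HttpURLConnection) new URL("http://www.example.com").openConnection(); // the URL needs a protocol, or new URL(...) throws MalformedURLException
System.out.println(content.getContentLength()); // prints -1 when the server does not report a length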
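
To make the decision before any body is transferred, one option is a HEAD request, which asks the server for the headers only. The following is a minimal sketch, not part of the original answer: the class name `SizeCheck`, the helpers `reportedLength` and `isWithinLimit`, and the 5 MB limit (taken from the question's example) are all hypothetical, and it assumes the server honors HEAD and sends a `Content-Length` header. When it does not, `getContentLengthLong()` returns -1 and the sketch treats the size as unknown.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class SizeCheck {
    // Hypothetical threshold, taken from the question's 5 MB example.
    static final long MAX_BYTES = 5L * 1024 * 1024;

    // Sends a HEAD request and returns the reported Content-Length,
    // or -1 when the server does not report one.
    static long reportedLength(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("HEAD"); // headers only, no response body
        try {
            return conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }

    // True only when the length is known and does not exceed the limit.
    static boolean isWithinLimit(long reportedLength, long maxBytes) {
        return reportedLength >= 0 && reportedLength <= maxBytes;
    }

    public static void main(String[] args) throws IOException {
        long length = reportedLength("http://www.example.com");
        System.out.println(length + " bytes reported; within limit: "
                + isWithinLimit(length, MAX_BYTES));
    }
}
```

Treating an unknown length as "not within the limit" is a conservative choice for this sketch; a caller could just as well start the download and cap it while reading.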

However, this only tells you the length of the specific resource you request (e.g. the HTML document at the URL itself). To estimate the whole page, you also need to go through the HTML, find every resource it references (scripts from other sites, images, video, etc.) and total their sizes.
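
As a rough illustration of that totalling step, the sketch below pulls `src` references out of an HTML string, resolves them against the page address, and adds up a set of reported lengths. Everything named here is an assumption made for the example: the class `ResourceLister`, the methods `referencedUrls` and `sumKnown`, and the use of a regular expression in place of a real HTML parser. It only looks at `src` attributes, so stylesheets linked through `href` are not covered, and the per-resource lengths would still have to come from a request to each URL (for instance via `getContentLength()`).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResourceLister {
    // Rough match for src="..." attributes (scripts, images, video, frames).
    // Assumption: a regex is enough for a sketch; an HTML parser is more robust.
    private static final Pattern SRC =
            Pattern.compile("\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Resolves every src reference against the page address and keeps the http(s) ones.
    static List<String> referencedUrls(String html, String pageAddress) {
        URI base = URI.create(pageAddress);
        List<String> urls = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            try {
                URI resolved = base.resolve(m.group(1).trim());
                String scheme = resolved.getScheme();
                if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                    urls.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // Not a parseable reference; skip it.
            }
        }
        return urls;
    }

    // Adds up the known lengths; -1 marks a resource whose server reported no size.
    static long sumKnown(long[] reportedLengths) {
        long total = 0;
        for (long length : reportedLengths) {
            if (length >= 0) {
                total += length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String html = "<img src=\"img/logo.png\">"
                + "<script src='http://cdn.example.org/app.js'></script>";
        System.out.println(referencedUrls(html, "http://www.example.com/index.html"));
        System.out.println(sumKnown(new long[] {1200, -1, 300}));
    }
}
```

Because resources with an unknown size are skipped in the sum, the total is a lower bound rather than an exact figure.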

That gets you fairly close to a total size, but the count still won't be perfect, because (a) not every URL returns this information, and you have no control over that, and (b) depending on how the content is loaded (for example, through AJAX calls that hide the specifics), you can't know the complete list of resources ahead of time.

Alternatively, if a URL doesn't return a result, I think Giacomo was suggesting the use of a CounterInputStream. Not a bad idea. You could combine the suggestion above with the CounterInputStream to keep a count of the total received so far, and stop the transfer once it reaches a specified maximum total size. That way you essentially have a predicted size (say a site tells you it's going to be 3.3 MB), but if you find while downloading that it's actually 6 MB and still hasn't stopped, you can decide not to download any more than that.
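
The idea of counting bytes during the transfer and stopping at a maximum can be sketched as follows. This is not the CounterInputStream mentioned above, whose source is not shown here; `LimitedInputStream` is a hypothetical class written for this example on top of the JDK's `FilterInputStream`, and the demo reads from an in-memory array instead of a network connection so that it runs anywhere. Bytes passed over with `skip()` are not counted in this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: counts bytes read and fails once a maximum is exceeded.
class LimitedInputStream extends FilterInputStream {
    private final long maxBytes;
    private long count;

    LimitedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    // Number of bytes read through this wrapper so far.
    long getCount() {
        return count;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            add(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            add(n);
        }
        return n;
    }

    // Updates the running total and aborts the transfer when it passes the limit.
    private void add(long n) throws IOException {
        count += n;
        if (count > maxBytes) {
            throw new IOException("Transfer exceeded " + maxBytes + " bytes");
        }
    }

    public static void main(String[] args) {
        byte[] sixKb = new byte[6 * 1024]; // stands in for a response body
        try (LimitedInputStream in =
                     new LimitedInputStream(new ByteArrayInputStream(sixKb), 5 * 1024)) {
            byte[] buffer = new byte[1024];
            while (in.read(buffer) != -1) {
                // A real caller would write the buffer to its destination here.
            }
            System.out.println("Completed: " + in.getCount() + " bytes");
        } catch (IOException e) {
            System.out.println("Stopped: " + e.getMessage());
        }
    }
}
```

With a real download, the wrapped stream would be the one returned by the connection's `getInputStream()`, and the limit could be checked against the predicted size as well as the hard maximum.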
