InternetOpenUrl only returns after entire HTTP response is downloaded


Problem description

I am writing a file download utility using WinINET, and have noticed (especially on large downloads) that the WinINET InternetOpenUrl() call only returns after the entire HTTP response has been downloaded.

I confirmed this with the Charles proxy tool as well as Wireshark: the download completes entirely, and only then does WinINET notify my code.

Some simplified (synchronous) code:

// Open a WinINET session using the system's preconfigured proxy settings.
HINTERNET hInt = InternetOpen(USER_AGENT_NAME, INTERNET_OPEN_TYPE_PRECONFIG,
                              NULL, NULL, 0);
if (hInt)
{
  DWORD dwRequestFlags = INTERNET_FLAG_NO_UI   // no UI please
            |INTERNET_FLAG_NO_AUTH           // don't authenticate
            |INTERNET_FLAG_PRAGMA_NOCACHE    // ask the origin server even if a proxy has a cached copy
            |INTERNET_FLAG_NO_CACHE_WRITE;   // don't add this to the IE cache

  // The last argument is the DWORD_PTR context value; unused here.
  HINTERNET hUrl = InternetOpenUrl(hInt, szURL, NULL, 0, dwRequestFlags, 0);
  if (hUrl)
  {
    // <only gets here after entire download is complete>

    InternetCloseHandle(hUrl);
  }
  InternetCloseHandle(hInt);
}

The documentation suggests that this call sends the request and processes the headers of the response (it does not complete the download), and that you are then expected to call InternetReadFile() in a loop until it returns TRUE with dwNumberOfBytesRead set to 0.


From MSDN
InternetOpenUrl Function: The InternetOpenUrl function parses the URL string, establishes a connection to the server, and prepares to download the data identified by the URL. The application can then use InternetReadFile [...] to retrieve the URL data.

InternetReadFile Function: To ensure all data is retrieved, an application must continue to call the InternetReadFile function until the function returns TRUE and the lpdwNumberOfBytesRead parameter equals zero.
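
The read loop that the MSDN text describes can be sketched roughly as follows. This is a minimal illustration, not code from the original question: read_until_done, read_fn, chunk_fn, wininet_reader, mem_source, memory_reader, chunk_stats and count_chunk are hypothetical names introduced here, and only InternetReadFile and HINTERNET come from WinINET. The loop is written against a reader callback that follows the same contract as InternetReadFile (success plus zero bytes means end of data), so it can be tried without a network; the WinINET adapter is compiled only on Windows.

```c
#include <assert.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#endif

/* Hypothetical reader contract mirroring InternetReadFile: returns nonzero on
   success and stores the byte count in *bytes_read; success with
   *bytes_read == 0 signals the end of the data. */
typedef int (*read_fn)(void *ctx, void *buf, unsigned long cap,
                       unsigned long *bytes_read);

/* Hypothetical consumer, called once for every chunk that arrives. */
typedef void (*chunk_fn)(void *user, const void *data, unsigned long len);

/* Keeps reading until the reader reports success with zero bytes.
   Returns the total number of bytes delivered, or -1 if a read fails. */
long long read_until_done(read_fn reader, void *ctx,
                          chunk_fn on_chunk, void *user)
{
    unsigned char buf[4096];
    long long total = 0;

    assert(reader != NULL && on_chunk != NULL);
    for (;;) {
        unsigned long n = 0;
        if (!reader(ctx, buf, (unsigned long)sizeof buf, &n))
            return -1;          /* read failed */
        if (n == 0)
            return total;       /* success + zero bytes: all data retrieved */
        on_chunk(user, buf, n); /* work on the data while the rest downloads */
        total += n;
    }
}

#ifdef _WIN32
/* Adapter: ctx is assumed to be the HINTERNET returned by InternetOpenUrl. */
int wininet_reader(void *ctx, void *buf, unsigned long cap,
                   unsigned long *bytes_read)
{
    DWORD n = 0;
    BOOL ok = InternetReadFile((HINTERNET)ctx, buf, (DWORD)cap, &n);
    *bytes_read = n;
    return ok ? 1 : 0;
}
#endif

/* In-memory stand-in for a network source, handing out at most
   max_chunk bytes per call. Useful for exercising the loop offline. */
struct mem_source {
    const unsigned char *data;
    unsigned long len, pos, max_chunk;
};

int memory_reader(void *ctx, void *buf, unsigned long cap,
                  unsigned long *bytes_read)
{
    struct mem_source *s = (struct mem_source *)ctx;
    unsigned long n = s->len - s->pos;
    if (n > cap) n = cap;
    if (n > s->max_chunk) n = s->max_chunk;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    *bytes_read = n;
    return 1;
}

/* Example consumer that only counts what it was given. */
struct chunk_stats { unsigned long chunks, bytes; };

void count_chunk(void *user, const void *data, unsigned long len)
{
    struct chunk_stats *st = (struct chunk_stats *)user;
    (void)data;
    st->chunks += 1;
    st->bytes += len;
}
```

On Windows, passing wininet_reader together with the handle from InternetOpenUrl as ctx would hand each chunk to the consumer as soon as InternetReadFile returns it, assuming nothing on the machine buffers the whole response first.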

I've tried this using the asynchronous method too, and noticed the same thing. Specifically, INTERNET_STATUS_RESPONSE_RECEIVED is only sent to the registered callback after the download is complete, which means my client can only start accessing the data once the download has finished.

In a similar vein, I implemented a version that uses the WinHttp library too, and noticed exactly the same results.

This makes things tricky when it comes to timeouts. If the download exceeds the timeout (a default of 30 seconds, by the looks of it), InternetOpenUrl() fails.

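So I have two questions:

I understand providing this functionality, since you don't always want to allocate a 150MB block of memory, but the excuse given is that you don't know how much data is available... yet WinINET has already completed the download.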

And why make it look remarkably like a wrapped-up recv() call if it's just an abstraction over a temporary file, or a file in the IE cache (or worse, a wasted block of memory)?

And what should I be setting the timeout length to? If I never know how big the data is before it has timed out, then how do I decide what to set the timeout value to?

On a slow connection or with a large file, it is very conceivable that a lot of work can be done on the data before the entire download is completed. In a classic Berkeley socket re-implementation of HTTP, looping through the recv() call would provide me with the data as it comes down, which is ultimately what I need.

Yes, I could rewrite an implementation using plain sockets, but I would rather not spend time supporting the entire HTTP spec and SSL encryption, not to mention the proxy support that WinINET provides.

Recommended answer

I know it's probably not polite to answer your own question, but I believe I tracked down what the problem was.

After a reboot (and many, many, many minutes wasted on Automatic Updates) I tried again and experienced the same problem, but I took solace from Alex K.'s and J.J.'s comments suggesting this is not the expected behavior, and started investigating software running on the machine that might interfere.

After terminating many applications and turning off many services, I stumbled across one service that I really hoped wouldn't have this kind of effect. It did.

I turned off "Kaspersky Lab Network Agent", and hey presto, InternetOpenUrl returned about 2 seconds after the download of the HTTP response started. I would have preferred it to return immediately, but a second or two out of a 75-second download at least gives WinINET time to process headers and do whatever pre-processing it might need to.

It also turned out that if I don't read the data with InternetReadFile(), the download never completes (as seen via Charles), implying (hopefully) that InternetReadFile() is indeed a wrapper around the recv() call, as I would have expected.

Successive re-enabling and disabling of the Network Agent Service validated this finding. I would like to somehow conclusively prove (or disprove) this.

So it turns out that my (read: the IT Security Department's) choice of antivirus, with its intercept-all-network-layer-communications protection, appears to have been the cause of the problem.
