How do I gracefully interrupt urllib2 downloads?

Question

I am using urllib2's build_opener() to create an OpenerDirector. I use the OpenerDirector to fetch a slow page, so it has a large timeout.

So far, so good.

However, in another thread, I have been told to abort the download. Let's say the user has chosen to exit the program in the GUI.

Is there a way to signal that a urllib2 download should quit?

Recommended answer

There is no clear answer. There are several ugly ones.

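Initially, I put my rejected ideas in the question. Since it has become clear that there is no right answer, I have decided to post the various sub-optimal alternatives as a list answer. Some of these were inspired by comments, thank you.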

An ideal solution would be for OpenerDirector to offer a cancel operation.

It does not. Library writers take note: if you provide long, slow operations, you need to provide a way to cancel them if people are to use them in real-world applications.

As a general solution for others, reducing the timeout may work. With a smaller timeout, the download would be more responsive to changes in circumstances. However, it will also cause downloads to fail if they have not completely finished within the timeout, so this is a trade-off. In my situation, it is untenable.
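
For the timeout option, one possible shape is a loop that opens the URL with a short timeout and checks a cancel flag between attempts. This is only a sketch: `open_with_short_timeouts`, `cancel_event`, `timeout` and `max_attempts` are names made up for illustration, not part of urllib2. Every retry sends the request again, so a page that needs longer than `timeout` to produce its first byte will never arrive, which is the trade-off described above.

```python
import socket
import threading

try:
    from urllib2 import URLError          # Python 2
except ImportError:
    from urllib.error import URLError     # Python 3 (urllib2 became urllib.request)


def open_with_short_timeouts(opener, url, cancel_event, timeout=2.0, max_attempts=30):
    """Hypothetical helper: retry opener.open() with a short timeout.

    Returns the response object, or None if cancel_event was set or every
    attempt timed out.
    """
    for _ in range(max_attempts):
        if cancel_event.is_set():
            return None                   # another thread asked us to stop
        try:
            return opener.open(url, timeout=timeout)
        except socket.timeout:
            continue                      # timed out waiting for the response
        except URLError as err:
            # A timeout while connecting arrives wrapped in URLError.
            if isinstance(getattr(err, "reason", None), socket.timeout):
                continue
            raise
    return None
```

It would be called with something like `open_with_short_timeouts(urllib2.build_opener(), url, cancel_event)`, where `cancel_event` is a `threading.Event` that the GUI thread sets when the user quits.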

Again, as a general solution, reading in chunks may work. If the download consists of very large files, you can read them in small chunks and abort after a chunk has been read.

Unfortunately, if (as in my case) the delay is in receiving the first byte, rather than in the size of the file, this will not help.
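
A minimal sketch of the chunked read, assuming the response is a file-like object with `read()` and `close()` (as the object returned by `OpenerDirector.open()` is). The names `read_in_chunks`, `cancel_event` and `chunk_size` are invented for this example. As noted above, it does nothing about the wait for the first byte, because `open()` itself has already returned by the time this loop starts.

```python
def read_in_chunks(response, cancel_event, chunk_size=8192):
    """Read a file-like response chunk by chunk; return None if cancelled."""
    chunks = []
    while not cancel_event.is_set():
        # Returns once chunk_size bytes, or the end of the body, have arrived.
        chunk = response.read(chunk_size)
        if not chunk:
            return b"".join(chunks)      # end of body: hand back everything
        chunks.append(chunk)
    response.close()                     # cancelled: drop the connection
    return None
```

The cancel flag is only checked between chunks, so a smaller `chunk_size` gives a quicker reaction at the cost of more calls.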

While there are some aggressive techniques for killing threads, depending on the operating system, they are not recommended. In particular, they can cause deadlocks. See Eli Bendersky's two articles (via @JBernardo).

If the abort operation has been triggered by the user, it may be simplest to just be unresponsive and not act on the request until the open operation has completed.

Whether this unresponsiveness is acceptable to your users (hint: no!) is up to your project.

It also continues to place a demand on the server, even if the result is known to be unneeded.

If you create a separate thread to run the operation, and then communicate with that thread in an interruptible manner, you can discard the blocked thread and start working on the next operation instead. Eventually, the thread will unblock, and then it can shut down gracefully.

The thread should be a daemon thread, so that it doesn't block the complete shutdown of the application.

This gives the user a responsive program, but it means that the server has to keep servicing the request, even though the result is not needed.
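
The abandon-the-thread idea could look roughly like this. `AbandonableCall` and its attributes are hypothetical names, not anything urllib2 provides; the only library features relied on are `threading.Thread`, its `daemon` flag and `threading.Event`. The blocking call keeps running in the background after it has been abandoned, so the load on the server stays the same.

```python
import threading


class AbandonableCall(object):
    """Hypothetical helper: run a blocking callable in a daemon thread."""

    def __init__(self, func, *args):
        self.result = None
        self.error = None
        self.done = threading.Event()
        self._thread = threading.Thread(target=self._run, args=(func, args))
        self._thread.daemon = True       # never holds up interpreter shutdown
        self._thread.start()

    def _run(self, func, args):
        try:
            self.result = func(*args)
        except Exception as err:         # keep the error for the caller to inspect
            self.error = err
        finally:
            self.done.set()

    def wait(self, cancel_event, poll_interval=0.1):
        """Return True once the call has finished, False if it was abandoned."""
        while not self.done.wait(poll_interval):
            if cancel_event.is_set():
                return False             # give up; the worker finishes on its own
        return True
```

A caller might write `call = AbandonableCall(opener.open, url)` and then `if call.wait(cancel_event): use(call.result)`, where `use` stands for whatever the application does with the response.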

As described in @Luke's answer, it may be possible to provide (fragile? unportable?) extensions to the standard Python libraries.

His solution changes the socket operations from blocking to polling. Another might allow shutdown through the socket.shutdown() method (if that does indeed interrupt a blocked socket; not tested).

A solution based on Twisted may be cleaner. See below.

The Twisted framework provides a replacement set of libraries for network operations that are event-driven. I understand this to mean that all of the different communications can be handled by a single thread with no blocking.

It may be possible to navigate the OpenerDirector to find the low-level socket that is blocking, and sabotage it directly (would socket.shutdown() be sufficient?) to make it return.

Yuck.
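
Whether `socket.shutdown()` wakes a blocked read is platform-dependent, so rather than guess, here is a small experiment to run on the target system. It is a sketch under loud assumptions: it uses `socket.socketpair()` (not available everywhere) instead of the TCP socket that urllib2 would actually hold, the function name is invented, and it does not show how to dig the real socket out of an OpenerDirector. Treat the result as a hint, not a guarantee.

```python
import socket
import threading
import time


def shutdown_unblocks_recv(wait_seconds=2.0):
    """Return True if shutdown() woke a recv() blocked in another thread."""
    left, right = socket.socketpair()     # connected pair; nothing is ever sent
    woke_up = threading.Event()

    def blocked_reader():
        try:
            left.recv(1)                  # blocks: no data will arrive
        except (OSError, socket.error):
            pass                          # an error also counts as "unblocked"
        woke_up.set()

    reader = threading.Thread(target=blocked_reader)
    reader.daemon = True
    reader.start()
    time.sleep(0.2)                       # crude pause so the reader is inside recv()

    try:
        left.shutdown(socket.SHUT_RDWR)   # the "sabotage", done from this thread
    except (OSError, socket.error):
        pass
    result = woke_up.wait(wait_seconds)   # did the reader come back in time?

    right.close()                         # closing the peer ends the read in any case
    reader.join(wait_seconds)
    left.close()
    return result
```

Calling `shutdown_unblocks_recv()` reports what happened on the machine it ran on; a `True` there says nothing about other operating systems.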

The thread that reads the socket can be moved into a separate process, and interprocess communication can be used to transmit the result. This IPC can be aborted early by the client, and then the whole process can be killed.
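
One way to sketch the separate-process idea is with a child Python interpreter whose stdout pipe serves as the IPC channel. Assumptions, stated plainly: this uses Python 3 (`communicate(timeout=...)` does not exist in Python 2, and urllib2 is spelled `urllib.request` there), and `FETCH_SCRIPT`, `run_killable` and `poll_interval` are names invented for the example.

```python
import subprocess
import sys
import threading

# Child program (an assumption for this sketch): fetch the URL passed as
# argv[1] and write the raw body to stdout.
FETCH_SCRIPT = (
    "import sys, urllib.request\n"
    "body = urllib.request.build_opener().open(sys.argv[1], timeout=300).read()\n"
    "sys.stdout.buffer.write(body)\n"
)


def run_killable(argv, cancel_event, poll_interval=0.1):
    """Run argv as a child process; return its stdout, or None if cancelled."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    while True:
        if cancel_event.is_set():
            proc.kill()                   # abort the blocked download outright
            proc.communicate()            # reap the child and close the pipe
            return None
        try:
            out, _ = proc.communicate(timeout=poll_interval)
            return out                    # child finished; out holds its stdout
        except subprocess.TimeoutExpired:
            continue                      # still running; check the flag again
```

A download would be started with `run_killable([sys.executable, "-c", FETCH_SCRIPT, url], cancel_event)`. Killing the child closes its socket, so unlike the abandoned-thread approach the connection does not linger. The sketch ignores the child's exit status, which real code would want to check.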

If you have control over the web server being read, you could send it a separate message asking it to close the socket. That should cause the blocked client to react.
