Caveats of select/poll vs. epoll reactors in Twisted


Question

Everything I've read and experienced (Tornado-based apps) leads me to believe that epoll is a natural replacement for select- and poll-based networking, especially with Twisted. That makes me paranoid: it's pretty rare for a better technique or methodology not to come with a price.

Reading a couple dozen comparisons between epoll and the alternatives shows that epoll is clearly the champion for speed and scalability, specifically that it scales in a linear fashion, which is fantastic. That said, what about processor and memory utilization? Is epoll still the champ?

Recommended answer

For very small numbers of sockets (it varies with your hardware, of course, but we're talking about something on the order of 10 or fewer), select can beat epoll in memory usage and runtime speed. Of course, for such small numbers of sockets, both mechanisms are so fast that you don't really care about this difference in the vast majority of cases.

One clarification, though. Both select and epoll scale linearly. A big difference, however, is that the userspace-facing APIs have complexities that are based on different things. The cost of a select call goes roughly with the value of the highest-numbered file descriptor you pass it. If you select on a single fd, 100, then that's roughly twice as expensive as selecting on a single fd, 50. Adding more fds below the highest isn't quite free, so it's a little more complicated than this in practice, but this is a good first approximation for most implementations.

The cost of epoll is closer to the number of file descriptors that actually have events on them. If you're monitoring 200 file descriptors, but only 100 of them have events on them, then you're (very roughly) only paying for those 100 active file descriptors. This is where epoll tends to offer one of its major advantages over select. If you have a thousand clients that are mostly idle, then when you use select you're still paying for all one thousand of them. With epoll, however, it's as if you've only got a few: you're only paying for the ones that are active at any given time.
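To make the "only the active ones" point concrete, here is a minimal sketch using Python's standard `select.epoll` (Linux only). The function name `active_fds` and the counts are made up for illustration; it registers many idle sockets, makes a few readable, and shows that a single poll reports only those few.

```python
import select
import socket


def active_fds(total=50, active=3):
    """Watch `total` sockets with epoll, make `active` of them readable,
    and return (fds we made readable, fds epoll reported)."""
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()  # Linux only
    try:
        # Register the read end of every pair; most of them stay idle.
        for reader, _writer in pairs:
            ep.register(reader.fileno(), select.EPOLLIN)

        # Make only the first `active` sockets readable.
        expected = set()
        for reader, writer in pairs[:active]:
            writer.send(b"x")
            expected.add(reader.fileno())

        # poll() returns (fd, event mask) tuples for ready descriptors only.
        reported = {fd for fd, _mask in ep.poll(1)}
        return expected, reported
    finally:
        ep.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__" and hasattr(select, "epoll"):
    expected, reported = active_fds()
    print(len(reported), "of 50 watched sockets reported by epoll")
```

The idle sockets never show up in the result, which is the behaviour the paragraph above describes; this sketch does not measure the kernel-side cost itself.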

All this means that epoll will lead to less CPU usage for most workloads. As far as memory usage goes, it's a bit of a toss-up. select does manage to represent all the necessary information in a highly compact way (one bit per file descriptor). And the FD_SETSIZE (typically 1024) limitation on how many file descriptors you can use with select means that you'll never spend more than 128 bytes for each of the three fd sets you can use with select (read, write, exception). Compared to those 384 bytes max, epoll is sort of a pig. Each file descriptor is represented by a multi-byte structure. However, in absolute terms, it's still not going to use much memory. You can represent a huge number of file descriptors in a few dozen kilobytes (roughly 20 KB per 1000 file descriptors, I think). And you can also throw in the fact that you have to spend all 384 of those bytes with select if you only want to monitor one file descriptor but its value happens to be 1023 (the highest value that fits when FD_SETSIZE is 1024), whereas with epoll you'd only spend 20 bytes. Still, all these numbers are pretty small, so it doesn't make much difference.
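The arithmetic behind those figures can be written out as a small sketch. The 20 bytes per descriptor is the rough estimate quoted above, not a measured value, and the helper names `select_bytes` and `epoll_bytes` are invented for this example.

```python
FD_SETSIZE = 1024  # typical value; platform-dependent


def select_bytes(fd_setsize=FD_SETSIZE, sets=3):
    """Bytes used by select's fd sets: one bit per possible descriptor,
    for each of the read, write, and exception sets."""
    return sets * fd_setsize // 8


def epoll_bytes(watched, per_fd=20):
    """Rough epoll footprint, assuming about `per_fd` bytes per watched
    descriptor (the answer's estimate, not a measurement)."""
    return watched * per_fd


if __name__ == "__main__":
    print(select_bytes(sets=1))  # one fd set
    print(select_bytes())        # all three fd sets
    print(epoll_bytes(1))        # one watched descriptor
    print(epoll_bytes(1000))     # a thousand watched descriptors
```

Under these assumptions select's cost is fixed at 384 bytes regardless of how many descriptors are watched, while the epoll estimate grows with the number of descriptors (about 20,000 bytes for 1000 of them).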

There's also that other benefit of epoll, which perhaps you're already aware of: it is not limited to FD_SETSIZE file descriptors. You can use it to monitor as many file descriptors as you have. And if you only have one file descriptor, but its value is greater than FD_SETSIZE, epoll works with that too, but select does not.
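As a small illustration of the FD_SETSIZE limit from Python: as far as I know, CPython's `select.select` on Unix-like systems rejects a descriptor number at or above FD_SETSIZE with a `ValueError` before it ever calls into the kernel. The helper `select_accepts` below is hypothetical, and the value 100000 is just an arbitrary number assumed to be above FD_SETSIZE.

```python
import select
import socket


def select_accepts(fd):
    """Return True if select.select() is willing to take this fd number."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        # Raised for descriptor numbers that do not fit in an fd_set.
        return False
    except OSError:
        # The number fits in an fd_set, but the descriptor is not open.
        return True
    return True


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(select_accepts(a.fileno()))  # small, open descriptor
    print(select_accepts(100000))      # assumed to exceed FD_SETSIZE
    a.close()
    b.close()
```

An epoll object has no such numeric ceiling, so a program that may end up with high-numbered descriptors cannot rely on select.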

Randomly, I've also recently discovered one slight drawback to epoll as compared to select or poll. While none of these three APIs supports normal files (i.e., files on a file system), select and poll present this lack of support by reporting such descriptors as always readable and always writeable. This makes them unsuitable for any meaningful kind of non-blocking filesystem I/O, but a program that uses select or poll and happens to encounter a file descriptor from the filesystem will at least continue to operate (or if it fails, it won't be because of select or poll), albeit perhaps not with the best performance.

On the other hand, epoll will fail fast with an error (EPERM, apparently) when asked to monitor such a file descriptor. Strictly speaking, this is hardly incorrect. It's merely signalling its lack of support in an explicit way. Normally I would applaud explicit failure conditions, but this one is undocumented (as far as I can tell) and results in a completely broken application, rather than one which merely operates with potentially degraded performance.
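The difference can be reproduced with a short sketch against a temporary regular file, using only Python's standard library (the epoll part is Linux only). The function name `compare_on_regular_file` is invented for this example.

```python
import errno
import select
import tempfile


def compare_on_regular_file():
    """Return (select reported the file as ready, errno from epoll or None)."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()

        # select treats a regular file as always readable and writable.
        readable, writable, _ = select.select([fd], [fd], [], 0)
        select_ok = fd in readable and fd in writable

        # epoll refuses to watch it at registration time.
        ep = select.epoll()  # Linux only
        try:
            ep.register(fd, select.EPOLLIN)
            epoll_errno = None
        except OSError as exc:
            epoll_errno = exc.errno
        finally:
            ep.close()
    return select_ok, epoll_errno


if __name__ == "__main__" and hasattr(select, "epoll"):
    ok, err = compare_on_regular_file()
    print("select ready:", ok)
    print("epoll error:", errno.errorcode.get(err))
```

In Python the epoll failure surfaces as an `OSError` raised by `register()`, so an application that does not catch it stops right there.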

In practice, the only place I've seen this come up is when interacting with stdio. A user might redirect stdin or stdout from/to a normal file. Whereas previously stdin and stdout would have been a pipe (supported by epoll just fine), it then becomes a normal file and epoll fails loudly, breaking the application.
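One way an application could guard against the redirected-stdio case is to check the descriptor type before handing it to epoll. This is only a sketch of that idea, not something Twisted itself is claimed to do here; `is_regular_file` is a hypothetical helper built on the standard `os.fstat` and `stat.S_ISREG`.

```python
import os
import stat
import sys
import tempfile


def is_regular_file(fd):
    """True if fd refers to a normal file on a file system,
    which epoll would refuse to monitor."""
    return stat.S_ISREG(os.fstat(fd).st_mode)


if __name__ == "__main__":
    # A pipe is fine for epoll; a regular file is not.
    r, w = os.pipe()
    print("pipe is regular file:", is_regular_file(r))
    os.close(r)
    os.close(w)

    with tempfile.TemporaryFile() as f:
        print("temp file is regular file:", is_regular_file(f.fileno()))

    # Depends on how the script was started, e.g. `script.py < input.txt`.
    try:
        print("stdin is regular file:", is_regular_file(sys.stdin.fileno()))
    except (AttributeError, OSError, ValueError):
        print("stdin has no usable file descriptor")
```

With a check like this, a program could read a redirected file with ordinary blocking reads, or fall back to select or poll, instead of registering it with epoll.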
