What's the difference between epoll, poll, threadpool?


Question


Could someone explain what the difference is between epoll, poll and threadpool?

  • What are the pros / cons?
  • Any suggestions for frameworks?
  • Any suggestions for simple/basic tutorials?
  • It seems that epoll and poll are Linux-specific... Is there an equivalent alternative for Windows?

Answer


Threadpool does not really fit into the same category as poll and epoll, so I will assume you are referring to threadpool as in "threadpool to handle many connections with one thread per connection".

  • threadpool
    • Reasonably efficient for small and medium concurrency, can even outperform other techniques.
    • Makes use of multiple cores.
    • Does not scale well beyond "several hundreds", even though some systems (e.g. Linux) can in principle schedule 100,000s of threads just fine.
    • A naive implementation exhibits the "thundering herd" problem.
    • Apart from context switching and the thundering herd, one must consider memory. Each thread has a stack (typically at least a megabyte). A thousand threads therefore take a gigabyte of RAM just for stacks. Even if that memory is not committed, it still takes away considerable address space under a 32-bit OS (not really an issue under 64 bits).
    • Threads can actually use epoll, though the obvious way (all threads block on epoll_wait) is of no use, because epoll will wake up every thread waiting on it, so it will still have the same issues.
      • Optimal solution: a single thread listens on epoll, does the input multiplexing, and hands complete requests to a threadpool (see the sketch after this list).
      • futex is your friend here, in combination with e.g. a fast-forward queue per thread. Although badly documented and unwieldy, futex offers exactly what's needed. epoll may return several events at a time, and futex lets you efficiently, and in a precisely controlled manner, wake N blocked threads at a time (ideally N being min(num_cpu, num_events)), and in the best case it does not involve an extra syscall/context switch at all.
      • Not trivial to implement; takes some care.
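
The sketch below illustrates the dispatcher-plus-threadpool pattern from the two bullets above. It is only a sketch under assumptions: a std::mutex/std::condition_variable queue stands in for the futex-based per-thread queues, and handle() is a hypothetical request handler.

    #include <sys/epoll.h>
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Hypothetical worker-side handler: parse and answer one complete request.
    void handle(int fd) { /* ... */ }

    std::queue<int> ready_fds;                   // descriptors with work to do
    std::mutex m;
    std::condition_variable cv;

    void worker() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [] { return !ready_fds.empty(); });
            int fd = ready_fds.front(); ready_fds.pop();
            lock.unlock();
            handle(fd);                          // work happens outside the lock
        }
    }

    void dispatcher(int watched_fd) {            // e.g. an already-accepted client socket
        int ep = epoll_create1(0);
        epoll_event e{};
        e.events = EPOLLIN;
        e.data.fd = watched_fd;
        epoll_ctl(ep, EPOLL_CTL_ADD, watched_fd, &e);

        epoll_event evt[64];
        for (;;) {
            int n = epoll_wait(ep, evt, 64, -1); // only this thread blocks on epoll
            for (int i = 0; i < n; ++i) {
                std::lock_guard<std::mutex> lock(m);
                ready_fds.push(evt[i].data.fd);  // in a real server: assemble the request first
                cv.notify_one();                 // wake exactly one worker, no thundering herd
            }
        }
    }

    int main() {
        std::vector<std::thread> pool;
        for (unsigned i = 0; i < std::thread::hardware_concurrency(); ++i)
            pool.emplace_back(worker);
        dispatcher(0);                           // hypothetical descriptor; never returns
    }

A futex-based queue as suggested above would avoid the mutex and wake workers with fewer syscalls, but the overall structure stays the same.
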
  • fork() (one process per connection; see the sketch after this list)
    • Reasonably efficient for small and medium concurrency.
    • Does not scale well beyond a "few hundred".
    • Context switches are much more expensive (different address spaces!).
    • Scales significantly worse on older systems where fork is much more expensive (deep copy of all pages). Even on modern systems fork is not "free", although the overhead is mostly coalesced by the copy-on-write mechanism. On large datasets which are also modified, a considerable number of page faults following fork may negatively impact performance.
    • However, proven to work reliably for over 30 years.
    • Ridiculously easy to implement and rock solid: if any of the processes crashes, the world does not end. There is (almost) nothing you can do wrong.
    • Very prone to "thundering herd".
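
To make the "one process per connection" idea above concrete, here is a minimal sketch in the same error-check-free spirit as the mini-tutorials further down; port 8080 and serve_client() are arbitrary placeholders.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <csignal>

    // Hypothetical per-connection logic: read requests, write replies, then return.
    void serve_client(int fd) { /* ... */ close(fd); }

    int main() {
        signal(SIGCHLD, SIG_IGN);                 // let the kernel reap finished children
        int lst = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);              // arbitrary example port
        bind(lst, (sockaddr*)&addr, sizeof(addr));
        listen(lst, 128);

        for (;;) {
            int client = accept(lst, nullptr, nullptr);
            if (fork() == 0) {                    // child owns exactly this connection
                close(lst);
                serve_client(client);
                _exit(0);
            }
            close(client);                        // parent keeps only the listener
        }
    }

If one child crashes, only that connection is lost, which is exactly the robustness point made above.
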
  • poll / select
    • Two flavours (BSD vs. System V) of more or less the same thing.
    • Somewhat old and slow, and somewhat awkward to use, but there is virtually no platform that does not support them.
    • Waits until "something happens" on a set of descriptors (see the sketch after this list)
      • Allows one thread/process to handle many requests at a time.
      • No multi-core usage.
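
As an illustration of the bullets above, a hedged sketch of a single-threaded poll() loop (handle_readable() is a hypothetical handler). Note that the whole array is passed to and scanned by the kernel on every call, which is the O(n) cost the epoll section below avoids.

    #include <poll.h>
    #include <vector>

    // Hypothetical handler for a descriptor that became readable.
    void handle_readable(int fd) { /* ... */ }

    void poll_loop(const std::vector<int>& sockets) {
        std::vector<pollfd> fds;
        for (int fd : sockets) {
            pollfd p{};
            p.fd = fd;
            p.events = POLLIN;                        // interested in "readable"
            fds.push_back(p);
        }

        for (;;) {
            int n = poll(fds.data(), (nfds_t)fds.size(), -1); // block until something happens
            for (size_t i = 0; i < fds.size() && n > 0; ++i)
                if (fds[i].revents & POLLIN) {
                    handle_readable(fds[i].fd);
                    --n;
                }
        }
    }
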
  • epoll
    • Linux only.
    • Concept of expensive modifications vs. efficient waits:
      • Copies information about a descriptor to kernel space when the descriptor is added (epoll_ctl)
        • This is usually something that happens rarely.
      • Does not need to copy that information again when waiting for events (epoll_wait)
        • This is usually something that happens very often.
      • The waiter is added to the descriptors' wait queues
        • The descriptor therefore knows who is listening and signals the waiters directly when appropriate, rather than the waiters searching a list of descriptors
        • The opposite way of how poll works
        • O(1) with a small k (very fast) in respect of the number of descriptors, instead of O(n)
    • Assumptions made by this programming model:
      • Most descriptors are idle most of the time, with few things (e.g. "data received", "connection closed") actually happening on only a few descriptors.
      • Most of the time, you don't want to add/remove descriptors from the set.
      • Most of the time, you're waiting for something to happen.
    • Some minor pitfalls:
      • A level-triggered epoll wakes all threads waiting on it (this "works as intended"), therefore the naive way of using epoll with a threadpool is useless. At least for a TCP server, it is no big issue since partial requests would have to be assembled first anyway, so a naive multithreaded implementation won't do either way.
      • Does not work as one would expect with file reads/writes ("always ready").
      • Could not be used with AIO until recently; now possible via eventfd, but it requires a (to date) undocumented function.
      • If the above assumptions are not true, epoll can be inefficient, and poll may perform equally well or better.
      • epoll cannot do "magic", i.e. it is still necessarily O(N) in respect to the number of events that occur.
      • However, epoll plays well with the new recvmmsg syscall, since it returns several readiness notifications at a time (as many as are available, up to whatever you specify as maxevents). This makes it possible to receive e.g. 15 EPOLLIN notifications with one syscall on a busy server, and read the corresponding 15 messages with a second syscall (a 93% reduction in syscalls!). Unfortunately, all operations on one recvmmsg invocation refer to the same socket, so it is mostly useful for UDP-based services (for TCP, there would have to be a kind of recvmmsmsg syscall which also takes a socket descriptor per item!).
      • Descriptors should always be set to nonblocking and one should check for EAGAIN even when using epoll, because there are exceptional situations where epoll reports readiness and a subsequent read (or write) will still block. This is also the case for poll/select on some kernels (though it has presumably been fixed).
      • With a naive implementation, starvation of slow senders is possible. When blindly reading until EAGAIN is returned upon receiving a notification, it is possible to indefinitely read new incoming data from a fast sender while completely starving a slow sender (as long as data keeps coming in fast enough, you might not see EAGAIN for quite a while!). This applies to poll/select in the same manner (see the sketch after this list for one way to bound it).
      • Edge-triggered mode has some quirks and unexpected behaviour in some situations, since the documentation (both man pages and TLPI) is vague ("probably", "should", "might") and sometimes misleading about its operation.
        The documentation states that several threads waiting on one epoll are all signalled. It further states that a notification tells you whether IO activity has happened since the last call to epoll_wait (or since the descriptor was opened, if there was no previous call).
        The true, observable behaviour in edge-triggered mode is much closer to "wakes the first thread that has called epoll_wait, signalling that IO activity has happened since anyone last called either epoll_wait or a read/write function on the descriptor, and thereafter only reports readiness again to the next thread calling or already blocked in epoll_wait, for any operations happening after anyone called a read (or write) function on the descriptor". It kind of makes sense, too... it just isn't exactly what the documentation suggests.
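
The following sketch addresses the two EAGAIN-related bullets above: drain a nonblocking descriptor after a notification, but cap how much is read per wakeup so a fast sender cannot starve slower ones. The budget value is arbitrary, and consume() / revisit_soon() are hypothetical hooks.

    #include <cerrno>
    #include <unistd.h>

    // Hypothetical hooks.
    void consume(const char*, ssize_t) { /* feed the protocol parser */ }
    void revisit_soon(int)             { /* re-queue this fd for another drain pass */ }

    void drain(int fd) {
        char buf[4096];
        ssize_t total = 0;
        const ssize_t budget = 64 * 1024;          // arbitrary per-notification cap

        for (;;) {
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n > 0) {
                consume(buf, n);
                total += n;
                if (total >= budget) {             // fairness: give other descriptors a turn
                    revisit_soon(fd);              // important with edge-triggered epoll,
                    return;                        // which will not re-notify by itself
                }
            } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                return;                            // truly drained; wait for the next notification
            } else {
                return;                            // peer closed (n == 0) or error: clean up here
            }
        }
    }
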
  • kqueue
    • The BSD analogon to epoll, different usage, similar effect.
    • Also works on Mac OS X.
    • Rumoured to be faster (I've never used it, so cannot tell if that is true).
    • Registers events and returns a result set in a single syscall.
  • IO completion ports (Windows)
    • Epoll for Windows, or rather epoll on steroids.
    • Works seamlessly with everything that is waitable or alertable in some way (sockets, waitable timers, file operations, threads, processes).
    • If Microsoft got one thing right in Windows, it is completion ports:
      • Work worry-free out of the box with any number of threads
      • No thundering herd
      • Wake threads one by one, in LIFO order
      • Keep caches warm and minimize context switches
      • Respect the number of processors on the machine, or deliver the desired number of workers


Any suggestions for frameworks?

libevent -- The 2.0 version also supports completion ports under Windows.


ASIO -- If you use Boost in your project, look no further: you already have this available as boost-asio.


Any suggestions for simple/basic tutorials?

The frameworks listed above come with extensive documentation. The Linux docs and MSDN explain epoll and completion ports extensively.


Mini-tutorial for using epoll:

    #include <sys/epoll.h>

    int my_epoll = epoll_create1(0);   // epoll_create1 replaces epoll_create, whose size argument is ignored nowadays

    epoll_event e;
    e.events = EPOLLIN;                // we want "readable" notifications
    e.data.fd = some_socket_fd;        // the data member can in fact be anything you like

    epoll_ctl(my_epoll, EPOLL_CTL_ADD, some_socket_fd, &e);

    ...
    epoll_event evt[10];               // or whatever number
    int num;
    for(...)
        if((num = epoll_wait(my_epoll, evt, 10, -1)) > 0)
            do_something();


Mini-tutorial for IO completion ports (note calling CreateIoCompletionPort twice with different parameters):

    #include <windows.h>

    HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0, 0, 0); // equals epoll_create
    CreateIoCompletionPort((HANDLE) mySocketHandle, iocp, 0, 0);         // equals epoll_ctl(EPOLL_CTL_ADD)

    DWORD number_bytes;
    ULONG_PTR key;
    OVERLAPPED* o;                     // receives a pointer to the completed request's OVERLAPPED
    for(...)
        if(GetQueuedCompletionStatus(iocp, &number_bytes, &key, &o, INFINITE)) // equals epoll_wait()
            do_something();


(These mini-tutorials omit all kinds of error checking, and hopefully I didn't make any typos, but they should for the most part be OK to give you some idea.)




Note that completion ports (Windows) conceptually work the other way around from epoll (or kqueue). They signal, as their name suggests, completion, not readiness. That is, you fire off an asynchronous request and forget about it until some time later you're told that it has completed (either successfully or not so successfully, and there is the exceptional case of "completed immediately" too).
With epoll, you block until you are notified that either "some data" (possibly as little as one byte) has arrived and is available, or that there is sufficient buffer space so you can do a write operation without blocking. Only then do you start the actual operation, which will then hopefully not block (contrary to what you might expect, there is no strict guarantee for that -- it is therefore a good idea to set descriptors to nonblocking and check for EAGAIN [EAGAIN and EWOULDBLOCK for sockets, because oh joy, the standard allows for two different error values]).
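
To make the completion-oriented model above concrete, here is a hedged sketch that posts a single asynchronous receive on a socket already associated with the completion port from the mini-tutorial (error checking omitted, link against ws2_32; the buffer and OVERLAPPED must stay alive until the completion is dequeued).

    #include <winsock2.h>
    #include <windows.h>

    void post_and_complete(SOCKET sock, HANDLE iocp) {
        char buf[4096];
        WSABUF wsabuf;
        wsabuf.len = sizeof(buf);                  // buffer must stay alive until completion
        wsabuf.buf = buf;
        OVERLAPPED ov = {};                        // one OVERLAPPED per outstanding request
        DWORD flags = 0;

        // Fire off the request and forget about it; it runs in the background.
        WSARecv(sock, &wsabuf, 1, nullptr, &flags, &ov, nullptr);

        // ... later (typically on a worker thread) the completion is dequeued:
        DWORD bytes; ULONG_PTR key; OVERLAPPED* done;
        if (GetQueuedCompletionStatus(iocp, &bytes, &key, &done, INFINITE)) {
            // `bytes` bytes are already sitting in buf: the read has *completed*,
            // whereas epoll would only have told us that reading is now *possible*.
        }
    }
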

