Is non-blocking I/O really faster than multi-threaded blocking I/O? How?

Question

I searched the web for technical details about blocking and non-blocking I/O, and I found several people stating that non-blocking I/O is faster than blocking I/O, for example in this document.

If I use blocking I/O, then of course the thread that is currently blocked can't do anything else, because it's blocked. But as soon as a thread blocks, the OS can switch to another thread and not switch back until there is something for the blocked thread to do. So as long as there is another thread on the system that needs the CPU and is not blocked, there should not be any more CPU idle time than with an event-based non-blocking approach, should there?
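The claim that blocked threads leave the CPU free for other work can be checked with a small experiment. This is only a sketch: `time.sleep` stands in for a blocking I/O call, and the names `blocking_task` and `run_blocked_threads` are made up for illustration.

```python
import threading
import time
from typing import Tuple


def blocking_task(delay: float) -> None:
    # time.sleep stands in for a blocking read: the OS deschedules the
    # thread until the wait is over, so it consumes almost no CPU time.
    time.sleep(delay)


def run_blocked_threads(count: int, delay: float) -> Tuple[float, float]:
    """Start `count` blocking threads; return (wall seconds, CPU seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    threads = [
        threading.Thread(target=blocking_task, args=(delay,))
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = run_blocked_threads(count=20, delay=0.2)
    print(f"wall time: {wall:.3f} s, CPU time: {cpu:.3f} s")
```

If the 20 waits overlap, the wall time stays close to a single delay rather than 20 times the delay, and the CPU time is far smaller than the wall time. That matches the reasoning above: blocking by itself does not waste CPU, so any advantage of the event-based approach has to come from somewhere else, such as per-thread memory and scheduling overhead.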

Besides reducing the time the CPU is idle, I see one more way to increase the number of tasks a computer can perform in a given time frame: reducing the overhead introduced by switching threads. But how can this be done? And is the overhead large enough to show measurable effects? Here is how I picture it working:

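  1. To load the contents of a file, an application delegates this task to an event-based I/O framework, passing a callback function along with a filename.
  2. The event framework delegates to the operating system, which programs the DMA controller of the hard disk to write the file directly to memory.
  3. The event framework allows further code to run.
  4. When the disk-to-memory copy is complete, the DMA controller raises an interrupt.
  5. The operating system's interrupt handler notifies the event-based I/O framework that the file has been completely loaded into memory. How does it do that? Using a signal?
  6. The code that is currently running within the event I/O framework finishes.
  7. The event-based I/O framework checks its queue, sees the operating system's message from step 5, and executes the callback it got in step 1.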

Is that how it works? If not, how does it work? Does that mean the event system can work without ever needing to explicitly touch the stack (unlike a real scheduler, which would need to back up the stack and copy another thread's stack into memory when switching threads)? How much time does this actually save? Is there more to it?

Recommended answer

The biggest advantage of non-blocking or asynchronous I/O is that your thread can continue its work while the I/O is in progress. Of course, you can also achieve this with an additional thread. As you said, for the best overall (system) performance I guess it would be better to use asynchronous I/O rather than multiple threads, because that reduces thread switching.

Let's look at possible implementations of a network server program that has to handle 1000 clients connected in parallel:

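  1. One thread per connection (can be blocking I/O, but can also be non-blocking I/O).
    Each thread requires memory resources (including kernel memory!), which is a disadvantage. Every additional thread also means more work for the scheduler.
  2. One thread for all connections.
    This takes load off the system because there are fewer threads. But it also prevents you from using the full performance of your machine, because you might end up driving one processor to 100% while all other processors sit idle.
  3. A few threads, where each thread handles some of the connections.
    This takes load off the system because there are fewer threads, and it can use all available processors. On Windows this approach is supported by the Thread Pool API.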
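Option 2 can be sketched with Python's standard `selectors` module. This is a minimal illustration rather than a production server: the function names `serve_once` and `demo` are made up, `socket.socketpair()` stands in for real client connections, and the "protocol" simply upper-cases whatever it receives.

```python
import selectors
import socket


def serve_once(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Handle every connection that is ready right now; return how many."""
    handled = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)  # does not block: the selector said it is ready
        if data:
            # Small replies fit in the socket buffer; a real server would
            # queue unsent bytes and wait for EVENT_WRITE instead.
            conn.sendall(data.upper())
        else:
            sel.unregister(conn)  # peer closed the connection
            conn.close()
        handled += 1
    return handled


def demo():
    sel = selectors.DefaultSelector()
    clients = []
    for _ in range(3):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        clients.append(client)

    # Two of the three clients send a request; the third stays silent.
    clients[0].sendall(b"hello")
    clients[2].sendall(b"world")

    handled = 0
    while handled < 2:  # one thread serves all connections
        handled += serve_once(sel)
    replies = [clients[0].recv(4096), clients[2].recv(4096)]

    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    for client in clients:
        client.close()
    return handled, replies


if __name__ == "__main__":
    print(demo())  # (2, [b'HELLO', b'WORLD'])
```

The idle connection costs nothing beyond its entry in the selector, which is the point of option 2. Option 3 would run several such loops, one per thread, each with its own selector and its own share of the connections.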

Of course, having more threads is not a problem per se. As you may have noticed, I chose quite a high number of connections/threads. I doubt that you'll see any difference between the three possible implementations if we are talking about only a dozen threads (this is also what Raymond Chen suggests in the MSDN blog post Does Windows have a limit of 2000 threads per process?).

On Windows, using unbuffered file I/O means that writes must have a size that is a multiple of the volume sector size (page-sized writes usually satisfy this). I have not tested it, but it sounds like such sizes could also improve write performance for buffered synchronous and asynchronous writes.

Steps 1 to 7 as you describe them give a good idea of how it works. On Windows the operating system informs you about the completion of an asynchronous I/O operation (WriteFile with an OVERLAPPED structure) using an event or a callback. Callback functions are only called when your code enters an alertable wait, for example by calling WaitForMultipleObjectsEx with bAlertable set to true.
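The seven steps from the question can be modelled in a few lines of Python. This is a sketch under loud assumptions: the `EventFramework` class and its `load_file` and `run` methods are hypothetical names, and a background thread doing an ordinary blocking read stands in for the kernel and the DMA controller. It is not how Windows overlapped I/O is implemented; it only shows the queue-and-callback shape of steps 1, 3, 5 and 7.

```python
import os
import queue
import tempfile
import threading


class EventFramework:
    """Toy event framework: completions are queued and run by one loop."""

    def __init__(self):
        self._completions = queue.Queue()  # messages from the "OS" (step 5)
        self._pending = 0                  # requests not yet completed

    def load_file(self, path, callback):
        # Step 1: the application hands over a filename and a callback.
        self._pending += 1

        def worker():
            # Stand-in for steps 2 and 4 (assumption: a helper thread
            # replaces the kernel and the DMA transfer).
            with open(path, "rb") as f:
                data = f.read()
            # Step 5: post the completion message to the framework's queue.
            self._completions.put((callback, data))

        threading.Thread(target=worker, daemon=True).start()
        # Step 3: return immediately so that further code can run.

    def run(self):
        # Steps 6 and 7: once the current code is done, drain the queue and
        # execute each callback on the loop's own thread.
        while self._pending:
            callback, data = self._completions.get()
            self._pending -= 1
            callback(data)


def demo():
    results = []
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"file contents")
        path = tmp.name
    try:
        framework = EventFramework()
        framework.load_file(path, results.append)
        results.append(b"other work ran first")  # step 3 in action
        framework.run()
    finally:
        os.remove(path)
    return results


if __name__ == "__main__":
    print(demo())  # [b'other work ran first', b'file contents']
```

The callback never interrupts running code: it is only invoked when the loop gets back to its queue. This mirrors the alertable-wait rule described above, where completion callbacks run only once the thread explicitly waits for them.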

Some more reading on the web:

  • Multiple Threads in the User Interface on MSDN, which also briefly covers the cost of creating threads
  • The section Threads and Thread Pools says "Although threads are relatively easy to create and use, the operating system allocates a significant amount of time and other resources to manage them."
  • The CreateThread documentation on MSDN says "However, your application will have better performance if you create one thread per processor and build queues of requests for which the application maintains the context information."
  • Old article Why Too Many Threads Hurts Performance, and What to do About It
