Performance of NodeJS with a large number of callbacks


Question


I am working on a NodeJS application. There is a specific RESTful API (GET) that, when triggered by the user, requires the server to do about 10-20 network operations to pull information from different sources. All these network operations are async callbacks, and once they ALL finish, the result is consolidated by the nodejs app and sent back to the client. All these operations are started in parallel via the async.map function.
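For concreteness, the fan-out pattern described above can be sketched with the built-in `Promise.all` standing in for `async.map` (`fetchFromSource` is a hypothetical placeholder for one of the network operations, here simulated with a timer):

```javascript
// Hypothetical stand-in for one of the 10-20 network operations,
// simulated with a 50ms timer instead of a real network call.
function fetchFromSource(source) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ source, data: `payload-${source}` }), 50);
  });
}

// Fire all requests in parallel and consolidate once they ALL finish,
// which is what async.map(sources, ...) does for callback-style code.
async function handleRequest(sources) {
  return Promise.all(sources.map((s) => fetchFromSource(s)));
}
```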


I just want to understand: since nodejs is single-threaded, and it does not make use of multi-core machines (at least not without clustering), how does node scale when it has many callbacks to process? Does the actual processing of callbacks depend on node's single thread being idle, or are callbacks processed in parallel with the main thread?


The reason why I ask is, I see the performance of my 20 callbacks deteriorate from the first callback to the last one. For example, the first network operation (out of the 10-20) takes 141ms to complete, whereas the last one takes about 4 seconds (measured as the time from when the function is executed until the callback of the function returns a value or an error). They are all the same network operation hitting the same data source, so the data source is not the bottleneck. I know for a fact that the data source takes no more than 200ms to respond to a single request.


I found this thread, so it looks to me like the one single thread needs to handle all callbacks AND any new incoming requests.


So my question is: for operations that will trigger many callbacks, what is the best practice for optimizing their performance?

Answer


For network operations node.js is effectively single-threaded. However, there is a persistent misunderstanding that handling I/O requires constant CPU resources. The core of your question boils down to:


Does the actual processing of callbacks depend on node's single thread being idle, or are callbacks processed in parallel with the main thread?


The answer is yes and no. Yes, callbacks are only executed when the main thread is idle. No, the "processing" is not done while the thread is idle. To be specific: there is no "processing" - it takes zero CPU time for node to "process" thousands of callbacks, if what you mean by "process" is waiting.


If we really need to understand how node (or browser) internals work, we must unfortunately first understand how computers work - from the hardware to the operating system. Yes, this is going to be a deep dive, so bear with me.


It all began with the invention of interrupts..


It was a great invention, but also a Box of Pandora - Edsger Dijkstra


Yes, the quote above is from the same "Goto considered harmful" Dijkstra. From the very beginning, introducing asynchronous operation into computer hardware was considered a very hard topic, even for some of the legends of the industry.


Interrupts were introduced to speed up I/O operations. Rather than needing to poll some input with software (taking CPU time away from useful work), the hardware sends a signal to the CPU to tell it an event has occurred. The CPU then suspends the currently running program and executes another program to handle the interrupt - thus we call these functions interrupt handlers. And the word "handler" has stuck all the way up the stack to GUI libraries, which call callback functions "event handlers".


If you've been paying attention you will notice that this concept of an interrupt handler is actually a callback. You configure the CPU to call a function at some later time when an event happens. So even callbacks are not a new concept - it's way older than C.


Interrupts make modern operating systems possible. Without interrupts there would be no way for the CPU to temporarily stop your program to run the OS (well, there is cooperative multitasking, but let's ignore that for now). How an OS works is that it sets up a hardware timer in the CPU to trigger an interrupt and then tells the CPU to execute your program. It is this periodic timer interrupt that runs your OS. Apart from the timer, the OS (or rather, device drivers) sets up interrupts for I/O. When an I/O event happens, the OS takes over your CPU (or one of your CPUs in a multi-core system) and checks its data structures for which process it needs to execute next to handle the I/O (this is called preemptive multitasking).


So, handling network connections is not even the job of the OS - the OS just keeps track of connections in its data structures (or rather, the networking stack). What really handles network I/O is your network card, your router, your modem, your ISP, etc. So waiting for I/O takes zero CPU resources. It just takes up some RAM to remember which program owns which socket.


Now that we have a clear picture of this, we can understand what it is that node does. Various OSes have different APIs that provide asynchronous I/O - from overlapped I/O on Windows to poll/epoll on Linux to kqueue on BSD to the cross-platform select(). Node internally uses libuv as a high-level abstraction over these APIs.


How these APIs work is similar, though the details differ. Essentially they provide a function that, when called, blocks your thread until the OS sends an event to it. So yes, even non-blocking I/O blocks your thread. The key here is that blocking I/O will block your thread in multiple places, but non-blocking I/O blocks your thread in only one place - where you wait for events.


What this allows you to do is design your program in an event-oriented manner. This is similar to how interrupts allow OS designers to implement multitasking. In effect, asynchronous I/O is to frameworks what interrupts are to OSes. It allows node to spend exactly 0% CPU time to process (wait for) I/O. This is what makes node fast - it's not really faster, but it does not waste time waiting.


With the understanding we now have of how node handles network I/O we can understand how callbacks affect performance.


  1. There is zero CPU penalty for having thousands of callbacks waiting


Of course, node still needs to maintain data structures in RAM to keep track of all the callbacks, so callbacks do have a memory penalty.
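This is easy to see with timers standing in for network waits (an assumption for illustration; real sockets behave the same way from the event loop's point of view): a thousand pending 100ms waits complete in roughly 100ms total, not 100 seconds, because pending callbacks merely sit in the event loop's bookkeeping.

```javascript
// A stand-in "network wait": a timer that resolves after `ms` milliseconds.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// 1000 pending 100ms waits. If each wait consumed the thread, this would
// take 100 seconds serially; because waiting costs no CPU, the waits all
// overlap and the whole thing finishes in roughly the time of one wait.
async function demo() {
  const start = Date.now();
  await Promise.all(Array.from({ length: 1000 }, () => wait(100)));
  return Date.now() - start;
}
```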


  2. Processing the return values of callbacks is done in a single thread


This has some advantages and some drawbacks. It means node does not have to worry about race conditions, and thus node does not internally use any semaphores or mutexes to guard data access. The disadvantage is that any CPU-intensive JavaScript will block all other operations.

You mentioned:

I see the performance of my 20 callbacks deteriorate from the first callback to the last one


The callbacks are all executed sequentially and synchronously in the main thread (only the waiting is actually done in parallel). Thus it could be that your callbacks are doing some CPU-intensive calculation and the total execution time of all callbacks is actually 4 seconds.


However, I rarely see this kind of issue for that number of callbacks. It's still possible, since I don't know what you're doing in your callbacks, but I think it's unlikely.

You also mentioned:

until the callback of the function returns a value or an error


One likely explanation is that your network resource cannot handle that many simultaneous connections. You may not think it's much, since it's only 20 connections, but I've seen plenty of services that would crash at 10 requests/second. The problem is that all 20 requests are simultaneous.


You can test this by taking node out of the picture and using a command-line tool to send 20 simultaneous requests, something like curl or wget:

# assuming you're running bash:
for x in `seq 1 20`;do curl -o /dev/null -w "Connect: %{time_connect} Start: %{time_starttransfer} Total: %{time_total} \n" http://example.com & done


Mitigation

If it turns out that the issue is that doing the 20 requests simultaneously is stressing the other service, what you can do is limit the number of simultaneous requests.


You can do this by batching your requests:

async function fetchAllBatched() {
    let input = [/* some values we need to process */];
    let result = [];

    while (input.length) {
        let batch = input.splice(0, 3); // make 3 requests in parallel

        let batchResult = await Promise.all(batch.map(x => {
            return fetchNetworkResource(x);
        }));

        result = result.concat(batchResult);
    }
    return result;
}
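A drawback of fixed batches is that each batch waits for its slowest request before the next batch starts. A sketch of an alternative (the function and parameter names here are mine, not an established API): a small worker pool that keeps a fixed number of requests in flight at all times.

```javascript
// Keep at most `limit` requests in flight: each worker pulls the next
// item as soon as its current request finishes, so one slow request only
// holds up its own worker rather than a whole batch. Note that results
// come back in completion order, not input order.
async function fetchAllPooled(input, fetchNetworkResource, limit = 3) {
  const queue = input.slice(); // don't mutate the caller's array
  const results = [];

  async function worker() {
    while (queue.length) {
      const item = queue.shift(); // safe: no await between check and shift
      results.push(await fetchNetworkResource(item));
    }
  }

  // Start `limit` workers and wait for all of them to drain the queue.
  await Promise.all(Array.from({ length: limit }, () => worker()));
  return results;
}
```

If you are already using the async library, its mapLimit function implements the same idea.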
