How, in general, does Node.js handle 10,000 concurrent requests?

Question

I understand that Node.js uses a single thread and an event loop to process requests, only handling one at a time (which is non-blocking). But still, how does that work for, let's say, 10,000 concurrent requests? Will the event loop process all the requests? Wouldn't that take too long?

I cannot understand (yet) how it can be faster than a multi-threaded web server. I understand that a multi-threaded web server will be more expensive in resources (memory, CPU), but wouldn't it still be faster? I am probably wrong; please explain how this single thread is faster when handling lots of requests, and what it typically does (at a high level) when servicing lots of requests, like 10,000.

And also, will that single thread scale well to that large a number of requests? Please bear in mind that I am just starting to learn Node.js.

Answer

If you have to ask this question then you're probably unfamiliar with what most web applications/services do. You're probably thinking that all software does this:

user does an action
       │
       v
 application starts processing action
   └──> loop ...
          └──> busy processing
 end loop
   └──> send result to user

However, this is not how web applications, or indeed any application with a database as the back-end, work. Web apps do this:

user does an action
       │
       v
 application starts processing action
   └──> make database request
          └──> do nothing until request completes
 request complete
   └──> send result to user

In this scenario, the software spends most of its running time using 0% CPU, waiting for the database to return.
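
In Node.js terms, that "do nothing until request completes" step is just an asynchronous call the process waits on without blocking. Below is a minimal sketch of my own (not from the original answer), where setTimeout stands in for a hypothetical database round-trip so the example stays self-contained and runnable:

// Minimal sketch: setTimeout simulates a ~50 ms database round-trip.
// While the handler "waits", the process burns essentially 0% CPU.
const http = require('http');

function fakeDbQuery() {
  return new Promise(resolve => setTimeout(() => resolve([{ id: 1 }]), 50));
}

http.createServer(async (req, res) => {
  const rows = await fakeDbQuery();   // make database request, wait at ~0% CPU
  res.end(JSON.stringify(rows));      // request complete -> send result to user
}).listen(3000);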

Multithreaded network apps handle the above workload like this:

request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request

So the threads spend most of their time using 0% CPU, waiting for the database to return data. While doing so, they have had to allocate the memory required for a thread, which includes a completely separate program stack for each thread, etc. Also, they would have to start a thread, which, while not as expensive as starting a full process, is still not exactly cheap.

Since we spend most of our time using 0% CPU, why not run some code while we're not using the CPU? That way, each request still gets the same amount of CPU time as in a multithreaded application, but we don't need to start a thread. So we do this:

request ──> make database request
request ──> make database request
request ──> make database request
database request complete ──> send response
database request complete ──> send response
database request complete ──> send response

In practice both approaches return data with roughly the same latency since it's the database response time that dominates the processing.
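
To make the overlap concrete, here is a small sketch (my own addition, not part of the original answer) that fires 10,000 simulated 50 ms "database calls" on a single thread; because the waits overlap, the whole batch completes in roughly the time of a single call:

// 10,000 simulated requests in flight at once on a single thread.
// Each waits on a fake 50 ms database call; the waits overlap, so the
// batch finishes in roughly 50 ms of wall-clock time, not 10,000 * 50 ms.
function fakeDbQuery() {
  return new Promise(resolve => setTimeout(resolve, 50));
}

async function handleRequest(i) {
  await fakeDbQuery();                // request -> make database request
  return `response ${i}`;             // database request complete -> send response
}

const start = Date.now();
Promise.all(Array.from({ length: 10000 }, (_, i) => handleRequest(i)))
  .then(responses => {
    console.log(`${responses.length} requests done in ${Date.now() - start} ms`);
  });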

The main advantage here is that we don't need to spawn a new thread, so we don't need to do lots and lots of mallocs, which would slow us down.

The seemingly mysterious thing is how both of the approaches above manage to run the workload in "parallel". The answer is that the database is threaded. So our single-threaded app is actually leveraging the multi-threaded behaviour of another process: the database.

A single-threaded app fails big if you need to do lots of CPU calculations before returning the data. Now, I don't mean a for loop processing the database result. That's still mostly O(n). What I mean is things like doing Fourier transforms (mp3 encoding, for example), ray tracing (3D rendering), etc.
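
To illustrate the pitfall, here is a made-up sketch (not from the original answer): a CPU-heavy loop on the main thread stalls every other request until it finishes, which is why such work is usually pushed into worker threads or a separate service:

// Pitfall sketch: a CPU-bound handler blocks the single thread, so even the
// cheap default route cannot be answered until the heavy loop has finished.
const http = require('http');

function heavyComputation() {
  let sum = 0;
  for (let i = 0; i < 1e9; i++) sum += i;  // roughly a second of pure CPU work
  return sum;
}

http.createServer((req, res) => {
  if (req.url === '/heavy') {
    res.end(String(heavyComputation()));   // event loop is stuck in here
  } else {
    res.end('fast');                       // normally instant, but queued behind /heavy
  }
}).listen(3000);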

Another pitfall of single-threaded apps is that they only utilise a single CPU core. So if you have a quad-core server (not uncommon nowadays) you're not using the other 3 cores.

A multithreaded app fails big if you need to allocate lots of RAM per thread. First, the RAM usage itself means you can't handle as many requests as a single-threaded app. Worse, malloc is slow. Allocating lots and lots of objects (which is common in modern web frameworks) means we can potentially end up being slower than single-threaded apps. This is where node.js usually wins.

One use-case that ends up making multithreading worse is when you need to run another scripting language in your thread. First you usually need to malloc the entire runtime for that language, then you need to malloc the variables used by your script.

So if you're writing network apps in C or Go or Java then the overhead of threading will usually not be too bad. If you're writing a C web server to serve PHP or Ruby then it's very easy to write a faster server in JavaScript or Ruby or Python.

Some web servers use a hybrid approach. Nginx and Apache2, for example, implement their network processing code as a thread pool of event loops. Each thread runs an event loop and processes requests single-threaded, but requests are load-balanced among the multiple threads.

Some single-threaded architectures also use a hybrid approach. Instead of launching multiple threads from a single process you can launch multiple applications - for example, 4 node.js servers on a quad-core machine. Then you use a load balancer to spread the workload amongst the processes.
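
One common way to get this effect without an external load balancer is Node's built-in cluster module. Below is a minimal sketch (my own addition, not from the original answer): the primary process forks one worker per CPU core, incoming connections are distributed across the workers, and each worker runs its own single-threaded event loop.

// Minimal cluster sketch: one worker per core (e.g. 4 workers on a quad-core
// machine); connections are spread across workers sharing the same port.
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {                   // cluster.isMaster on older Node versions
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(3000);                         // all workers listen on the same port
}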

In effect the two approaches are technically identical mirror-images of each other.
