How to apply clustering/spawning child process techniques to a Node.js application that has both I/O-bound and CPU-bound tasks?

Question

I'm working on an IoT project in which a Node.js application performs the following tasks:

1. Reads a stream of messages using an asynchronous messaging library (I/O-bound).
2. Sends the messages to a web service, which performs machine learning on them (I/O-bound, since only an API call is involved).
3. Receives the pattern generated by the machine learning from the web service, using a REST API.
4. Compares the pattern against the real-time streaming messages (CPU-intensive, since the pattern matching involves complex algorithms).
5. Logs stack traces (I/O-bound).

The Node.js application will implement these functions as separate tasks, which by default all run on a single thread. Given that spawning child processes is useful only for CPU-intensive tasks, how should clustering be done for a Node.js process that performs both I/O-bound and CPU-bound tasks? Do we need to apply clustering to only part of this application?

Can anyone suggest an effective architecture for this Node.js application?

Recommended answer

If you have ANY CPU-intensive tasks, then use clustering for all requests.

The fact that a clustered process is also doing some I/O-intensive work won't hurt you, but you will want the clustered processes for the CPU-intensive work. So, just make your server clustered and let each clustered process handle the whole load of a request (both the I/O and the CPU work).
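
As a rough sketch of that approach, assuming Node.js 16 or later (for `cluster.isPrimary`) and an HTTP front end: the names `matchesPattern`, `workerCount`, and `startClustered`, the `pattern` query parameter, and the `includes()` check are all hypothetical stand-ins, not anything from the question's actual code.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Hypothetical stand-in for the CPU-intensive pattern comparison.
function matchesPattern(message, pattern) {
  return message.includes(pattern);
}

// One clustered process per reported CPU core, and never fewer than one.
function workerCount() {
  return Math.max(1, os.cpus().length);
}

// The primary process only forks and supervises. Every forked process runs
// the same server, so each one handles both the I/O and the CPU work.
function startClustered(port) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount(); i++) {
      cluster.fork();
    }
    cluster.on('exit', (worker) => {
      console.error(`process ${worker.process.pid} exited, forking a replacement`);
      cluster.fork();
    });
    return;
  }

  http
    .createServer((req, res) => {
      const url = new URL(req.url, 'http://localhost');
      const pattern = url.searchParams.get('pattern') || '';
      let body = '';
      req.on('data', (chunk) => { body += chunk; });              // I/O-bound part
      req.on('end', () => {
        const matched = matchesPattern(body, pattern);            // CPU-bound part
        res.setHeader('Content-Type', 'application/json');
        res.end(JSON.stringify({ matched }));
      });
    })
    .listen(port);
}
```

Calling `startClustered(3000)` at the bottom of the entry script would fork the processes, since `cluster.fork()` re-runs that same script in each of them. The port number is arbitrary.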

In a nutshell, CPU-intensive work is the primary driver for clustering. It doesn't hurt anything if the clustered processes are also doing non-blocking I/O. In fact, clustering up to the number of CPUs available can even help I/O-bound processes somewhat in high-load situations (though not nearly as much as it helps CPU-intensive ones).

An alternative, though it may be a more complicated implementation, is to use child processes or the new Worker threads only for the CPU-intensive parts of your request handling. In that case, you'd create some sort of work queue and a set of child processes or Worker threads for performing operations in the queue and your master process would distribute tasks to each child process from the queue. Using this scheme, you can decide exactly which code is executed via the work queue and which code stays in the main process, though you now have to coordinate between the two using some sort of interprocess communication.
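
A minimal sketch of that work-queue scheme using Worker threads might look like the following. The `MatchPool` class, its method names, and the `includes()` check standing in for the real matching algorithm are all assumptions made for illustration. The worker body is inlined as a string only so that the example fits in one file.

```javascript
const { Worker } = require('node:worker_threads');

// Worker body, inlined so the sketch is self-contained. The includes() check
// is a hypothetical stand-in for the real CPU-intensive comparison.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ message, pattern }) => {
    parentPort.postMessage(message.includes(pattern));
  });
`;

class MatchPool {
  constructor(size) {
    this.queue = [];    // tasks waiting for a free worker
    this.idle = [];     // workers with nothing to do
    this.workers = [];  // every worker, kept so close() can stop them all
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue one comparison; the promise resolves with the worker's boolean answer.
  run(message, pattern) {
    return new Promise((resolve, reject) => {
      this.queue.push({ message, pattern, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers until either list runs out.
  dispatch() {
    while (this.queue.length > 0 && this.idle.length > 0) {
      const task = this.queue.shift();
      const worker = this.idle.pop();
      const onMessage = (matched) => {
        worker.off('error', onError);
        this.idle.push(worker);   // the worker is free again
        task.resolve(matched);
        this.dispatch();
      };
      const onError = (err) => {
        // A worker that threw is not returned to the idle list.
        worker.off('message', onMessage);
        task.reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage({ message: task.message, pattern: task.pattern });
    }
  }

  // Stop all workers so the process can exit.
  close() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}
```

The main thread keeps doing the stream reading, API calls, and logging, and calls something like `await pool.run(message, pattern)` only for the comparison step. Here the coordination happens through `postMessage` between threads; with child processes instead, the same structure would use `child.send()` and `process.on('message')`.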
