Are there any guidelines to follow when choosing the number of processes with multiprocessing?


Question

I'm just getting my feet wet with multiprocessing (and it's totally awesome!), but I was wondering if there were any guidelines for selecting the number of processes. Is it just based on the number of cores on the server? Is it somehow based on the application you're running (number of loops, how much CPU it uses, etc.)? How do I decide how many processes to spawn? Right now I'm just guessing and adding/removing processes, but it would be great if there was some kind of guideline or best practice.

Another question: I know what happens if I add too few (the program is slooow), but what if I add 'too many'?

Thanks!

Recommended answer

If all of your threads/processes are indeed CPU-bound, you should run as many processes as the CPU reports cores. Due to HyperThreading, each physical CPU core may be able to present multiple virtual cores. Call multiprocessing.cpu_count() to get the number of virtual cores.
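As a minimal sketch of the lookup described above: multiprocessing.cpu_count() is the call named in the answer, while the helper names default_worker_count and usable_worker_count are made up for illustration. The second helper assumes you may be running under a CPU affinity restriction (for example in a container), in which case os.sched_getaffinity, where the platform provides it, reports the cores this process may actually use.

```python
import multiprocessing
import os


def default_worker_count():
    """Return the number of virtual (logical) cores, falling back to 1."""
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        # cpu_count() raises this when the count cannot be determined.
        return 1


def usable_worker_count():
    """Return the number of cores this process is allowed to run on.

    os.sched_getaffinity only exists on some Unix platforms, so fall
    back to the plain core count elsewhere.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return default_worker_count()


if __name__ == "__main__":
    print("virtual cores:", default_worker_count())
    print("usable cores:", usable_worker_count())
```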

If only a fraction p (between 0 and 1) of your threads is CPU-bound, you can adjust that number by multiplying by p. For example, if half your processes are CPU-bound (p = 0.5) and you have two CPUs with 4 cores each and 2x HyperThreading, you should start 0.5 * 2 * 4 * 2 = 8 processes.
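The arithmetic in that example can be written out as a small helper. This is only a sketch of the rule as the answer states it; the function name suggested_process_count and its parameter names are hypothetical, and the floor of one process is an added assumption so the result is never zero.

```python
def suggested_process_count(p, cpus, cores_per_cpu, threads_per_core):
    """Apply the answer's rule: p * (number of virtual cores).

    p is the fraction of processes that are CPU-bound (0 < p <= 1).
    The result is rounded and clamped to at least 1 (an assumption,
    not part of the original rule).
    """
    virtual_cores = cpus * cores_per_cpu * threads_per_core
    return max(1, round(p * virtual_cores))


if __name__ == "__main__":
    # The answer's example: two CPUs, 4 cores each, 2x HyperThreading, p = 0.5.
    print(suggested_process_count(0.5, 2, 4, 2))  # 0.5 * 2 * 4 * 2 = 8
```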

If you have too few processes, your application will run slower than expected. If your application scales perfectly and is only CPU-bound (i.e. it is 10 times faster when executed on 10 times the number of cores), the slowdown is proportional. For example, if your system calls for 8 processes, but you only start 4, you'll only use half of the processing capacity and take twice as long. Note that in practice, no application scales perfectly, but some (ray tracing, video encoding) come pretty close.
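To make the "half the capacity, twice as long" reasoning concrete, here is a toy model that assumes perfect scaling and a purely CPU-bound workload, which the answer notes no real application achieves. The function name ideal_runtime and the 80-second workload are invented for illustration.

```python
def ideal_runtime(single_core_seconds, processes, cores):
    """Runtime under perfect scaling: work is split across the cores in use.

    Only min(processes, cores) cores can be busy at once, so extra
    processes beyond the core count do not help in this model.
    """
    busy_cores = min(processes, cores)
    return single_core_seconds / busy_cores


if __name__ == "__main__":
    # A hypothetical job that takes 80 s on one core, on an 8-core machine.
    print(ideal_runtime(80, 8, 8))  # all 8 cores busy
    print(ideal_runtime(80, 4, 8))  # only 4 cores busy: twice as long
```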

If you have too many processes, the synchronization overhead will increase. If your program has little to no synchronization overhead, this won't impact the overall runtime, but it may make other programs appear slower than they are unless you set your processes to a lower priority. Excessive numbers of processes (say, 10000) are fine in theory if your OS has a good scheduler. In practice, virtually any synchronization will make the overhead unbearable.
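One way to set worker processes to a lower priority, sketched here for Unix-like systems only: os.nice raises the niceness of the calling process, and multiprocessing.Pool accepts an initializer that runs once in each worker. The helper names lower_priority, square, and run_low_priority and the increment of 10 are illustrative choices, not something the answer prescribes.

```python
import os
from multiprocessing import Pool


def lower_priority(increment=10):
    """Raise this process's niceness (i.e. lower its priority) where supported.

    Returns the new niceness, or None on platforms without os.nice.
    """
    if hasattr(os, "nice"):
        return os.nice(increment)
    return None


def square(x):
    """Stand-in for a CPU-bound task."""
    return x * x


def run_low_priority(values, processes=2):
    """Map square over values in a pool whose workers run at lower priority."""
    with Pool(processes=processes, initializer=lower_priority) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(run_low_priority(range(5)))
```

Because the initializer runs inside each worker, the parent process keeps its original priority.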

If you're not sure whether your application is CPU-bound and/or perfectly scaling, simply observe the system load with different process counts. You want the system load to be slightly under 100%, or, more precisely, the load average reported by uptime to be about the number of virtual cores.
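The same check can be done from Python while a run is in progress. This sketch assumes a Unix-like system, where os.getloadavg returns the 1, 5, and 15 minute load averages that uptime prints; the helper name load_per_core is hypothetical. A value close to 1.0 corresponds to a load average near the number of virtual cores.

```python
import multiprocessing
import os


def load_per_core():
    """Return the 1-minute load average divided by the virtual core count.

    Returns None where the load average is unavailable (e.g. Windows).
    """
    if not hasattr(os, "getloadavg"):
        return None
    try:
        one_minute, _, _ = os.getloadavg()
    except OSError:
        # Raised when the load average cannot be obtained.
        return None
    return one_minute / multiprocessing.cpu_count()


if __name__ == "__main__":
    ratio = load_per_core()
    if ratio is None:
        print("load average not available on this platform")
    else:
        print(f"load per virtual core: {ratio:.2f}")
```

Sampling this at a few different process counts gives a rough picture of where adding processes stops increasing the load.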
