Maximum pool size when using ThreadPool in Python


Question

I am using ThreadPool to achieve multiprocessing. When using multiprocessing, the pool size limit should be equal to the number of CPU cores. My question: when using ThreadPool, should the pool size limit also be the number of CPU cores?

Here is my code:

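from multiprocessing.pool import ThreadPool as Pool


class Subject:
    def __init__(self, url):
        self.url = url  # rest of the code

    def func1(self):
        pass  # returns something


if __name__ == "__main__":
    pool_size = 11
    pool = Pool(pool_size)
    # all_my_urls is defined elsewhere (not shown in the question)
    objects = [Subject(url) for url in all_my_urls]
    for obj in objects:
        pool.apply_async(obj.func1, ())
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all submitted tasks to finish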

What should the maximum pool size be? Thanks in advance.

Recommended answer

You cannot use threads for multiprocessing; threads only give you multithreading. Because of the GIL, only one thread at a time executes Python bytecode within a single Python process, so multithreading helps only with IO-heavy work (e.g. talking to the Internet), where threads spend most of their time waiting, and not with CPU-heavy work (e.g. maths), which constantly occupies a core.
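The IO-bound case can be illustrated with a minimal sketch. It assumes `time.sleep` as a stand-in for a network wait; the `fake_io` and `run_tasks` helpers and the task counts are hypothetical and not part of the question. A sleeping thread releases the GIL, so the waits overlap instead of adding up:

```python
import time
from multiprocessing.pool import ThreadPool


def fake_io(task_id):
    """Hypothetical stand-in for an IO call; sleeping releases the GIL."""
    time.sleep(0.2)
    return task_id


def run_tasks(pool_size, n_tasks):
    """Run n_tasks simulated IO calls on a pool; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPool(pool_size) as pool:
        # map blocks until every task has finished
        results = pool.map(fake_io, range(n_tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, one_thread = run_tasks(pool_size=1, n_tasks=8)
    _, eight_threads = run_tasks(pool_size=8, n_tasks=8)
    print(f"1 thread: {one_thread:.2f}s, 8 threads: {eight_threads:.2f}s")
```

With a single thread the eight waits run back to back; with eight threads they should overlap, regardless of how many CPU cores the machine has.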

So if you have many IO-heavy tasks running at once, having that many threads is useful, even if that is more than the number of CPU cores. A very large number of threads will eventually hurt performance, but don't worry about it until you actually measure a problem. Something like 100 threads should be fine.
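One way to follow the "measure first" advice is to time the same workload at several pool sizes. The sketch below assumes a simulated 50 ms IO wait and hypothetical candidate sizes (`simulated_request`, `time_pool_size`, and `compare_pool_sizes` are illustrative names). With real URLs the timings depend on the network and the remote server, so treat it as a template rather than a benchmark:

```python
import os
import time
from multiprocessing.pool import ThreadPool


def simulated_request(_):
    """Hypothetical placeholder for one IO-bound call (e.g. fetching a URL)."""
    time.sleep(0.05)


def time_pool_size(pool_size, n_tasks=40):
    """Return the seconds needed to finish n_tasks with the given pool size."""
    start = time.perf_counter()
    with ThreadPool(pool_size) as pool:
        pool.map(simulated_request, range(n_tasks))
    return time.perf_counter() - start


def compare_pool_sizes(sizes, n_tasks=40):
    """Map each candidate pool size to its measured wall-clock time."""
    return {size: time_pool_size(size, n_tasks) for size in sizes}


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # Candidate sizes are illustrative: the core count and a few larger values.
    for size, seconds in compare_pool_sizes([cores, 11, 40, 100]).items():
        print(f"pool_size={size:>3}: {seconds:.2f}s")
```

If larger pools stop improving the total time, or start making it worse, that is the point at which the pool size is worth tuning.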
