Parallelism in Python


Question

What are the options for achieving parallelism in Python? I want to perform a bunch of CPU bound calculations over some very large rasters, and would like to parallelise them. Coming from a C background, I am familiar with three approaches to parallelism:

  1. Message passing processes, possibly distributed across a cluster, e.g. MPI.
  2. Explicit shared memory parallelism, either using pthreads or fork(), pipe(), et al.
  3. Implicit shared memory parallelism, using OpenMP.

Deciding on an approach to use is an exercise in trade-offs.

In Python, what approaches are available and what are their characteristics? Is there a clusterable MPI clone? What are the preferred ways of achieving shared memory parallelism? I have heard reference to problems with the GIL, as well as references to tasklets.

In short, what do I need to know about the different parallelization strategies in Python before choosing between them?

Answer

Generally, you describe a CPU bound calculation. This is not Python's forte. Neither, historically, is multiprocessing.

Threading in the mainstream Python interpreter has been ruled by a dreaded global lock. The new multiprocessing API works around that and gives a worker pool abstraction with pipes and queues and such.
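
For example (not part of the original answer), a minimal sketch of that worker-pool approach with multiprocessing.Pool; the tile splitting below is hypothetical, standing in for chunks of a large raster:

```python
from multiprocessing import Pool

def process_tile(tile):
    """CPU-bound work on one independent chunk of data (placeholder computation)."""
    return sum(x * x for x in tile)

if __name__ == "__main__":
    # Hypothetical input: a large dataset split into independent chunks.
    tiles = [range(i, i + 100000) for i in range(0, 1000000, 100000)]
    with Pool() as pool:  # by default, one worker process per CPU core
        results = pool.map(process_tile, tiles)  # fan out, then gather
    print(sum(results))
```

Because each chunk runs in a separate process, the GIL is not a bottleneck; the trade-off is that the work items and results must be picklable and are copied between processes.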

You can write your performance critical code in C or Cython, and use Python for the glue.
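
As a rough illustration of the Cython route (a hedged sketch; the module name fastsum.pyx and its function are made up for this example), a performance-critical inner loop might look like:

```cython
# fastsum.pyx -- hypothetical module; build it with cythonize("fastsum.pyx").
# Static C types in the loop avoid per-element Python object overhead.
def squared_sum(double[:] data):
    cdef double total = 0.0
    cdef Py_ssize_t i
    for i in range(data.shape[0]):
        total += data[i] * data[i]
    return total
```

From Python, the compiled function is imported and called like any other module, which is the "Python for the glue" part.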

