Python: How to run nested parallel processes in Python?
Question
I have a dataset df of trader transactions. I have two levels of for loops, as follows:
smartTrader = []
for asset in Assets:  # assumes Assets holds the values found in df['Assets']
    # Filter into a new variable so the full df is not overwritten
    asset_df = df[df['Assets'] == asset]
    # I have some more calculations here
    for trader in asset_df['TraderID']:
        # I have some calculations here. If the trader is successful, I add his ID
        # to the list as follows
        smartTrader.append(trader)
    # some more calculations here which are related to the first for loop.
I would like to parallelise the calculations for each asset in Assets, and I also want to parallelise the calculations for each trader within every asset. After ALL these calculations are done, I want to do additional analysis based on the smartTrader list.
This is my first attempt at parallel processing, so please be patient with me, and I appreciate your help.
Recommended answer
If you use pathos, which provides a fork of multiprocessing, you can easily nest parallel maps. pathos is built for easily testing combinations of nested parallel maps, which are direct translations of nested for loops.
It provides a selection of maps that are blocking, non-blocking, iterative, asynchronous, serial, parallel, and distributed.
>>> from pathos.pools import ProcessPool, ThreadPool
>>> amap = ProcessPool().amap
>>> tmap = ThreadPool().map
>>> from math import sin, cos
>>> print(amap(tmap, [sin, cos], [range(10), range(10)]).get())
[[0.0, 0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282, -0.9589242746631385, -0.27941549819892586, 0.6569865987187891, 0.9893582466233818, 0.4121184852417566], [1.0, 0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354, -0.9111302618846769]]
This example uses a process pool and a thread pool, where the thread map call is blocking, while the process map call is asynchronous (note the get at the end of the last line).
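To connect the nested-map idea back to the asset/trader loops in the question, here is a minimal sketch of the same two-level shape. It is not the pathos API: it uses the standard library's concurrent.futures as a stand-in, with an outer process pool over assets and an inner thread pool over traders. The data layout (a dict of asset -> trader ID -> profits), the success rule, and the names is_successful, smart_traders_for_asset, and find_smart_traders are all assumptions made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def is_successful(trades):
    """Hypothetical success rule: a trader's total profit is positive."""
    return sum(trades) > 0


def smart_traders_for_asset(trades_by_trader):
    """Inner level: evaluate every trader of one asset in a thread pool."""
    trader_ids = list(trades_by_trader)
    with ThreadPoolExecutor() as threads:
        # map() blocks until all traders are evaluated and preserves input order.
        flags = list(threads.map(is_successful, trades_by_trader.values()))
    return [tid for tid, ok in zip(trader_ids, flags) if ok]


def find_smart_traders(trades_by_asset, outer_pool_cls=ProcessPoolExecutor):
    """Outer level: one task per asset; returns only after every asset is done."""
    with outer_pool_cls() as pool:
        per_asset = list(pool.map(smart_traders_for_asset, trades_by_asset.values()))
    # Flatten the per-asset lists into one list for the follow-up analysis.
    return [tid for ids in per_asset for tid in ids]


if __name__ == "__main__":
    # Made-up data: asset -> trader ID -> profit of each trade.
    trades_by_asset = {
        "AAA": {"t1": [5.0, -2.0], "t2": [-1.0, -3.0]},
        "BBB": {"t3": [0.5, 0.5], "t1": [-4.0, 1.0]},
    }
    print(find_smart_traders(trades_by_asset))
```

Because both map calls block, the flattened list is complete when find_smart_traders returns, which matches the requirement that the extra analysis starts only after ALL calculations are done. Whether the inner level benefits from threads depends on whether the per-trader work releases the GIL; for pure-Python arithmetic, parallelising only the outer level may be enough.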
Get pathos from https://github.com/uqfoundation or install it with:
$ pip install git+https://github.com/uqfoundation/pathos.git@master