Python multiprocessing BETWEEN Amazon cloud instances


Problem description

I'm looking to run a long-running Python analysis process on a few Amazon EC2 instances. The code already runs using the Python multiprocessing module and can take advantage of all cores on a single machine.

The analysis is completely parallel and each instance does not need to communicate with any of the others. All of the work is "file-based" and each process works on each file individually ... so I was planning on just mounting the same S3 volume across all of the nodes.
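To make the single-machine setup concrete, here is a minimal sketch of the kind of per-file work described above. The mount point /mnt/data and the function analyse_file are placeholders, not anything from the original post; a Pool with no arguments uses one worker per core.

```python
import glob
from multiprocessing import Pool

def analyse_file(path):
    # Placeholder per-file analysis: each worker process handles one file
    # independently, so no inter-process communication is needed.
    with open(path, 'rb') as fh:
        data = fh.read()
    return path, len(data)

if __name__ == '__main__':
    files = glob.glob('/mnt/data/*.dat')   # files on the shared/mounted storage
    with Pool() as pool:                   # defaults to one worker per CPU core
        for path, size in pool.imap_unordered(analyse_file, files):
            print(path, size)
```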

I was wondering if anyone knew of any tutorials (or had any suggestions) for setting up the multiprocessing environment so I can run it on an arbitrary number of compute instances at the same time.

Answer

The multiprocessing docs give you a good setup for running multiprocessing on multiple machines. Using S3 is a good way to share files across EC2 instances, but with multiprocessing you can also share queues and pass data between machines.
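The approach the docs describe is a remote manager: one instance serves shared queues over the network and the others connect to them. A minimal sketch of the server side, assuming an illustrative port (50000), auth key, and queue names not taken from the original answer:

```python
# Run on one coordinating EC2 instance.
import queue
from multiprocessing.managers import BaseManager

task_queue = queue.Queue()    # file paths (or other work items) to process
result_queue = queue.Queue()  # results sent back by the workers

class JobManager(BaseManager):
    pass

# Expose the two queues to remote clients.
JobManager.register('get_task_queue', callable=lambda: task_queue)
JobManager.register('get_result_queue', callable=lambda: result_queue)

if __name__ == '__main__':
    # Listen on all interfaces so worker instances can connect.
    manager = JobManager(address=('', 50000), authkey=b'change-me')
    server = manager.get_server()
    server.serve_forever()
```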

If you can use Hadoop for parallel tasks, it is a very good way to extract parallelism across machines, but if you need a lot of IPC then building your own solution with multiprocessing isn't that bad.
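The matching worker side of such a home-grown solution could look like the sketch below: each EC2 instance connects to the manager, pulls tasks, and pushes results. The host address, auth key, and analyse() body are placeholders; each worker could additionally fan tasks out to a local Pool to use all of its cores.

```python
# Run on each worker EC2 instance.
from multiprocessing.managers import BaseManager

class JobManager(BaseManager):
    pass

# Register the same names as the server, without callables, to get proxies.
JobManager.register('get_task_queue')
JobManager.register('get_result_queue')

def analyse(path):
    # Placeholder for the actual per-file analysis.
    return path, 'done'

if __name__ == '__main__':
    # '10.0.0.10' stands in for the coordinating instance's private IP.
    manager = JobManager(address=('10.0.0.10', 50000), authkey=b'change-me')
    manager.connect()
    tasks = manager.get_task_queue()
    results = manager.get_result_queue()
    while True:
        path = tasks.get()            # blocks until a task is available
        results.put(analyse(path))
```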

Just make sure you put your machines in the same security groups :-)
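For that to work, the security group also has to allow the manager's port between group members. A hedged sketch using boto3 (not mentioned in the answer; the group ID and port are placeholders):

```python
import boto3

ec2 = boto3.client('ec2')

# Allow TCP traffic on the manager port from instances in the same group.
ec2.authorize_security_group_ingress(
    GroupId='sg-0123456789abcdef0',
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 50000,
        'ToPort': 50000,
        'UserIdGroupPairs': [{'GroupId': 'sg-0123456789abcdef0'}],
    }],
)
```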
