Running Python script in parallel


Problem description

I have a huge dataset of videos that I process using a Python script called process.py. The problem is that it takes a lot of time to process the whole dataset, which contains 6000 videos. So I came up with the idea of dividing the dataset into, say, 4 parts, copying the same code into different Python scripts (e.g. process1.py, process2.py, process3.py, process4.py), and running each one in a different shell on one portion of the dataset.

My question is: would that bring me anything in terms of performance? I have a machine with 10 cores, so it would be very beneficial if I could somehow exploit this multicore structure. I have heard about Python's multiprocessing module, but unfortunately I don't know much about it, and I didn't write my script with its features in mind. Is the idea of starting each script in a different shell nonsense? Is there a way to choose which core each script will use?

Recommended answer

The multiprocessing documentation (https://docs.python.org/2/library/multiprocessing.html) is actually fairly easy to digest. This section (https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers) should be particularly relevant.

You definitely do not need multiple copies of the same script. Here is an approach you can adopt.

Assume this is the general structure of your existing script (process.py):

def convert_vid(fname):
    # do the heavy lifting
    # ...
    pass

if __name__ == '__main__':
    # There exist VIDEO_SET_1 to 4, as mentioned in your question
    for file in VIDEO_SET_1:
        convert_vid(file)

With multiprocessing, you can run the function convert_vid in separate processes. Here is the general scheme:

from multiprocessing import Pool

def convert_vid(fname):
    # do the heavy lifting
    # ...
    pass

if __name__ == '__main__':
    # map() hands individual file names to the 4 worker processes,
    # so the video sets are concatenated into one flat list
    pool = Pool(processes=4)
    pool.map(convert_vid, VIDEO_SET_1 + VIDEO_SET_2 + VIDEO_SET_3 + VIDEO_SET_4)
    pool.close()
    pool.join()
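As a complete, runnable sketch of the same scheme (the file names and the per-file work here are placeholders standing in for your real VIDEO_SET lists and video processing), this version also answers the core-selection question: you don't need to pin scripts to cores, because the OS scheduler spreads the pool's worker processes across cores automatically:

```python
from multiprocessing import Pool, cpu_count

def convert_vid(fname):
    # Stand-in for the real per-video work done in process.py.
    return fname + '.done'

if __name__ == '__main__':
    # Hypothetical flat list of files, standing in for
    # VIDEO_SET_1 + VIDEO_SET_2 + VIDEO_SET_3 + VIDEO_SET_4.
    all_files = ['vid1.mp4', 'vid2.mp4', 'vid3.mp4', 'vid4.mp4']

    # One worker per core (10 on your machine); map() blocks until
    # every file has been processed and returns results in order.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(convert_vid, all_files)
    print(results)
```

If you really do want manual core pinning, on Linux `os.sched_setaffinity` can restrict a process to specific cores, but for a CPU-bound workload like this the default scheduling is usually just as good.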
