Need to create a large number of new processes very quickly in Windows/Python


Problem description

In order to test some security software, I need to be able to create a large (configurable) number of new processes (not threads!) in Windows, very quickly, have them exist for a (configurable) period of time, then terminate cleanly. The processes shouldn't do anything at all - just exist for the specified duration.

Ultimately, I want to be able to run something like:

C:\> python process_generate.py --processes=150 --duration=2500

which would create 150 new processes very quickly, keep them all alive for 2500ms, then have them all terminate as quickly as possible.
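A minimal sketch of what such a `process_generate.py` could look like (the `argparse` flags match the command line above, but the `run_load` helper and `idle` worker are my own illustrative names, not the asker's actual script):

```python
import argparse
import time
from multiprocessing import Process

def idle(duration_ms):
    # Each worker does nothing at all - it just exists for the requested duration.
    time.sleep(duration_ms / 1000.0)

def run_load(num_processes, duration_ms):
    # Spawn all workers as fast as possible...
    workers = [Process(target=idle, args=(duration_ms,))
               for _ in range(num_processes)]
    for w in workers:
        w.start()
    # ...then wait for each one to exit cleanly on its own.
    for w in workers:
        w.join()
    return len(workers)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--processes', type=int, default=150,
                        help='number of processes to create')
    parser.add_argument('--duration', type=int, default=2500,
                        help='lifetime of each process in milliseconds')
    args = parser.parse_args()
    print("Started and reaped %d processes" % run_load(args.processes, args.duration))
```

Having the workers sleep and exit on their own (rather than being terminated) keeps the shutdown clean, which is what the question asks for.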

As a starting point, I ran

from multiprocessing import Process
import os

def f():
    pass

if __name__ == '__main__':
    import datetime
    count = 0
    starttime = datetime.datetime.now()
    while True:
        p = Process(target=f)
        p.start()
        p.terminate()
        count += 1
        if count % 1000 == 0:
            now = datetime.datetime.now()
            print "Started & stopped %d processes in %s seconds" % (count, now - starttime)

and found I could create and terminate about 70 processes/second serially on my laptop, with the created processes terminating straightaway. The approx 70 processes/second rate was sustained over about an hour duration.

When I changed my code to

from multiprocessing import Process
import os
import time

def f_sleep():
    time.sleep(1)

if __name__ == '__main__':
    import datetime
    starttime = datetime.datetime.now()

    processes = []
    PROCESS_COUNT = 100
    for i in xrange(PROCESS_COUNT):
        p = Process(target=f_sleep)
        processes.append(p)
        p.start()
    for i in xrange(PROCESS_COUNT):
        processes[i].terminate()
    now = datetime.datetime.now()
    print "Started/stopped %d processes in %s seconds" % (len(processes), str(now-starttime))

and tried different values for PROCESS_COUNT, I expected it to scale a lot better than it did. I got the following results for different values of PROCESS_COUNT:

  • 20 processes completed in 0.72 seconds
  • 30 processes completed in 1.45 seconds
  • 50 processes completed in 3.68 seconds
  • 100 processes completed in 14 seconds
  • 200 processes completed in 43 seconds
  • 300 processes completed in 77 seconds
  • 400 processes completed in 111 seconds

This is not what I expected - I expected to be able to scale up the parallel process count in a reasonably linear fashion till I hit a bottleneck, but I seem to be hitting a process creation bottleneck almost straightaway. I definitely expected to be able to create something close to 70 processes/second before hitting a process creation bottleneck, based on the first code I ran.

Without going into the full specs, the laptop runs fully patched Windows XP, has 4Gb RAM, is otherwise idle and is reasonably new; I don't think it'd be hitting a bottleneck this quickly.

Am I doing anything obviously wrong here with my code, or is XP/Python parallel process creation really that inefficient on a 12 month old laptop?

Recommended answer

After profiling and testing a bunch of different scenarios, I found that it's simply far faster to generate and kill single processes serially under Windows than to generate N processes at once, kill all N, and then restart N again.

My conclusion is that Windows keeps enough resource available to be able to start 1 process at a time quite quickly, but not enough to start >1 new concurrent processes without considerable delay. As others have said, Windows is slow at starting new processes, but apparently the speed degrades semi-geometrically with the number of concurrent processes already running on the system - starting a single process is quite fast, but when you're kicking off multiple processes you hit problems. This applies regardless of the number of CPUs that exist, how busy the machine is (typically <5% CPU in my testing), whether Windows is running on a physical server or virtual, how much RAM is available (I tested with up to 32Gb RAM, with ~24Gb free), ... - it simply seems to be a limitation of the Windows OS. When I installed Linux on the same hardware, the limitation went away (as per Xavi's response) and we were able to start many processes concurrently, very quickly.
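The serial-versus-batch difference the answer describes can be reproduced with a small timing sketch (my own illustration, not code from the answer): one helper starts and reaps processes one at a time, the other starts all N before terminating any.

```python
import time
from multiprocessing import Process

def f():
    # Trivial target - the process exists and does nothing.
    pass

def serial_cycle(n):
    """Start, terminate and reap n processes one at a time."""
    start = time.time()
    for _ in range(n):
        p = Process(target=f)
        p.start()
        p.terminate()
        p.join()  # reap before starting the next one
    return time.time() - start

def batch_cycle(n):
    """Start all n processes first, then terminate and reap them all."""
    start = time.time()
    procs = [Process(target=f) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.terminate()
    for p in procs:
        p.join()
    return time.time() - start

if __name__ == '__main__':
    n = 20
    print("serial: %.2fs  batch: %.2fs" % (serial_cycle(n), batch_cycle(n)))
```

On the Windows machines described above, the batch variant should degrade much faster as `n` grows; on Linux the gap largely disappears, matching the answer's observation.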
