Could threading or multiprocessing improve performance when analyzing a single string with multiple regular expressions?

Question

If I want to analyze a string using dozens of regular-expressions,
could either the threading or multiprocessing module improve performance?
In other words, would analyzing the string on multiple threads or processes be faster than:

match = re.search(regex1, string)
if match:
    afunction(match)
else:
    match = re.search(regex2, string)
    if match:
        bfunction(match)
    else:
        match = re.search(regex3, string)
        if match:
            cfunction(match)
...

No more than one regular expression would ever match, so that's not a concern.
If the answer is multiprocessing, what technique would you recommend looking into (queues, pipes)?

Recommended answer

Python threading won't improve performance here. In CPython, the GIL allows only one thread to execute Python code at a time, so regex searches started in different threads still run one after another. On a multicore machine, multiple processes may speed things up, but only if the cost of spawning the processes and passing data between them is less than the cost of performing the RE searches themselves.
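One way to judge that trade-off is to time the sequential version before reaching for processes. The sketch below is only an illustration: the patterns, the handler functions, and the sample string are hypothetical stand-ins, since the question does not show its real regexes. It rewrites the nested if/else chain as a loop over precompiled (regex, handler) pairs and reports the average time per string.

```python
import re
import timeit


# Hypothetical handlers standing in for afunction, bfunction, cfunction.
def handle_date(match):
    return ("date", match.group(0))


def handle_code(match):
    return ("code", match.group(0))


def handle_foo(match):
    return ("foo", match.group(0))


# Hypothetical patterns, compiled once and tried in order.
DISPATCH = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), handle_date),
    (re.compile(r"[A-Z]{3}\d+"), handle_code),
    (re.compile(r"\bfoo\b"), handle_foo),
]


def analyze(text):
    """Call the handler of the first regex that matches; None if none match."""
    for regex, handler in DISPATCH:
        match = regex.search(text)
        if match:
            return handler(match)
    return None


if __name__ == "__main__":
    text = "order ABC123 shipped"
    runs = 10_000
    seconds = timeit.timeit(lambda: analyze(text), number=runs)
    print(f"{seconds / runs * 1e6:.2f} microseconds per string")
```

If the measured time per string is small, the overhead of starting processes and pickling the string for each of them will probably outweigh any gain from running the searches in parallel.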

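If you do this often, you might look into a pool of worker processes, which pays the process startup cost once instead of on every call. A thread pool would run into the same GIL limit described above.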
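A minimal sketch of the pool idea follows, using multiprocessing.Pool because threads are limited by the GIL for this workload. The patterns, the helper names try_pattern and search_parallel, and the sample string are assumptions made for illustration, not part of the original question. Each task carries one pattern plus the full text, so the text is pickled and sent once per pattern; that is the data-passing cost the answer warns about.

```python
import multiprocessing
import re

# Hypothetical patterns, for illustration only.
PATTERNS = [r"\d{4}-\d{2}-\d{2}", r"[A-Z]{3}\d+", r"\bfoo\b"]


def try_pattern(task):
    """Worker: run one regex against the text; return (index, span) or None."""
    index, pattern, text = task
    match = re.search(pattern, text)
    # Match objects cannot be pickled, so send back plain data instead.
    return (index, match.span()) if match else None


def search_parallel(text, patterns, pool):
    """Try every pattern in the pool; return the first hit in pattern order."""
    tasks = [(i, p, text) for i, p in enumerate(patterns)]
    # imap yields results lazily, in the same order as the tasks.
    for result in pool.imap(try_pattern, tasks):
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    # Create the pool once and reuse it for every string to be analyzed.
    with multiprocessing.Pool(processes=2) as pool:
        for text in ["order ABC123 shipped", "due 2024-01-31", "no match"]:
            print(text, "->", search_parallel(text, PATTERNS, pool))
```

The pool handles the communication between processes itself, so no explicit queues or pipes are needed. Because the worker returns only the pattern index and span, the parent process still has to call the matching handler, re-running the search locally if it needs the full match object.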
