Is it worth using Python's re.compile?

Question

Is there any benefit in using compile for regular expressions in Python?

import re

h = re.compile('hello')
h.match('hello world')

vs.

re.match('hello', 'hello world')

Recommended answer

I've had a lot of experience running a compiled regex 1000s of times versus compiling on-the-fly, and have not noticed any perceivable difference. Obviously, this is anecdotal, and certainly not a great argument against compiling, but I've found the difference to be negligible.
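One way to check this claim on your own machine is a small timeit comparison. The sketch below is only illustrative: the helper name time_both, the pattern, and the iteration count are assumptions made for this example, and the absolute numbers will vary with the Python version and hardware.

```python
import re
import timeit

PATTERN = 'hello'
TEXT = 'hello world'


def time_both(number=100_000):
    """Time a precompiled match against a module-level re.match call.

    Returns (precompiled_seconds, module_level_seconds).
    """
    compiled = re.compile(PATTERN)
    # Calls the pattern object's method directly.
    precompiled_s = timeit.timeit(lambda: compiled.match(TEXT), number=number)
    # Goes through re.match, which looks the pattern up on every call.
    module_level_s = timeit.timeit(lambda: re.match(PATTERN, TEXT), number=number)
    return precompiled_s, module_level_s


if __name__ == '__main__':
    pre, mod = time_both()
    print(f'precompiled: {pre:.4f}s  re.match: {mod:.4f}s')
```

Whatever gap shows up is the per-call overhead of the module-level function, not the cost of compiling the pattern again.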

After a quick glance at the actual Python 2.5 library code, I see that Python internally compiles AND CACHES regexes whenever you use them anyway (including calls to re.match()), so you're really only changing WHEN the regex gets compiled, and shouldn't be saving much time at all - only the time it takes to check the cache (a key lookup on an internal dict type).

From module re.py (comments are mine):

def match(pattern, string, flags=0):
    return _compile(pattern, flags).match(string)

def _compile(*key):

    # Does cache check at top of function
    cachekey = (type(key[0]),) + key
    p = _cache.get(cachekey)
    if p is not None: return p

    # ...
    # Does actual compilation on cache miss
    # ...

    # Caches compiled regex
    if len(_cache) >= _MAXCACHE:
        _cache.clear()
    _cache[cachekey] = p
    return p
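The cache can also be observed from the outside without touching private names. The following is a minimal sketch that assumes CPython, where compiling the same pattern twice returns the very same pattern object; this is an implementation detail rather than a documented guarantee. re.purge() is the public function that clears the cache.

```python
import re

# Start from an empty cache so the result does not depend on earlier calls.
re.purge()

first = re.compile('hello')
second = re.compile('hello')
# On CPython the second call is a cache hit and returns the same object.
same_object = first is second

# After clearing the cache, the pattern has to be compiled again.
re.purge()
third = re.compile('hello')
# 'first' is still referenced, so a fresh compilation must be a new object.
recompiled = third is not first

print(same_object, recompiled)
```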

I still often pre-compile regular expressions, but only to bind them to a nice, reusable name, not for any expected performance gain.
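As an illustration of that naming benefit, a pattern can be compiled once at module level and reused wherever it is needed. The names HELLO_RE and starts_with_hello below are hypothetical, chosen only for this sketch.

```python
import re

# Compiled once at import time and given a descriptive name.
HELLO_RE = re.compile(r'hello')


def starts_with_hello(text):
    """Return True if text begins with 'hello'."""
    # match() only matches at the start of the string.
    return HELLO_RE.match(text) is not None


print(starts_with_hello('hello world'))
print(starts_with_hello('say hello'))
```

The name documents the intent at every call site, and the pattern text lives in a single place if it ever needs to change.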
