Python performance: try-except or not in?

Question

In one of my classes I have a number of methods that all draw values from the same dictionaries. However, if one of the methods tries to access a key that isn't there, it has to call another method that computes the value and stores it under that key.

I currently have this implemented as follows, where findCrackDepth(tonnage) assigns a value to self.lowCrackDepth[tonnage]:

if tonnage not in self.lowCrackDepth:
    self.findCrackDepth(tonnage)
lcrack = self.lowCrackDepth[tonnage]

However, I could also do this:

try:
    lcrack = self.lowCrackDepth[tonnage]
except KeyError:
    self.findCrackDepth(tonnage)
    lcrack = self.lowCrackDepth[tonnage]

I assume there is a performance difference between the two that depends on how often the value is already in the dictionary. How big is this difference? I'm generating a few million such values (spread across many dictionaries in many instances of the class), and for each lookup where the value doesn't exist, there are probably two where it does.
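
To make the two variants concrete, here is a minimal self-contained sketch of such a class. The class name CrackModel, the method names depth_with_in and depth_with_try, the computations counter, and the placeholder formula are all assumptions for illustration; the question does not show the real computation.

```python
class CrackModel:
    """Hypothetical stand-in for the class described in the question."""

    def __init__(self):
        self.lowCrackDepth = {}
        self.computations = 0  # counts how often the slow path runs

    def findCrackDepth(self, tonnage):
        # Placeholder formula; the real computation is not given.
        self.computations += 1
        self.lowCrackDepth[tonnage] = tonnage * 0.5

    def depth_with_in(self, tonnage):
        # Look before you leap: test membership, then index.
        if tonnage not in self.lowCrackDepth:
            self.findCrackDepth(tonnage)
        return self.lowCrackDepth[tonnage]

    def depth_with_try(self, tonnage):
        # Ask forgiveness: index first, compute only on KeyError.
        try:
            return self.lowCrackDepth[tonnage]
        except KeyError:
            self.findCrackDepth(tonnage)
            return self.lowCrackDepth[tonnage]


model = CrackModel()
results = [model.depth_with_in(t) for t in (10, 20, 10)]
results += [model.depth_with_try(t) for t in (30, 20, 30)]
print(results, model.computations)
```

Both methods return the same values and trigger the computation exactly once per distinct key, so the choice between them is purely a matter of speed and style.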

Recommended answer

Timing this is delicate, because you need to take care to avoid "lasting side effects" (each run must start from a fresh copy of the dictionary, or the first run fills in all the missing keys), and the performance tradeoff depends on the percentage of missing keys. So, consider a dil.py file as follows:

def make(percentmissing):
  global d
  d = dict.fromkeys(range(100-percentmissing), 1)

def addit(d, k):
  d[k] = k

def with_in():
  dc = d.copy()
  for k in range(100):
    if k not in dc:
      addit(dc, k)
    lc = dc[k]

def with_ex():
  dc = d.copy()
  for k in range(100):
    try: lc = dc[k]
    except KeyError:
      addit(dc, k)
      lc = dc[k]

def with_ge():
  dc = d.copy()
  for k in range(100):
    lc = dc.get(k)
    if lc is None:
      addit(dc, k)
      lc = dc[k]

and a series of timeit calls such as:

$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_in()'
10000 loops, best of 3: 28 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_ex()'
10000 loops, best of 3: 41.7 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_ge()'
10000 loops, best of 3: 46.6 usec per loop

This shows that, with 10% missing keys, the in check is substantially the fastest way.

$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_in()'
10000 loops, best of 3: 24.6 usec per loop
$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_ex()'
10000 loops, best of 3: 23.4 usec per loop
$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_ge()'
10000 loops, best of 3: 42.7 usec per loop

With just 1% missing keys, the exception approach is marginally the fastest (and the get approach remains the slowest in either case).

So, for optimal performance, unless the vast majority (99%+) of lookups are going to succeed, the in approach is preferable.
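
The timings above come from one machine and one interpreter version, so the crossover point may sit elsewhere on your setup. A rough sketch for re-measuring it with the timeit module from inside Python follows; unlike dil.py it passes the dictionary as an argument instead of using a global, and the best_time helper and the chosen percentages are assumptions for illustration.

```python
import timeit


def make(percent_missing):
    # Keys 0 .. 99 - percent_missing are present; the rest are missing.
    return dict.fromkeys(range(100 - percent_missing), 1)


def addit(d, k):
    d[k] = k


def with_in(d):
    dc = d.copy()  # fresh copy so earlier runs leave no side effects
    for k in range(100):
        if k not in dc:
            addit(dc, k)
        lc = dc[k]
    return dc


def with_ex(d):
    dc = d.copy()
    for k in range(100):
        try:
            lc = dc[k]
        except KeyError:
            addit(dc, k)
            lc = dc[k]
    return dc


def best_time(func, d, number=1000, repeat=3):
    # Best-of-repeat time per call, in microseconds.
    runs = timeit.repeat(lambda: func(d), number=number, repeat=repeat)
    return min(runs) / number * 1e6


for pct in (1, 10, 50):
    d = make(pct)
    print("%2d%% missing: in=%.1f usec, except=%.1f usec"
          % (pct, best_time(with_in, d), best_time(with_ex, d)))
```

The lambda adds a small constant overhead to every measurement, so compare the two columns against each other rather than against the command-line figures.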

Of course, there's another, elegant possibility: adding a dict subclass like...:

class dd(dict):
  def __init__(self, *a, **k):
    dict.__init__(self, *a, **k)
  def __missing__(self, k):
    addit(self, k)
    return self[k]

def with_dd():
  dc = dd(d)
  for k in range(100):
    lc = dc[k]

However...:

$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_dd()'
10000 loops, best of 3: 46.1 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_dd()'
10000 loops, best of 3: 55 usec per loop

...while slick indeed, this is not a performance winner: it is about even with the get approach, or slower, just with much nicer-looking code to use it. (defaultdict, semantically analogous to this dd class, would be a performance win if it were applicable, but that is because its __missing__ special method is implemented in well-optimized C code.)
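
One likely reason defaultdict is not applicable here is that its default_factory is called with no arguments, so it cannot compute a value that depends on the key. The sketch below contrasts the two cases; the names visits, KeyedDict, and doubled, and the key * 2 computation, are hypothetical.

```python
from collections import defaultdict

# Key-independent default: the factory is called with no arguments,
# so every missing key starts from the same value (here int() == 0).
visits = defaultdict(int)
for page in ["home", "docs", "home"]:
    visits[page] += 1


# Key-dependent default: defaultdict never passes the key to its factory,
# so a __missing__ override is needed (as in the dd class above).
class KeyedDict(dict):
    def __missing__(self, key):
        value = key * 2  # hypothetical computation that depends on the key
        self[key] = value  # store it so later lookups take the fast path
        return value


doubled = KeyedDict()
print(visits["home"], visits["docs"], doubled[21])
```

When the value to store is the same for every missing key, defaultdict is the simpler choice; when it depends on the key, as with findCrackDepth(tonnage), the in check or a __missing__ subclass is needed.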
