Speed Up Execution, Python


Problem description


for loops are quite expensive in terms of execution time. I am building a correction algorithm based on Peter Norvig's spelling-correction code. I modified it a bit and found that it takes too long to run on thousands of words.

The original algorithm generates candidates at edit distance 1 and 2 and uses them to correct the word. I extended it to edit distance 3, which might increase the running time (I am not sure). Here is the final part, where the most frequently occurring words are used as the reference:

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word]) # this is where the problem is

    candidate_new = []
    for candidate in candidates:  # this statement isn't the problem
        if soundex(candidate) == soundex(word):
            candidate_new.append(candidate)
    return max(candidate_new, key=(NWORDS.get))

At first it looked like the statement for candidate in candidates was increasing the execution time. Peter Norvig's original code is easy to look up for comparison.
I've since found the actual problem. It is in the statement

candidates = (known([word]).union(known(edits1(word)))
             ).union(known_edits2(word).union(known_edits3(word)) or [word])

where

def known_edits3(word):
    return set(e3 for e1 in edits1(word)
                  for e2 in edits1(e1)
                  for e3 in edits1(e2) if e3 in NWORDS)

known_edits3 contains three nested for loops, so each extra edit level multiplies the number of strings generated (far more than a threefold increase). known_edits2 has two nested loops. So this is the culprit.
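The cost of the nested loops can be measured directly. The following is a minimal sketch, assuming an edits1 written along the lines of Norvig's published corrector (the question does not show its own edits1, so this version is an assumption). It counts how many strings the second loop level generates for a two-letter word and how many of them are distinct.

```python
import string

ALPHABET = string.ascii_lowercase

def edits1(word):
    """All strings one edit away from word (after Norvig's spelling corrector)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

level1 = edits1("an")
# Strings produced by the second loop level, counting repeats.
raw_level2 = sum(len(edits1(e1)) for e1 in level1)
# Distinct strings at the second level.
level2 = set().union(*(edits1(e1) for e1 in level1))

print("distance 1:", len(level1))
print("distance 2:", len(level2), "distinct out of", raw_level2, "generated")
```

A third level repeats the same expansion for every string of the second level, so the work grows by roughly another factor of the size of edits1, and many of the generated strings are repeats.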

How do I make this expression cheaper? Could itertools.repeat help here?

Solution

A couple of ways to increase performance here:

  1. Use a list comprehension (or a generator expression)
  2. Don't compute the same thing in each iteration

The code would reduce to:

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])

    # Compute soundex outside the loop
    soundex_word = soundex(word)

    # Option 1: list comprehension
    # candidate_new = [candidate for candidate in candidates if soundex(candidate) == soundex_word]

    # Option 2: generator expression; avoids building an intermediate list
    candidate_new = (candidate for candidate in candidates if soundex(candidate) == soundex_word)

    return max(candidate_new, key=(NWORDS.get))
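The same "don't compute it twice" idea extends across calls: when thousands of words are corrected, the same candidates tend to recur, so their Soundex codes can be cached. The sketch below is only an illustration. The soundex function here is a simplified stand-in written for this example (it does not apply the special h/w rule of full Soundex), and filter_by_sound is a hypothetical helper name, neither is taken from the original post. functools.lru_cache is the standard-library memoization decorator.

```python
from functools import lru_cache

# Letter-to-digit table used by Soundex.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit

@lru_cache(maxsize=None)
def soundex(word):
    """Simplified Soundex (stand-in): first letter plus three digits."""
    word = word.lower()
    if not word:
        return ""
    digits = []
    previous = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        # Skip uncoded letters and immediate repeats of the same code.
        if code and code != previous:
            digits.append(code)
        previous = code
    return (word[0].upper() + "".join(digits) + "000")[:4]

def filter_by_sound(word, candidates):
    """Keep the candidates whose Soundex code matches that of word."""
    target = soundex(word)  # computed once, outside the loop
    return [c for c in candidates if soundex(c) == target]

print(filter_by_sound("robert", ["rupert", "rabbit", "robot"]))
print(soundex.cache_info())  # hit/miss counters of the cache
```

With maxsize=None the cache grows without bound, which is usually acceptable for a fixed vocabulary. A bounded maxsize trades some hits for memory.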

Another improvement relies on the fact that you only need the single best (maximum) candidate, so no intermediate collection is needed:

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])

    soundex_word = soundex(word)
    max_candidate = None
    max_nword = 0
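    for candidate in candidates:
        count = NWORDS.get(candidate, 0)  # treat words missing from NWORDS as count 0
        if soundex(candidate) == soundex_word and count > max_nword:
            max_candidate = candidate
            max_nword = count  # remember the best count seen so far
    return max_candidate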
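The changes above speed up the filtering loop, while the question identifies known_edits3 itself as the bottleneck. One possible way to reduce its cost, offered here as a sketch and not as part of the original answer, is to collect each edit level into a set before expanding it, so that edits1 runs only once per distinct string. The edits1 below follows Norvig's published corrector (an assumption, since the question does not show its own version), and known_edits3_dedup and the tiny NWORDS table are hypothetical names and data for illustration.

```python
import string

ALPHABET = string.ascii_lowercase

def edits1(word):
    """All strings one edit away from word (after Norvig's spelling corrector)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def known_edits3_dedup(word, nwords):
    """Known words within three edits, expanding each distinct string once."""
    level1 = edits1(word)
    level2 = set()
    for e1 in level1:
        level2 |= edits1(e1)  # the set union drops repeated strings
    # Only distinct level-2 strings are expanded a third time.
    return {e3 for e2 in level2 for e3 in edits1(e2) if e3 in nwords}

# Hypothetical stand-in for the NWORDS frequency table.
NWORDS = {"cat": 10, "coat": 4, "zebra": 1}
print(sorted(known_edits3_dedup("c", NWORDS)))
```

This returns the same set as the triple-nested version, because the result is a set either way. It only avoids repeated edits1 calls at the third level. The number of distinct strings still grows quickly with word length, so distance 3 remains expensive for long words.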
