Speed Up Execution, Python
Problem description
for loops are quite expensive when it comes to execution time. I am building a correction algorithm based on Peter Norvig's spelling-correction code. I modified it a bit and found that it takes too long to run on thousands of words.
The original algorithm checks candidates at edit distances 1 and 2 and corrects the word. I extended it to distance 3, which might increase the running time (I am not sure). Here is the final part, where the most frequently occurring words are used as the reference:
def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])  # this is where the problem is
    candidate_new = []
    for candidate in candidates:  # this statement isn't the problem
        if soundex(candidate) == soundex(word):
            candidate_new.append(candidate)
    return max(candidate_new, key=(NWORDS.get))
At first it looked as though the statement for candidate in candidates was increasing the execution time. Peter Norvig's original code is available online for comparison.
I've figured out the problem. It's in the statement

candidates = (known([word]).union(known(edits1(word)))
              ).union(known_edits2(word).union(known_edits3(word)) or [word])
where

def known_edits3(word):
    return set(e3 for e1 in edits1(word) for e2 in edits1(e1)
               for e3 in edits1(e2) if e3 in NWORDS)
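known_edits3 contains three nested for loops, so every string produced at one level is expanded again at the next and the work multiplies with each level rather than merely tripling. known_edits2 has only two nested loops, so known_edits3 is the culprit.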
How do I minimize this expression? Could itertools.repeat help here?
Solution
A couple of ways to increase performance here:
- Use list comprehension (or generator)
- Don't compute the same thing in each iteration
The code would reduce to:

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])
    # Compute soundex of the input word once, outside the loop
    soundex_word = soundex(word)
    # Option 1: list comprehension
    candidate_new = [candidate for candidate in candidates if soundex(candidate) == soundex_word]
    # Option 2: generator (use instead of the list). This will save memory
    candidate_new = (candidate for candidate in candidates if soundex(candidate) == soundex_word)
    return max(candidate_new, key=(NWORDS.get))
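One thing to watch with the version above: if no candidate shares the word's soundex code, max receives an empty iterable and raises ValueError. The following is a minimal sketch of one way to guard against that with the `default` argument of the built-in `max`. The names `sound_key` and `best_match` and the toy `NWORDS` counts are assumptions made only for this example; `sound_key` is a crude stand-in, not a real soundex implementation.

```python
# Toy word counts, assumed only for this example.
NWORDS = {"night": 12, "knight": 5, "nit": 1}


def sound_key(word):
    """Crude stand-in for soundex: first letter plus the remaining consonants."""
    rest = "".join(c for c in word[1:] if c not in "aeiouhwy")
    return word[0] + rest


def best_match(word, candidates):
    """Return the most frequent candidate that sounds like word, else word itself."""
    key = sound_key(word)  # computed once, outside the loop
    matching = (c for c in candidates if sound_key(c) == key)
    # default=word avoids ValueError when nothing matches
    return max(matching, key=lambda c: NWORDS.get(c, 0), default=word)


print(best_match("nigt", {"night", "knight", "nit"}))
print(best_match("zzz", {"night"}))
```

Falling back to the input word mirrors the `or [word]` idea in the original expression; returning None instead would be an equally reasonable choice.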
Another enhancement is based on the fact that you only need the candidate with the maximum count, so there is no need to build an intermediate collection:
def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])
    soundex_word = soundex(word)
    max_candidate = None
    max_nword = 0
    for candidate in candidates:
        # Treat a missing word as count 0 so the comparison never sees None
        count = NWORDS.get(candidate, 0)
        if soundex(candidate) == soundex_word and count > max_nword:
            max_candidate = candidate
            max_nword = count  # remember the best count seen so far
    return max_candidate
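The changes above only touch the final loop, while the question identifies the candidates expression as the slow part. That expression computes edits1(word) three times (once directly, once inside known_edits2, once inside known_edits3), and the nested generator in known_edits3 expands the same intermediate string again every time it reappears. A possible way to avoid the repeated work, sketched here under assumptions, is to expand level by level and deduplicate each level in a set. The helper name `known_within` and the toy `NWORDS` are hypothetical; `edits1` is written in the style of Norvig's corrector and is assumed to match the one in the question.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Toy word counts, assumed only for this example.
NWORDS = {"a": 50, "at": 7, "cat": 10, "cart": 4, "chart": 2}


def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that appear in NWORDS."""
    return {w for w in words if w in NWORDS}


def known_within(word, max_distance):
    """Known words reachable by at most max_distance applications of edits1."""
    found = known([word])
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:  # each distinct string is expanded once per level
            next_frontier |= edits1(w)
        frontier = next_frontier
        found |= known(frontier)
    return found


print(sorted(known_within("a", 3)))
```

With this helper, `known_within(word, 3) or [word]` would play the role of the candidates expression. The number of distinct strings at distance 3 still grows quickly with word length, so this only removes redundant work; restricting distance-3 lookups to words that have no closer match would be a further option.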