Python re infinite execution

Problem description

I'm trying to execute this code:

import re
pattern = r"(\w+)\*([\w\s]+)*/$"
re_compiled = re.compile(pattern)
results = re_compiled.search('COPRO*HORIZON 2000                 HOR')
print(results.groups())

But Python does not respond. The process takes 100% of the CPU and does not stop. I've tried this on both Python 2.7.1 and Python 3.2 with identical results.

Recommended answer

Your regex runs into catastrophic backtracking because it has nested quantifiers (([...]+)*). Since the regex requires the string to end in / (which your example does not), the regex engine tries every possible way of dividing the string among the repetitions, in the vain hope of finding a combination that matches. That's where it gets stuck.

To illustrate, let's assume "A*BCD" as the input to your regex and see what happens:

  1. (\w+) matches A. Good.
  2. \* matches *. Yay.
  3. [\w\s]+ matches BCD. OK.
  4. / fails to match (no characters left to match). OK, let's back up one character.
  5. / fails to match D. Hum. Let's back up some more.
  6. [\w\s]+ matches BC, and the repeated [\w\s]+ matches D.
  7. / fails to match. Back up.
  8. / fails to match D. Back up some more.
  9. [\w\s]+ matches B, and the repeated [\w\s]+ matches CD.
  10. / fails to match. Back up again.
  11. / fails to match D. Back up some more, again.
  12. How about [\w\s]+ matches B, repeated [\w\s]+ matches C, repeated [\w\s]+ matches D? No? Let's try something else.
  13. [\w\s]+ matches BC. Let's stop here and see what happens.
  14. Darn, / still doesn't match D.
  15. [\w\s]+ matches B.
  16. Still no luck. / doesn't match C.
  17. Hey, the whole group is optional (...)*.
  18. Nope, / still doesn't match B.
  19. OK, I give up.
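
The combinations tried in steps 3 to 12 are the ways of cutting BCD into consecutive non-empty pieces, one piece per repetition of [\w\s]+. A minimal sketch that enumerates them, longest first piece first as a greedy engine would (the helper name splits is made up for illustration and is not part of the re module; the shorter prefixes retried in steps 13 to 18 are left out):

```python
def splits(s):
    """Yield every way to cut s into consecutive non-empty pieces."""
    if not s:
        yield []
        return
    # Try the longest first piece first, like a greedy quantifier.
    for i in range(len(s), 0, -1):
        head = s[:i]
        for rest in splits(s[i:]):
            yield [head] + rest


for pieces in splits("BCD"):
    print(pieces)

# A string of n characters can be cut in 2 ** (n - 1) ways,
# so the count doubles with every extra character.
for n in (3, 10, 32):
    print(n, 2 ** (n - 1))
```

For three characters that is only four combinations, but the count doubles with each added character, which is why the failure only becomes noticeable on longer inputs.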

Now that was a string of just three letters. Yours had about 30 characters after the *, and trying every way of splitting them would keep your computer busy until the end of days.
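
The growth can be observed safely on short inputs. The following is a rough sketch, assuming a test string of the form "A*" followed by n letters (my own choice, not from the question) and a hypothetical helper named time_failed_search; the actual timings depend on the machine and the Python version, so none are shown here.

```python
import re
import time

# The pattern from the question, with the nested quantifier ([\w\s]+)*.
slow = re.compile(r"(\w+)\*([\w\s]+)*/$")


def time_failed_search(n):
    """Time a search that must fail on 'A*' followed by n letters."""
    text = "A*" + "B" * n
    start = time.perf_counter()
    result = slow.search(text)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Keep n small: each extra character roughly doubles the work.
for n in range(10, 19, 2):
    result, elapsed = time_failed_search(n)
    print(f"n={n:2d}  match={result}  {elapsed:.4f}s")
```

Each step of two characters should take roughly four times as long as the previous one, so extrapolating to about 30 characters gives an impractically long run.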

I suppose what you're trying to do is to get the strings before and after the *, in which case, use

pattern = r"(\w+)\*([\w\s]+)$"
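
As a quick check, here is a sketch that runs this pattern against the sample string from the question. The None check and the variable names before and after are additions for illustration.

```python
import re

# Pattern from the answer: no nested quantifier and no trailing "/".
pattern = r"(\w+)\*([\w\s]+)$"
re_compiled = re.compile(pattern)

results = re_compiled.search('COPRO*HORIZON 2000                 HOR')
if results is not None:
    before, after = results.groups()
    print(repr(before))  # 'COPRO'
    print(repr(after))   # everything after the *, inner spaces included
else:
    print("no match")
```

With only one quantifier on [\w\s], a failing input makes the engine give back characters one at a time instead of re-splitting them, so a non-matching string is rejected quickly.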
