Why can't this regex find the result?

Question

I have the Python code below. Why does the variable matched end up as [' ']? When I try the same regex on regexpal.com, it finds the expected text, |Name=A. Johnson |.

import re
a = \
'{{Infobox U.S. Cabinet |align=left |clear=yes |Name=A. Johnson |President=Andrew Johnson |President start=1865 |President end=1869 |Vice President=None |Vice President start=1865 |Vice President end=1869 |State=[[William H. Seward]] |State start=1865 |State end=1869 |War=[[Edwin M. Stanton]] |War start=1865 |War end=1868 |War 2=[[John Schofield|John M. Schofield]] |War start 2=1868 |War end 2=1869 |Treasury=[[Hugh McCulloch]] |Treasury start=1865 |Treasury end=1869 |Justice=[[James Speed]] |Justice start=1865 |Justice end=1866 |Justice 2=[[Henry Stanberry]] |Justice start 2=1866 |Justice end 2=1868 |Justice 3=[[William M. Evarts]] |Justice start 3=1868 |Justice end 3=1869 |Post=[[William Dennison (Ohio governor)|William Dennison]] |Post start=1865 |Post end=1866 |Post 2=[[Alexander Randall|Alexander W. Randall]] |Post start 2=1866 |Post end 2=1869 |Navy=[[Gideon Welles]] |Navy start=1865 |Navy end=1869 |Interior=[[John P. Usher]] |Interior date=1865 |Interior 2=[[James Harlan (senator)|James Harlan]] |Interior start 2=1865 |Interior end 2=1866 |Interior 3=[[Orville H. Browning]] |Interior start 3=1866 |Interior end 3=1869 }}'
matched = re.findall(r"\|?\s*name\s*=(.)*?\|", a, re.I)
print(matched)  # [' ']

Recommended answer

You want (.*?), not (.)*?. In the pattern you have, the group matches exactly one character and the repeat sits outside it, so the regex can consume many characters but the group keeps only one. A capture group is returned once even when it is repeated, and what it holds is the character from its last iteration, here the space just before the closing |. That is also why regexpal.com looks right: it shows the whole match, whereas re.findall returns the contents of the capture group when the pattern contains one.
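The difference can be seen on a shortened input. The following is a minimal sketch, not the original poster's code: the string sample is a hypothetical excerpt of the infobox text, and re.search is used instead of re.findall so that both the whole match and the group can be inspected.

```python
import re

# Hypothetical shortened excerpt of the infobox string from the question.
sample = "|Name=A. Johnson |President=Andrew Johnson"

# Repeat outside the group: every iteration overwrites the capture,
# so only the character from the last iteration is kept.
outside = re.search(r"name\s*=(.)*?\|", sample, re.I)
print(repr(outside.group(0)))  # the whole match
print(repr(outside.group(1)))  # the last single character captured

# Repeat inside the group: the group spans everything it consumed.
inside = re.search(r"name\s*=(.*?)\|", sample, re.I)
print(repr(inside.group(1)))
```

Comparing group(0) with group(1) in the first search shows that the whole match is what the online tester displays, while the group holds only its final iteration.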

If you move the repeat inside the capture group, as in (.*?), the group captures every character it consumes, and the returned value contains the full text instead of a single character.
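Applied to the original call, the fix might look like the sketch below. The string a is shortened here for readability (an assumption, not the full infobox text), and the final strip() step is an optional extra that the answer does not mention; it is included only because the captured value keeps the space before the closing |.

```python
import re

# Shortened stand-in for the full infobox string in the question.
a = ("{{Infobox U.S. Cabinet |align=left |clear=yes "
     "|Name=A. Johnson |President=Andrew Johnson }}")

# The repeat is now inside the group, so findall returns the full value.
matched = re.findall(r"\|?\s*name\s*=(.*?)\|", a, re.I)
print(matched)  # ['A. Johnson ']

# Optional: trim the whitespace that precedes the closing pipe.
names = [m.strip() for m in matched]
print(names)  # ['A. Johnson']
```

Using a raw string (the r prefix) keeps the backslashes in the pattern from being interpreted as Python string escapes.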
