python正则表达式向前看正负 [英] python regex look ahead positive + negative

查看:264
本文介绍了python正则表达式向前看正负的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此正则表达式将得到456.我的问题是为什么它不能从1-234-56变为234? 56是否限定(?!\ d))模式,因为它不是一位数字. (?!\ d))寻找的起点在哪里?

This regex will get 456. My question is why it CANNOT be 234 from 1-234-56 ? Does 56 qualify the (?!\d)) pattern since it is NOT a single digit. Where is the beginning point that (?!\d)) will look for?

import re
pattern = re.compile(r'\d{1,3}(?=(\d{3})+(?!\d))')
a = pattern.findall("The number is: 123456") ; print(a)

在第一阶段添加逗号分隔符,例如123,456.

It is in the first stage to add the comma separator like 123,456.

a = pattern.findall("The number is: 123456") ; print(a)
results = pattern.finditer('123456')
  for result in results:
    print ( result.start(), result.end(), result)

推荐答案

我的问题是为什么它不能从1-234-56变为234?

这是不可能的,因为(?=(\d{3})+(?!\d))要求1位数至3位数的序列后出现3位数的序列. 56(您想象中的场景中的最后一个数字组)是一个2位数的组.由于量词可以是懒惰的,也可以是贪婪的,因此您不能同时将一个,两个和三个数字组与\d{1,3}进行匹配.要从123456获得234,您需要为其专门定制的正则表达式: \B\d{3} (?<=1)\d{3} 甚至是

It is not possible as (?=(\d{3})+(?!\d)) requires 3-digit sequences appear after a 1-3-digit sequence. 56 (the last digit group in your imagined scenario) is a 2-digit group. Since a quantifier can be either lazy or greedy, you cannot match both one, two and three digit groups with \d{1,3}. To get 234 from 123456, you'd need a specifically tailored regex for it: \B\d{3}, or (?<=1)\d{3} or even \d{3}(?=\d{2}(?!\d)).

56是否匹配(?!\d))模式? (?!\ d))寻找的起点在哪里?

Does 56 match the (?!\d)) pattern? Where is the beginning point that (?!\d)) will look for?

否,这是一个否定的超前查询,它不匹配,它仅检查输入字符串中当前位置之后是否没有数字.如果有数字,则匹配失败(找不到并返回结果).

No, this is a negative lookahead, it does not match, it only checks if there is no digit right after the current position in the input string. If there is a digit, the match is failed (not result found and returned).

关于前瞻的更多说明:它位于(\d{3})+子模式之后,因此正则表达式引擎会在最后一个3位数字组之后立即开始搜索一个数字,如果找到该数字,则匹配失败(因为它是负面的前瞻).用简单的话来说, (?!\d)是此正则表达式中的数字闭合/跟踪边界.

More clarification on the look-ahead: it is located after (\d{3})+ subpattern, thus the regex engine starts searching for a digit right after the last 3-digit group, and fails a match if the digit is found (as it is a negative lookahead). In plain words, the (?!\d) is a number closing/trailing boundary in this regex.

更详细的细分:

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆