Regular expression negative lookbehind of non-fixed length


Question

As the documentation says:

This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length.

So this works. The intention is to match any , outside {}, but not inside {}:

In [188]:

re.compile(r"(?<!\{)\,.").findall('a1,a2,a3,a4,{,a6}')
Out[188]:
[',a', ',a', ',a', ',{']

This also works, on a slightly different query:

In [189]:

re.compile(r"(?<!\{a5)\,.").findall('a1,a2,a3,a4,{a5,a6}')
# or this: re.compile(r"(?<!\{..)\,.").findall('a1,a2,a3,a4,{a5,a6}')
Out[189]:
[',a', ',a', ',a', ',{']

But if the query is 'a1,a2,a3,a4,{_some_length_not_known_in_advance,a6}', then according to the documentation the following won't work as intended:

In [190]:

re.compile(r"(?<![\{.*])\,.").findall('a1,a2,a3,a4,{a5,a6}')
Out[190]:
[',a', ',a', ',a', ',{', ',a']
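
Note that the attempt above compiles only because [\{.*] is a character class: it checks a single preceding character against {, . or *, which is why every comma still matches. A minimal sketch of what happens when the variable-length form is written literally, assuming Python's standard re module (the names compiled and message are only illustrative):

```python
import re

# A lookbehind whose body can match strings of different lengths
# is rejected when the pattern is compiled, not when it is used.
try:
    re.compile(r"(?<!\{.*),.")
    compiled = True
    message = ""
except re.error as exc:
    compiled = False
    message = str(exc)

print(compiled)
print(message)
```

So with the re module the variable-length lookbehind cannot be expressed directly, and the question becomes how to get the same effect another way.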

Is there any alternative to achieve this? Is negative lookbehind the wrong approach?

Is there any reason lookbehind was designed this way (only matching strings of some fixed length) in the first place?

Recommended answer

Any alternative to achieve this?

Yes. There is a brilliantly simple technique, and this situation is very similar to "regex-match a pattern unless..."

Here's your simple regex:

{[^}]*}|(,)

The left side of the alternation | matches complete { brackets } tags. We will ignore these matches. The right side matches and captures commas to Group 1, and we know they are the right commas because they were not matched by the expression on the left.
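
As a rough illustration of how Group 1 separates the two kinds of match, here is a small sketch using Python's re module. The braces are escaped for clarity, and the variable names are just illustrative, not part of the original answer:

```python
import re

# Left branch: a whole {...} block (matched, then ignored).
# Right branch: a comma outside braces, captured in Group 1.
pattern = re.compile(r"\{[^}]*\}|(,)")
text = "a1,a2,a3,a4,{_some_length_not_known_in_advance,a6}"

# group(1) is None whenever the left branch was the one that matched,
# so filtering on it keeps only the commas outside braces.
comma_positions = [
    m.start(1) for m in pattern.finditer(text) if m.group(1) is not None
]

print(comma_positions)  # [2, 5, 8, 11]
```

The comma inside the braces never shows up because the left branch consumes the whole braced block before the right branch gets a chance to see it.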

Here is a demo that performs several tasks, so you can pick and choose (see the output at the bottom of the demo):

  1. Count the commas you want to match (not those between braces)
  2. Show the matches (commas... duh)
  3. Replace the right commas. Here we replace with SplitHere so we can perform task 4...
  4. Split on the commas, and display the split strings
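
The four tasks listed above could be sketched in Python roughly as follows. This is not the original demo, only a minimal reconstruction under the assumption that the same regex is used; the helper name mark and the other variable names are hypothetical:

```python
import re

pattern = re.compile(r"\{[^}]*\}|(,)")
text = "a1,a2,a3,a4,{a5,a6}"

# Tasks 1 and 2: collect only the matches where Group 1 participated,
# then count them.
matches = [m.group(1) for m in pattern.finditer(text) if m.group(1) is not None]
count = len(matches)

# Task 3: replace the right commas with SplitHere and leave the
# braced blocks exactly as they were.
def mark(m):
    return "SplitHere" if m.group(1) is not None else m.group(0)

replaced = pattern.sub(mark, text)

# Task 4: split on the marker inserted in task 3.
parts = replaced.split("SplitHere")

print(count)     # 4
print(replaced)  # a1SplitHerea2SplitHerea3SplitHerea4SplitHere{a5,a6}
print(parts)     # ['a1', 'a2', 'a3', 'a4', '{a5,a6}']
```

Passing a function to sub is what lets the braced blocks survive the replacement: when the left branch matched, the function simply returns the original text.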

Reference

How to match (or replace) a pattern except in situations s1, s2, s3...
