RegEx for capturing a repeating pattern

Question

I got the following regex from an earlier question about regex capture with a repeating pattern:

([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)

It takes the following string:

1h 30min: Title 
- Description Line 1
1h 30min: Title
- Description Line 1
- Description Line 2
- Description Line 3

and produces this result:

Match 1:
  "1h 30min: Title 
  - Description Line 1"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title 
               - Description Line 1"

Match 2:
  "1h 30min: Title 
  - Description Line 1
  - Description Line 2
  - Description Line 3"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title 
               - Description Line 1
               - Description Line 2
               - Description Line 3"
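
For reference, the behaviour above can be reproduced with a short sketch. Python's re module is an assumption here (the question does not name a regex engine), and the variable names are only illustrative.

```python
import re

# The regex from the question, unchanged. Without DOTALL, "." stops at a
# newline, so continuation lines are consumed by the "\n(?!...).*" part.
pattern = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

text = (
    "1h 30min: Title \n"
    "- Description Line 1\n"
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

# findall returns one (group 1, group 2, group 3) tuple per match.
results = pattern.findall(text)
for hours, minutes, body in results:
    print(repr(hours), repr(minutes), repr(body))
```

Each entry is split correctly here only because every new "1h 30min" starts its own line, which is exactly what the negative lookahead after \n relies on.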


Now the matching 1h 30min does not always occur on a new line. Say I had the following string:

1h 30min: Title 
- Description Line 1 1h 30min: Title - Description Line 1
- Description Line 2
- Description Line 3

How can I modify the regex to get the following matched result?

Match 1:
  "1h 30min: Title 
  - Description Line 1"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title 
               - Description Line 1"

Match 2:
  "1h 30min: Title - Description Line 1
  - Description Line 2
  - Description Line 3"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title - Description Line 1
               - Description Line 2
               - Description Line 3"

I thought removing the \n would do the trick, but it just ends up matching everything after the first 1h 30min.

Recommended answer

You can make this work with only minor changes; the issue is the last part of the pattern. The general form of a tempered greedy token is:

(.(?!notAllowed))+
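
To see what the lookahead does, here is a small sketch on a toy string. Python's re module and the literal END are assumptions chosen for illustration. It also compares the form above with the ordering often seen elsewhere, (?:(?!notAllowed).)+, where the lookahead is tested before each character is consumed.

```python
import re

text = "abcEND"

# Form used in this answer: consume a character, then require that the
# forbidden text does not start right after it.
after = re.match(r"(?:.(?!END))+", text).group()

# Alternative ordering: check first, then consume the character.
before = re.match(r"(?:(?!END).)+", text).group()

print(after)   # ab  (the character just before END is left out)
print(before)  # abc
```

With the lookahead placed after the dot, the single character directly before the forbidden text is left unmatched. In the question's input that character is the space separating the two entries.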

Applying the general form to your case, and adding named groups for clarity:

(?<hours>[0-9]{1,2}h)[ ]*(?<minutes>[0-9]{1,2}min):\s*(?<description>(?:.(?!\dh\s\d{1,2}min))+)
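
A minimal sketch of this pattern in Python follows, assuming the re module as the engine. Python spells named groups (?P<name>...) rather than (?<name>...), so the group syntax is adapted. The re.DOTALL flag stands in for the "dot matches newline" mode. The variable names are illustrative.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
# re.DOTALL lets "." match newlines so a description can span lines.
PATTERN = re.compile(
    r"(?P<hours>[0-9]{1,2}h)[ ]*(?P<minutes>[0-9]{1,2}min):\s*"
    r"(?P<description>(?:.(?!\dh\s\d{1,2}min))+)",
    re.DOTALL,
)

# The string from the question, where the second entry starts mid-line.
text = (
    "1h 30min: Title \n"
    "- Description Line 1 1h 30min: Title - Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

matches = [m.groupdict() for m in PATTERN.finditer(text)]
for groups in matches:
    print(groups)
```

Each description stops just before the next "1h 30min", whether that appears at the start of a line or in the middle of one.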

PS: this relies on a "dot matches newline" mode so that the description can span several lines. If you cannot turn that mode on, you may be able to use [\s\S] in place of the dot to simulate it.
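
The [\s\S] substitution can be checked against the flag-based version with a short sketch. Python's re module is again an assumption, and HEAD, LOOKAHEAD and the other names are hypothetical helpers introduced only for this example.

```python
import re

# Shared pieces of the pattern, split out only to keep the lines short.
HEAD = r"(?P<hours>[0-9]{1,2}h)[ ]*(?P<minutes>[0-9]{1,2}min):\s*"
LOOKAHEAD = r"(?!\dh\s\d{1,2}min)"

# Variant 1: plain dot plus the DOTALL flag.
dotall = re.compile(
    HEAD + r"(?P<description>(?:." + LOOKAHEAD + r")+)", re.DOTALL
)
# Variant 2: no flag; [\s\S] matches any character, newlines included.
simulated = re.compile(
    HEAD + r"(?P<description>(?:[\s\S]" + LOOKAHEAD + r")+)"
)

text = (
    "1h 30min: Title \n"
    "- Description Line 1 1h 30min: Title - Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

with_flag = [m.groupdict() for m in dotall.finditer(text)]
without_flag = [m.groupdict() for m in simulated.finditer(text)]
print(with_flag == without_flag)
```

Both variants should produce the same two matches on the question's input. The character class is mainly useful when the engine or calling code gives no way to set flags.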

regex101 demo
