Python: How to create regex for inline ordered list?

Views: 132 | Published: 2017/5/31 3:00:39 | python regex django


Problem description

I have a form field that must contain only an inline ordered list:

    1. This item may contain characters, symbols or numbers. 2. And this item also...

The following code does not work for validating user input (users may enter only an inline ordered list):

    definiton_re = re.compile(r'^(?:\d\.\s(?:.+?))+$')
    validate_definiton = RegexValidator(definiton_re, _("Enter a valid 'definition' in format: 1. meaning #1, 2. meaning #2...etc"), 'invalid')

P.S.: Here I'm using the RegexValidator class from the Django framework to validate the form field value.

Solution

Nice solution from the OP. To push it further, let's do some regex optimization / golfing:

    (?<!\S)\d{1,2}\.\s((?:(?!,\s\d{1,2}\.),?[^,]*)+)

Here's what's new:

(?:^|\s) has to backtrack between the two alternatives. Here we use (?<!\S) instead, to assert that the current position is not preceded by a non-whitespace character.

\d{1,2}\.\s doesn't have to be inside a non-capturing group.

(.+?)(?=(?:, \d{1,2}\.)|$) is too bulky. We change this bit to:

    (                     Capturing group
      (?:
        (?!               Negative lookahead: assert that our position is NOT at:
          ,\s\d{1,2}\.    a comma, a whitespace character, then a list index.
        )
        ,?[^,]*           The interesting optimization, explained after this breakdown.
      )+                  Keep repeating the group until the negative lookahead assertion fails.
    )

With ,?[^,]* we match a comma if there is one. We know from the lookahead assertion that this position does not start a new list index, so we can safely assume that the following run of non-comma characters (if any) does not belong to the next element. Hence we roll over it with the * quantifier, and there's no backtracking. This is a significant improvement over (.+?).
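
The breakdown above maps naturally onto Python's re.VERBOSE mode, where whitespace in the pattern is ignored and # starts a comment. The following is only a sketch of that idea: the name ITEM_RE and the sample strings are assumptions chosen for illustration, and the pattern itself is the one from this answer, unchanged.

```python
import re

# The answer's pattern, spread over several lines with re.VERBOSE.
# Neither a literal space nor '#' occurs in the pattern, so nothing needs escaping.
ITEM_RE = re.compile(r"""
    (?<!\S)              # not preceded by a non-whitespace character
    \d{1,2}\.\s          # list index: one or two digits, a dot, one whitespace
    (                    # capturing group: the item text
      (?:
        (?!,\s\d{1,2}\.) # stop when a comma, whitespace and a new index follow
        ,?[^,]*          # an optional comma, then a run of non-comma characters
      )+
    )
""", re.VERBOSE)

# Illustrative input (an assumption, not taken from the question).
sample = '1. List item #1, 2. List item 2, 3. List item #3.'

# With a single capturing group, findall() returns the captured item texts.
items = ITEM_RE.findall(sample)
print(items)
```

Commas inside an item are kept as long as they are not followed by whitespace and a new index, so '1. a, b, 2. c' yields the two items 'a, b' and 'c'.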



You can use it in place of the regex in the other answer.
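
Going back to the first point about (?:^|\s) versus (?<!\S): besides removing the alternation, the lookbehind is zero-width, so the whitespace before an index never becomes part of the match. A small sketch comparing only the index part of the two patterns (the variable names and the sample text are assumptions for illustration):

```python
import re

text = '1. first, 2. second'

# Index pattern with the alternation: the whitespace before the index is consumed.
with_alternation = re.compile(r'(?:^|\s)\d{1,2}\.\s')
# Index pattern with the lookbehind: only the index itself is matched.
with_lookbehind = re.compile(r'(?<!\S)\d{1,2}\.\s')

alternation_hits = [m.group() for m in with_alternation.finditer(text)]
lookbehind_hits = [m.group() for m in with_lookbehind.finditer(text)]

print(alternation_hits)  # ['1. ', ' 2. ']
print(lookbehind_hits)   # ['1. ', '2. ']
```

Both variants still refuse an index that is glued to a preceding word, such as the "1." in "v1. x".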



That said, at first glance this problem is more simply solved with re.split() when parsing:

    import re

    text = '1. List item #1, 2. List item 2, 3. List item #3.'
    lines = re.split(r'(?:^|, )\d{1,2}\. ', text)
    # Gives ['', 'List item #1', 'List item 2', 'List item #3.']
    if lines[0] == '':
        # Throw away the first empty element produced by the split.
        lines = lines[1:]
    print(lines)

Unfortunately, for validation you have to follow the regex-matching approach; just compile the regex above:

    regex = re.compile(r'(?<!\S)\d{1,2}\.\s((?:(?!,\s\d{1,2}\.),?[^,]*)+)')
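
One thing to keep in mind when validating: as far as I know, Django's RegexValidator tests a value with the pattern's search() method, so an unanchored pattern accepts any value that merely contains one list item somewhere. The sketch below mimics that check with plain re (no Django import). The helper passes_search and the anchored pattern WHOLE_LIST_RE are assumptions added for illustration and are not part of the answer. Treat it as one possible way to require that the whole value is an inline ordered list.

```python
import re

# Regex from the answer, unchanged (unanchored).
ITEM_RE = re.compile(r'(?<!\S)\d{1,2}\.\s((?:(?!,\s\d{1,2}\.),?[^,]*)+)')

# Assumed anchored variant (not from the answer): one item, then any number
# of ", <index>. <item>" repetitions, covering the whole value.
# An item is a run of non-commas plus any commas that do not start a new index.
ITEM = r'\d{1,2}\.\s[^,]*(?:,(?!\s\d{1,2}\.)[^,]*)*'
WHOLE_LIST_RE = re.compile(r'\A' + ITEM + r'(?:,\s' + ITEM + r')*\Z')


def passes_search(pattern, value):
    """Hypothetical stand-in for a search()-based validator check."""
    return pattern.search(value) is not None


good = '1. meaning #1, 2. meaning #2'
bad = 'free text 1. meaning #1'

# The unanchored pattern accepts both values.
print(passes_search(ITEM_RE, good), passes_search(ITEM_RE, bad))
# The anchored variant accepts only the value that is a list from start to end.
print(passes_search(WHOLE_LIST_RE, good), passes_search(WHOLE_LIST_RE, bad))
```

In a Django form, the compiled WHOLE_LIST_RE would be passed to RegexValidator in place of definiton_re from the question.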


                        