如何在 Python 中将长正则表达式规则拆分为多行 [英] How to split long regular expression rules to multiple lines in Python

查看:40
本文介绍了如何在 Python 中将长正则表达式规则拆分为多行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这真的可行吗?我有一些很难理解的很长的正则表达式模式规则,因为它们不能立即适应屏幕.示例:

Is this actually doable? I have some very long regex pattern rules that are hard to understand because they don't fit into the screen at once. Example:

test = re.compile('(?P<full_path>.+):\d+:\s+warning:\s+Member\s+(?P<member_name>.+)\s+\((?P<member_type>%s)\) of (class|group|namespace)\s+(?P<class_name>.+)\s+is not documented' % (self.__MEMBER_TYPES), re.IGNORECASE)

反斜杠或三引号不起作用.

Backslash or triple quotes won't work.

编辑.我结束了使用 VERBOSE 模式.下面是正则表达式模式现在的样子:

EDIT. I ended using the VERBOSE mode. Here's how the regexp pattern looks now:

test = re.compile('''
  (?P<full_path>                                  # Capture a group called full_path
    .+                                            #   It consists of one more characters of any type
  )                                               # Group ends                      
  :                                               # A literal colon
  \d+                                             # One or more numbers (line number)
  :                                               # A literal colon
  \s+warning:\s+parameters\sof\smember\s+         # An almost static string
  (?P<member_name>                                # Capture a group called member_name
    [                                             #   
      ^:                                          #   Match anything but a colon (so finding a colon ends group)
    ]+                                            #   Match one or more characters
   )                                              # Group ends
   (                                              # Start an unnamed group 
     ::                                           #   Two literal colons
     (?P<function_name>                           #   Start another group called function_name
       \w+                                        #     It consists on one or more alphanumeric characters
     )                                            #   End group
   )*                                             # This group is entirely optional and does not apply to C
   \s+are\snot\s\(all\)\sdocumented''',           # And line ends with an almost static string
   re.IGNORECASE|re.VERBOSE)                      # Let's not worry about case, because it seems to differ between Doxygen versions

推荐答案

您可以通过引用每个段来拆分正则表达式模式.不需要反斜杠.

You can split your regex pattern by quoting each segment. No backslashes needed.

test = re.compile(('(?P<full_path>.+):\d+:\s+warning:\s+Member'
                   '\s+(?P<member_name>.+)\s+\((?P<member_type>%s)\) '
                   'of (class|group|namespace)\s+(?P<class_name>.+)'
                   '\s+is not documented') % (self.__MEMBER_TYPES), re.IGNORECASE)

您也可以使用原始字符串标志 'r' 并且您必须将它放在每个段之前.

You can also use the raw string flag 'r' and you'll have to put it before each segment.

请参阅文档.

这篇关于如何在 Python 中将长正则表达式规则拆分为多行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆