Regex for managing escaped characters for items like string literals


Question

I would like to match a string literal that may contain escaped quotation marks. For instance, I'd like to search "this is a 'test with escaped\' values' ok" and have the backslash properly recognized as an escape character. I've tried solutions like the following:

import re
regexc = re.compile(r"\'(.*?)(?<!\\)\'")
match = regexc.search(r""" Example: 'Foo \' Bar'  End. """)
print(match.groups())
# I want ("Foo \' Bar") to be printed above

Looking at this, there is a simple problem: the escape character being used, "\", can't itself be escaped. I can't figure out how to handle that. I wanted a solution like the following, but negative lookbehind assertions need to be fixed length:

# ...
re.compile(r"\'(.*?)(?<!\\(\\\\)*)\'")
# ...
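
To make both problems concrete, here is a minimal sketch, assuming Python 3 and the standard re module. The helper name compiles is hypothetical. It shows that the first attempt handles an escaped quote but finds nothing once the backslash itself is escaped, and that the second pattern is rejected at compile time.

```python
import re

# First attempt from the question: a lazy body plus a one-character lookbehind.
first_attempt = re.compile(r"\'(.*?)(?<!\\)\'")

single = first_attempt.search(r""" Example: 'Foo \' Bar'  End. """)
double = first_attempt.search(r""" Example2: 'Foo \\' End. """)
print(single.group(1))  # Foo \' Bar
print(double)           # None: the closing quote follows a backslash


def compiles(pattern):
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The variable-width lookbehind from the question is refused by re.
print(compiles(r"\'(.*?)(?<!\\(\\\\)*)\'"))  # False
```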

Can any regex gurus tackle this problem? Thanks.

Recommended answer

I think this will work:

import re
regexc = re.compile(r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'")

def check(test, base, target):
    match = regexc.search(base)
    assert match is not None, test+": regex didn't match for "+base
    assert match.group(1) == target, test+": "+target+" not found in "+base
    print("test %s passed" % test)

check("Empty","''","")
check("single escape1", r""" Example: 'Foo \' Bar'  End. """,r"Foo \' Bar")
check("single escape2", r"""'\''""",r"\'")
check("double escape",r""" Example2: 'Foo \\' End. """,r"Foo \\")
check("First quote escaped",r"not matched\''a'","a")
check("First quote escaped beginning",r"\''a'","a")

The regular expression r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'" matches forward, accepting only the things we want inside the string:

  1. Characters that are not a backslash or a quote.
  2. An escaped quote.
  3. An escaped backslash.
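
As an illustration of the three alternatives, the body of the pattern can be written one alternative per line with re.VERBOSE. This is only a sketch, assuming Python 3. The names BODY and is_valid_body are hypothetical and not part of the answer.

```python
import re

# Hypothetical name; the alternatives mirror the three items listed above.
BODY = re.compile(r"""
    (?:
        [^\\']    # 1. any character that is not a backslash or a quote
      | \\'       # 2. an escaped quote
      | \\\\      # 3. an escaped backslash
    )*
""", re.VERBOSE)


def is_valid_body(text):
    """Return True if text consists only of the three allowed alternatives."""
    return BODY.fullmatch(text) is not None


print(is_valid_body(r"Foo \' Bar"))  # True
print(is_valid_body(r"Foo \\"))      # True
print(is_valid_body(r"Foo \ Bar"))   # False: a lone backslash is not allowed
print(is_valid_body("Foo ' Bar"))    # False: an unescaped quote ends the literal
```

Because a backslash is only accepted when followed by a quote or another backslash, other escapes such as \n inside the literal are rejected by this pattern.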

The extra group at the front, (?:^|[^\\]), checks that the opening quote is not itself escaped.
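
One consequence of that leading group is that it consumes the character before the opening quote, so the full match includes it and two adjacent literals cannot both be found. The sketch below, assuming Python 3, compares the answer's pattern with a variation that uses a fixed-width negative lookbehind instead. The variation and the names consuming, lookbehind, and literals are assumptions for illustration, not part of the answer.

```python
import re

# Pattern from the answer: the leading group consumes one character.
consuming = re.compile(r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'")

# Variation (an assumption, not from the answer): a fixed-width negative
# lookbehind checks the preceding character without consuming it.
lookbehind = re.compile(r"(?<!\\)'((?:[^\\']|\\'|\\\\)*)'")


def literals(pattern, text):
    """Return the body (group 1) of every string literal found in text."""
    return [m.group(1) for m in pattern.finditer(text)]


text = r"x 'a' \'skipped 'b\'c'"
print(literals(consuming, text))    # ['a', "b\\'c"]
print(literals(lookbehind, text))   # ['a', "b\\'c"]

# Adjacent literals: the consuming prefix needs a character the previous
# match has already used, so the second literal is missed.
print(literals(consuming, "'a''b'"))   # ['a']
print(literals(lookbehind, "'a''b'"))  # ['a', 'b']
```

With either pattern, group(1) holds the literal's body. Only the lookbehind variation leaves the preceding character out of group(0).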
