Regex to exclude string and attribute within expression

Problem description

I have a regex that converts {{ expression }} into {% print expression %} when the expression is a function call such as {{ function() }} or {{ object.function() }}, or an arithmetic operation such as {{ a+b }}. It leaves {{ var }} and {{ object.attribute }} unchanged.

The problem is that the regex also converts string expressions: {{ "string" }}, {{ "function()" }} and {{ "{{ var }}" }} become {% print "string" %}, {% print "function()" %} and {% print "{% print var %}" %}.

import re

def replacement(val):
    content = val.group(1)
    if re.match(r'^\s*[\w\.]+\s*$', content):
        return "{{%s}}" % content
    else:
        return "{%% print %s %%}" % content

string_obj = """{{ var }} {{ object.var }} {{ func()}} {{ object.function() }} {{ a+b }} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}"""

print(re.sub(r"{{(\s*.*?\s*)}}", replacement, string_obj))

Output:

'{{ var }} {{ object.var }} {%print func()%} {% print object.function() %} {% print a+b %} {% print "string" %} {% print "{{ var }}" %} {% print "function()" %} {% print "a+b" %}'

The output I want is:

'{{ var }} {{ object.var }} {%print func()%} {% print object.function() %} {% print a+b %} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}'

Note: an expression between {{ }} can be a string literal, written either with double quotes as in {{ "string" }} or with single quotes as in {{ 'string' }}.
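To see where the string case goes wrong, it can help to list what the pattern actually captures. The sketch below uses the same pattern as the question (braces escaped for clarity); the names `pattern`, `sample` and `captured` are only for illustration. The non-greedy match stops at the first `}}`, even when that `}}` sits inside a quoted string.

```python
import re

# Same pattern as in the question, with the braces escaped for clarity.
pattern = re.compile(r"\{\{(\s*.*?\s*)\}\}")

sample = '{{ "string" }} {{ "{{ var }}" }}'

# findall returns the contents of group 1 for every match.
captured = pattern.findall(sample)
print(captured)  # [' "string" ', ' "{{ var ']
```

The second capture ends in the middle of the string literal, so the regex alone cannot tell a delimiter from text inside quotes.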

Recommended answer

Code

For prettier output I strip the whitespace at the beginning and end of each expression, which also simplifies the regex.

import re

def replacement(val):
    content = val.group(1).strip()
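    # Keep plain names and single attribute access (var, object.var) as-is;
    # the optional group also lets one-character names such as "a" match.
    if re.match(r'^\w[^.(+*/\-|]*(?:\.\w[^.(+*/\-|]*)?$', content):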
        return "{{ %s }}" % content
    else:
        return "{%% print %s %%}" % content

def maskString(templateString):
    # Replace every quoted string with underscores of the same length so the
    # "{{ ... }}" regex never sees braces or operators inside string literals.
    stringChars = ['"', "'"]
    a = 0
    start = None
    maskedList = []
    while a < len(templateString):
        l = templateString[a]
        # A quote only counts when it is not escaped by a preceding backslash.
        escaped = a > 0 and templateString[a-1] == '\\'
        if l in stringChars and start is None and not escaped:
            start = {'l' : l, 's' : a}
        elif start is not None and l == start['l'] and not escaped:
            start['e'] = a + 1
            stringToMask = templateString[start['s']:start['e']]
            templateString = templateString[:start['s']] + ("_" * len(stringToMask)) + templateString[start['e']:]
            maskedList.append(stringToMask)
            start = None
        a += 1
    return (templateString, maskedList)

def unmaskString(templateString, maskedList):
    for string in maskedList:
        templateString = templateString.replace("_" * len(string), string,1)
    return templateString

def templateMatcher(templateString):
    # Mask string literals, convert the expressions, then restore the strings.
    templateString, maskedList = maskString(templateString)
    templateString = re.sub(r"{{(\s*.*?\s*)}}", replacement, templateString)
    return unmaskString(templateString, maskedList)

string_obj = """{{ var }} {{ object.var }} {{ func()}} {{ object.function() }} {{ a+b }} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}"""
string_obj_2 = """{{ a+b*c-d/100}} {{ 1 * 2 }} {{ 20/10 }} {{ 5-4 }}"""
string_obj_3 = """{{ "another {{ mask" }} {{ func() }}, {{ a+b }} , {{ "string with \\""|filter }}"""

print(templateMatcher(string_obj))
print(templateMatcher(string_obj_2))
print(templateMatcher(string_obj_3))

I added masking for strings so that literals such as "\"" and '"' are recognized as strings. This assumes that a variable name never consists only of underscores. The characters that start and end a string are listed in the variable stringChars, so if you don't want ' treated as a string delimiter, just remove it from there.
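The effect of the masking step can be seen on a small input. The following is a compact sketch of the same idea, not the answer's maskString: it uses a regex instead of the character loop, and the names `STRING`, `mask` and `unmask` are hypothetical.

```python
import re

# A quoted string with backslash escapes, in either quote style.
DOUBLE = r'"(?:\\.|[^"\\])*"'
SINGLE = r"'(?:\\.|[^'\\])*'"
STRING = re.compile(DOUBLE + "|" + SINGLE)

def mask(template):
    """Replace each string literal with underscores of the same length."""
    hidden = []
    def hide(match):
        hidden.append(match.group(0))
        return "_" * len(match.group(0))
    return STRING.sub(hide, template), hidden

def unmask(template, hidden):
    """Put the literals back, in order, one placeholder at a time."""
    for literal in hidden:
        template = template.replace("_" * len(literal), literal, 1)
    return template

original = '{{ "{{ var }}" }} {{ a+b }}'
masked, hidden = mask(original)
print(masked)  # {{ ___________ }} {{ a+b }}
print(hidden)  # ['"{{ var }}"']
```

Because the placeholder is just a run of underscores, a template that already contains a name made only of underscores could be mistaken for a placeholder during unmasking, which is why the assumption above matters.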

Output of the three templateMatcher calls:

{{ var }} {{ object.var }} {% print func() %} {% print object.function() %} {% print a+b %} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}
{% print a+b*c-d/100 %} {% print 1 * 2 %} {% print 20/10 %} {% print 5-4 %}
{{ "another {{ mask" }} {% print func() %}, {% print a+b %} , {% print "string with \""|filter %}
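As an alternative to masking, the expression pattern itself can be taught to consume string literals as whole tokens. This is only a sketch under a few assumptions, not the method used above: the names `STRING`, `EXPR`, `KEEP` and `convert` are hypothetical, strings use backslash escapes, and an expression is left unchanged when it is a dotted name or a single string literal.

```python
import re

# A quoted string with backslash escapes, in either quote style.
STRING = r'"(?:\\.|[^"\\])*"' + "|" + r"'(?:\\.|[^'\\])*'"

# {{ ... }} where string literals are consumed as whole tokens, so a
# "{{" or "}}" inside quotes cannot start or end an expression.
EXPR = re.compile(r'\{\{((?:' + STRING + r'|(?!\}\}).)*)\}\}')

# Content left untouched: a dotted name or a lone string literal.
KEEP = re.compile(r'(?:[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*|' + STRING + r')\Z')

def convert(template):
    def repl(match):
        content = match.group(1).strip()
        if KEEP.match(content):
            return "{{ %s }}" % content
        return "{%% print %s %%}" % content
    return EXPR.sub(repl, template)

string_obj = """{{ var }} {{ object.var }} {{ func()}} {{ object.function() }} {{ a+b }} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}"""
print(convert(string_obj))
```

For this input the sketch prints the same line as the first templateMatcher call. It differs from the masking approach in edge cases, for example in which names count as plain variables, so it should be checked against real templates before use.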
