Regex to exclude string and attribute within expression
Question
I have a regex that converts {{ expression }} into {% print expression %} when the expression is a function call such as {{ function() }} or {{ object.function() }}, or an arithmetic operation such as {{ a+b }}, but leaves it unchanged when it is {{ var }} or {{ object.attribute }}.
The problem with my regex is that it also converts string expressions: {{ "string" }}, {{ "function()" }}, or {{ "{{ var }}" }} become {% print "string" %}, {% print "function()" %}, or {% print "{% print var %}" %}.
import re

def replacement(val):
    content = val.group(1)
    if re.match(r'^\s*[\w\.]+\s*$', content):
        return "{{%s}}" % content
    else:
        return "{%% print %s %%}" % content

string_obj = """{{ var }} {{ object.var }} {{ func()}} {{ object.function() }} {{ a+b }} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}"""
print(re.sub(r"{{(\s*.*?\s*)}}", replacement, string_obj))
Output:
'{{ var }} {{ object.var }} {%print func()%} {% print object.function() %} {% print a+b %} {% print "string" %} {% print "{{ var }}" %} {% print "function()" %} {% print "a+b" %}'
The output I want is:
'{{ var }} {{ object.var }} {%print func()%} {% print object.function() %} {% print a+b %} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}'
Note: The expression between {{ }} can be a string expression with double quotes, like {{ "string" }}, or with single quotes, like {{ 'string' }}.
Recommended answer
Code
For prettier printing I just strip the whitespace at the beginning and end. That simplifies the regex, too.
import re

def replacement(val):
    # Strip surrounding whitespace so the output spacing is normalized.
    content = val.group(1).strip()
    # A plain name or a single attribute access stays as {{ ... }}.
    # Note: the pattern needs at least two word characters, so a
    # one-letter name is not matched and gets converted to a print.
    if re.match(r'^\w[^\.\(\+\*\/\-\|]*\.?\w[^\.\(\+\*\/\-\|]*$', content):
        return "{{ %s }}" % content
    else:
        return "{%% print %s %%}" % content

def maskString(templateString):
    # Replace every quoted string with underscores of the same length
    # and collect the original strings in order of appearance.
    stringChars = ['"', "'"]
    a = 0
    start = None
    maskedList = []
    while a < len(templateString):
        l = templateString[a]
        # A quote only counts if it is not preceded by a backslash.
        unescaped = a == 0 or templateString[a-1] != '\\'
        if l in stringChars and start is None and unescaped:
            start = {'l': l, 's': a}
        elif start is not None and l == start['l'] and unescaped:
            start['e'] = a + 1
            stringToMask = templateString[start['s']:start['e']]
            templateString = templateString[:start['s']] + ("_" * len(stringToMask)) + templateString[start['e']:]
            maskedList.append(stringToMask)
            start = None
        a += 1
    return (templateString, maskedList)

def unmaskString(templateString, maskedList):
    # Put the original strings back, one underscore run at a time.
    for string in maskedList:
        templateString = templateString.replace("_" * len(string), string, 1)
    return templateString

def templateMatcher(templateString):
    # Mask the strings, convert the expressions, then restore the strings.
    templateString, maskedList = maskString(templateString)
    templateString = re.sub(r"{{(\s*.*?\s*)}}", replacement, templateString)
    return unmaskString(templateString, maskedList)

string_obj = """{{ var }} {{ object.var }} {{ func()}} {{ object.function() }} {{ a+b }} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}"""
string_obj_2 = """{{ a+b*c-d/100}} {{ 1 * 2 }} {{ 20/10 }} {{ 5-4 }}"""
string_obj_3 = """{{ "another {{ mask" }} {{ func() }}, {{ a+b }} , {{ "string with \\""|filter }}"""

print(templateMatcher(string_obj))
print(templateMatcher(string_obj_2))
print(templateMatcher(string_obj_3))
I added more advanced masking for the strings, so "\"" and '"' will be recognized as strings, assuming that a variable can never consist only of _ characters. The characters that start and end a string are in the variable stringChars, so if you don't want ' to count as a string delimiter, just remove it from there.
{{ var }} {{ object.var }} {% print func() %} {% print object.function() %} {% print a+b %} {{ "string" }} {{ "{{ var }}" }} {{ "function()" }} {{ "a+b" }}
{% print a+b*c-d/100 %} {% print 1 * 2 %} {% print 20/10 %} {% print 5-4 %}
{{ "another {{ mask" }} {% print func() %}, {% print a+b %} , {% print "string with \""|filter %}
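As a variation on the masking idea above, the string literals can also be consumed by the regex itself, which avoids relying on underscore placeholders. This is only a minimal sketch under assumptions that are not part of the original answer: the names STRING, BLOCK, KEEP and convert are made up for illustration, and "variable or attribute" is taken to mean a dotted chain of Python-style identifiers.

```python
import re

# One quoted string literal (single or double quotes) with backslash escapes.
STRING = r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\''

# A {{ ... }} block whose body is made of whole string literals or single
# non-quote characters. Because a literal is consumed in one piece, a "}}"
# inside a string cannot end the block early.
BLOCK = re.compile(r'\{\{((?:' + STRING + r'|[^"\'])*?)\}\}')

# Bodies that stay as {{ ... }}: a lone string literal or a dotted name
# (assumption: names look like Python identifiers).
KEEP = re.compile(r'(?:' + STRING + r'|[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)')

def convert(template):
    def replacement(match):
        content = match.group(1).strip()
        if KEEP.fullmatch(content):
            return "{{ %s }}" % content
        return "{%% print %s %%}" % content
    return BLOCK.sub(replacement, template)

print(convert("""{{ var }} {{ func()}} {{ a+b }} {{ "{{ var }}" }}"""))
```

Unlike the masking version, this sketch also keeps one-letter names such as {{ a }} unchanged, and a string followed by a filter (for example "text"|filter) is still converted to a print statement.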