Handling different escaping sequences?
Question
I'm using ANTLR with the Presto grammar to parse SQL queries. This is the original string definition I used:
STRING
    : '\'' ( '\\' .
           | ~[\\']   // match anything other than \ and '
           | '\'\''   // match ''
           )*
      '\''
    ;
This worked fine for most queries until I ran into queries that use different escaping rules. For example:
select
table1(replace(replace(some_col,'\\'',''),'\"' ,'')) as features
from table1
So I modified my STRING definition, and it now looks like this:
STRING
    : '\'' ( '\\' .
           | '\\\\' . {HelperUtils.isNeedSpecialEscaping(this)}?   // match \\ followed by any char, only when the predicate holds
           | ~[\\']   // match anything other than \ and '
           | '\'\''   // match ''
           )*
      '\''
    ;
However, this doesn't work for the query above, because I get
'\\'',''),'
as a single string, even though the predicate returns true for this query. Any idea how I can handle this query as well?
Thanks, Nir.
Recommended answer
In the end I was able to solve it. This is the expression I used:
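STRING
    : '\'' ( '\\\\' . {HelperUtils.isNeedSpecialEscaping(this)}?          // special escaping on: \\ followed by any char
           | '\\' (~[\\] | . {!HelperUtils.isNeedSpecialEscaping(this)}?)  // \ followed by a non-\ char, or by any char when special escaping is off
           | ~[\\']   // match anything other than \ and '
           | '\'\''   // match ''
           )*
      '\''
    ;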
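To see why the guarded alternatives split the example differently, here is a minimal hand-written sketch in Java that mimics the alternatives of the STRING rule above. It is not ANTLR-generated code and does not reproduce ANTLR's full longest-match behaviour. The class name `StringRuleSketch`, its methods, and the boolean `special` flag (standing in for `HelperUtils.isNeedSpecialEscaping(this)`, whose implementation is not shown in the post) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a hand-rolled scanner that follows the alternatives of
// the STRING rule. It is an illustration, not what ANTLR generates.
final class StringRuleSketch {

    // Returns the index just past the closing quote of the literal starting at
    // `start`, or -1 if no literal can be matched there.
    // `special` stands in for HelperUtils.isNeedSpecialEscaping(this).
    static int matchString(String s, int start, boolean special) {
        if (start >= s.length() || s.charAt(start) != '\'') return -1;
        int i = start + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') {
                if (i + 1 >= s.length()) return -1;      // dangling backslash
                if (special && s.charAt(i + 1) == '\\') {
                    // '\\\\' . {special}?  : two backslashes plus any one char
                    if (i + 2 >= s.length()) return -1;
                    i += 3;
                } else {
                    // '\\' (~[\\] | . {!special}?)  : backslash plus one char
                    i += 2;
                }
            } else if (c == '\'') {
                if (i + 1 < s.length() && s.charAt(i + 1) == '\'') {
                    i += 2;                               // '' inside the literal
                } else {
                    return i + 1;                         // closing quote
                }
            } else {
                i++;                                      // ~[\\']
            }
        }
        return -1;                                        // unterminated literal
    }

    // Collects every string literal found while scanning `sql` left to right.
    static List<String> stringLiterals(String sql, boolean special) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < sql.length()) {
            int end = sql.charAt(i) == '\'' ? matchString(sql, i, special) : -1;
            if (end < 0) { i++; continue; }
            out.add(sql.substring(i, end));
            i = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // The expression from the question, written as a Java string literal.
        String sql = "replace(replace(some_col,'\\\\'',''),'\\\"' ,'')";
        System.out.println("special on : " + stringLiterals(sql, true));
        System.out.println("special off: " + stringLiterals(sql, false));
    }
}
```

With the flag off, the first literal this sketch finds is `'\\'',''),'`, the same over-long string reported in the question. With the flag on, `\\'` is consumed as one unit, so the next quote closes the literal and the remaining arguments are split as intended.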