ParseKit - SQLite parser going into infinite recursion

Question

For my application, I am trying to build an SQLite parser. As my application is using Objective-C, ParseKit seems like a good option. I read SQLite's syntax diagrams and built a grammar based on them. However, when I try to parse something using this grammar, the parser goes into infinite recursion.

The only statements I need are SELECT, INSERT, UPDATE, and DELETE (I need SELECT mostly because the others refer to it). My @start is designed to handle multiple statements separated by semicolons:

@start = statement (';' statement)*;

statement = select_stmt | insert_stmt | update_stmt | delete_stmt ;

The statements are as follows:

select_stmt = select_core ( compound_operator select_core )* order_expr? limit_expr? ;
select_core = 'select' ( 'distinct' | 'all' )? result_column ( ',' result_column )* from_expr where_expr group_expr ;
result_column = '*' | table_name '.' '*' | expr ( 'as'? column_alias )? ;

insert_stmt = 'insert' or_on_failure? 'into' table_name insert_expr ;
insert_expr = insert_expr_cols? (insert_expr_values | select_stmt) ;
insert_expr_cols = '(' column_name ( ',' column_name )* ')' ;
insert_expr_values = 'values' '(' expr ( ',' expr )* ')' ;
insert_expr_defaults = 'default' 'values' ;

update_stmt = 'update' or_on_failure? qualified_table_name update_expr where_expr? limited_expr? ;
update_expr = 'set' update_expr_col ( ',' update_expr_col )* ;
update_expr_col = column_name '=' expr ;

delete_stmt = 'delete' 'from' qualified_table_name where_expr? limited_expr? ;

And their supporting expressions:

order_expr = 'order' 'by' ordering_term (',' ordering_term)* ;
limit_expr = 'limit' expr ( ( 'offset' | ',' ) expr )? ;
from_expr = 'from' join_source ;
where_expr = 'where' expr ;
group_expr = 'group' 'by' expr ( ',' expr )? ( 'having' expr )? ;

join_source = single_source ( join_operator single_source join_constraint? )* ;
single_source = table_name 'as' table_alias indexed_by? | '(' select_stmt ')' ( 'as'? table_alias )? | '(' join_source ')' ;

or_on_failure = 'or' on_failure ;
on_failure = 'rollback' | 'abort' | 'replace' | 'fail' | 'ignore' ;
limited_expr = order_expr limit_expr ;

Names and aliases:

database_name = name ;
table_name = (database_name '.')? name ;
column_name = (table_name '.')? name ;

table_alias = name ;
column_alias = name ;

index_name = name ;

type_name = name+ ( '(' number (',' number)? ')' )? ;
function_name = name ;
collation_name = name ;

qualified_table_name = table_name indexed_by? ;

Miscellaneous operators, etc.:

indexed_by = 'indexed' 'by' index_name | 'not' 'indexed' ;

unary_operator = symbol ;
binary_operator = symbol ;
compound_operator = 'union' 'all'? | 'intersect' | 'except' ;
join_operator = ',' | 'natural'? ( 'left' 'outer'? | 'inner' | 'cross' ) 'join' ;

join_constraint = 'on' expr | 'using' '(' column_name ( ',' column_name )* ')' ;

Basic types:

literal = number
    | string
    | 'null'
    | 'current_time'
    | 'current_date'
    | 'current_timestamp' ;
number = Number ;
string = Word
    | QuotedString ;
name = Word
    | QuotedString ;
symbol = Symbol;

And expr:

expr = literal
    | column_name
    | unary_operator expr
    | expr binary_operator expr
    | function_name '(' ( '*' | 'distinct'? expr ( ',' expr )* )? ')'
    | '(' expr ')'
    | 'cast' '(' expr 'as' type_name ')'
    | expr 'collate' collation_name
    | expr 'not'? ( 'like' | 'glob' | 'regexp' | 'match' ) expr ( 'escape' expr )?
    | expr ( 'isnull' | 'notnull' | 'not' 'null' )
    | expr 'is' 'not'? expr
    | expr 'not'? 'between' expr 'and' expr
    | expr 'not'? 'in' ( table_name | '(' ( select_stmt | expr ( ',' expr )* )? ')' )
    | ( 'not'? 'exists' )? '(' select_stmt ')'
    | 'case' expr? ( 'when' expr 'then' expr )+ ( 'else' expr )? 'end' ;

When I stepped through the code, the basic path was @start -> statement -> select_stmt -> select_core -> result_column -> expr -> expr -> expr...

After about 8-9k calls between PKParser's matchAndAssemble: and PKParser/Subclass's allMatchesFor:, something dies, usually from an EXC_BAD_ACCESS error (and then LLDB can't do anything either).

P.S.: If you're going to post an answer saying, 'Oh, you should really do/use this,' A) I like Objective-C. Don't tell me not to use it. It's my choice. My response will likely be a rant. And B) I tried digging through SQLite's source to use their parser. I never got anywhere. If you think I should use that, kindly post the source for their parser as a single file with no other dependencies.

Recommended answer

Developer of ParseKit here.

First, see my previous answers on debugging ParseKit grammars and battling infinite recursion in ParseKit grammars.

I think there might be an issue in the very first line (but I'm not a SQL expert, so I'm not sure). Shouldn't that be:

@start = (statement ';')+;
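
If the input may or may not end with a semicolon, one possible compromise between the two start rules is to keep the separator form and allow an optional trailing semicolon. This is only a sketch and has not been run through ParseKit; the `//` comment is an annotation for the reader and can be dropped if your ParseKit version rejects comments in grammars.

```
// Intended to accept "stmt", "stmt;", "stmt; stmt" and "stmt; stmt;"
@start = statement ( ';' statement )* ';'? ;
```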

I would strongly recommend using camel case instead of underscores, as underscores will make your Objective-C callbacks very awkward and ugly. That is why camel case is the convention in ParseKit grammars.
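
To make the naming point concrete, here is a sketch of two productions from the question renamed to camel case. The comments show the assembler callback each production would presumably map to, assuming ParseKit's `parser:didMatch<ProductionName>:` convention; treat the exact selectors as an assumption and check them against your ParseKit version.

```
// Assumed callback: - (void)parser:(PKParser *)p didMatchSelectStmt:(PKAssembly *)a;
// (the underscore form would presumably become didMatchSelect_stmt:)
selectStmt = selectCore ( compoundOperator selectCore )* orderExpr? limitExpr? ;

// Assumed callback: - (void)parser:(PKParser *)p didMatchResultColumn:(PKAssembly *)a;
resultColumn = '*' | tableName '.' '*' | expr ( 'as'? columnAlias )? ;
```

Every reference to a renamed production (for example `select_stmt` inside `insert_expr`) would need the same rename.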

However, I see the main issue. Your grammar includes left recursion, which is not allowed in ParseKit grammars, particularly in your expr production (I haven't looked closely to see if it is elsewhere too).

ParseKit is fine with recursion, but not left recursion. The best explanation of this is in Steven Metsker's "Building Parsers with Java". Or search the web.

But basically left recursion is when a production immediately references itself (on the left side of an expression):

e = e '-' Number;

e = Number | e '-' Number;

Instead, you must design your grammars to remove left recursion like:

e = Number ('-' Number)*;
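
Applied to the expr production from the question, the same transformation might look like the sketch below. Every alternative that began with `expr` is moved into a suffix production, so each branch consumes at least one token before it can recurse. The names `primary_expr` and `expr_suffix` are hypothetical and not part of the original grammar. The sketch has not been run through ParseKit and makes no attempt to encode operator precedence; the `//` comments are annotations and can be removed if your ParseKit version rejects comments in grammars.

```
// expr no longer starts with itself: one operand, then zero or more suffixes.
expr = primary_expr ( expr_suffix )* ;

// Alternatives that begin with a token or a non-recursive production.
primary_expr = literal
    | column_name
    | unary_operator primary_expr
    | function_name '(' ( '*' | 'distinct'? expr ( ',' expr )* )? ')'
    | '(' expr ')'
    | 'cast' '(' expr 'as' type_name ')'
    | ( 'not'? 'exists' )? '(' select_stmt ')'
    | 'case' expr? ( 'when' expr 'then' expr )+ ( 'else' expr )? 'end' ;

// Former left-recursive alternatives, with the leading expr stripped off.
expr_suffix = binary_operator primary_expr
    | 'collate' collation_name
    | 'not'? ( 'like' | 'glob' | 'regexp' | 'match' ) primary_expr ( 'escape' primary_expr )?
    | 'isnull'
    | 'notnull'
    | 'not' 'null'
    | 'is' 'not'? primary_expr
    | 'not'? 'between' primary_expr 'and' primary_expr
    | 'not'? 'in' ( table_name | '(' ( select_stmt | expr ( ',' expr )* )? ')' ) ;
```

With this shape, an input such as `a + b * c` is matched as a flat chain of one operand followed by two suffixes, so any precedence handling would have to happen in the assembler callbacks or through further layering of productions.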
