ANTLRWorks 1.4.3 can't properly read extended-ASCII characters


Problem description

I'm working on a fairly standard compiler project for which I picked ANTLR as the parser generator. While updating an existing grammar from v2 to v3, I noticed that ANTLRWorks, the official IDE for ANTLR, wasn't displaying any of the extended-ASCII characters in the file properly. Even after I used Notepad++ to convert the file from ASCII to UTF-8, ANTLRWorks still displayed those characters as squares. In Notepad++ they display fine.

Since this glitch means that ANTLRWorks mauls the file when I save it, I cannot use it as an editor any more, which is rather annoying. Has anyone else here encountered this issue and maybe solved it? Much obliged.

[edit]: the specific issue occurs with the latest version of ANTLRWorks (downloaded yesterday) and with the vams.g grammar file I got from http://www.antlr.org/grammar/1086696923011/vhdlams/index.html

Recommended answer

I cannot reproduce this with ANTLRWorks 1.4.3.

If I create a dummy grammar:

grammar T;
parse : . ;
Any   : . ;

and paste the complete extended ASCII set in a multi-line comment:

grammar T;

/*
€

‚
ƒ

...

ÿ
*/

parse : . ;
Any   : . ;

there's no problem. It doesn't matter whether I paste the characters in ANTLRWorks itself, or paste them in a normal editor and then edit the existing grammar with ANTLRWorks: the characters all stay the same after saving inside ANTLRWorks.

On a related note: ANTLR versions 3.0 to 3.3 still depend on some ANTLR 2.7 classes, which might cause org.antlr.Tool to trip over certain characters outside the ASCII set. Use ANTLR 3.4 in that case, which no longer has these old dependencies.

I suspect there's some odd byte somewhere in the original grammar that is causing all the mayhem. I quickly copied only the rules from the original grammar, changed all v2.7 syntax to v3 syntax (double-quoted literals became single-quoted ones, protected became fragment, and I commented out some custom code) and saved the result in a new file. This file could be opened (and saved) by ANTLRWorks or a plain text editor without the extended ASCII chars being mangled.
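One way to look for such an odd byte is to scan the grammar file directly. The following is only a minimal sketch, not part of the original answer: the class name NonAsciiScanner and its helper methods are hypothetical, and "vams.g" is just a default path taken from the question. It lists every byte above 0x7F and reports whether the file as a whole decodes as strict UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a grammar file; not part of ANTLR.
public class NonAsciiScanner {

    /** Returns the offsets of all bytes outside the 7-bit ASCII range. */
    public static List<Integer> nonAsciiOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            if ((data[i] & 0xFF) > 0x7F) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    /** Returns true if the bytes decode as strict UTF-8 without errors. */
    public static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // "vams.g" is only a default; pass the real path as the first argument.
        String path = args.length > 0 ? args[0] : "vams.g";
        byte[] data = Files.readAllBytes(Paths.get(path));
        for (int offset : nonAsciiOffsets(data)) {
            System.out.printf("offset %d: 0x%02X%n", offset, data[offset] & 0xFF);
        }
        System.out.println("valid UTF-8: " + isValidUtf8(data));
    }
}
```

If the file contains bytes above 0x7F but is not valid UTF-8, it was probably saved in a single-byte encoding, and an editor that assumes a different encoding could plausibly show those bytes as squares.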

Here is the ANTLR v3 version of said grammar: http://pastebin.com/zU4xcvXt (the grammar is too big to post on SO...)

Is the grammar name useful for anything beyond just giving it a label?

No, it's not. As you mentioned, it's only used to give a parser or lexer a name.

There are 4 types of grammars in ANTLR:

  • combined grammar, which looks like grammar T;, generating TLexer.java and TParser.java source files;
  • parser grammar, which looks like parser grammar TP;, generating a TP.java source file;
  • lexer grammar, which looks like lexer grammar TL;, generating a TL.java source file;
  • tree grammar, which looks like tree grammar TWalker;, generating a TWalker.java source file.
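
The naming scheme in the list above can be sketched in a few lines of Java. This is only an illustration of the mapping as described here, not ANTLR code: the class GrammarOutputs and its method generatedFiles are hypothetical, and the header parsing is deliberately naive.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the naming scheme; not part of ANTLR.
public class GrammarOutputs {

    /**
     * Maps a grammar header such as "parser grammar TP;" to the Java
     * source files named in the list above.
     */
    public static List<String> generatedFiles(String header) {
        String[] words = header.trim().replace(";", "").split("\\s+");
        // Combined grammar: "grammar T;" yields a lexer and a parser.
        if (words.length == 2 && words[0].equals("grammar")) {
            String name = words[1];
            return Arrays.asList(name + "Lexer.java", name + "Parser.java");
        }
        // Parser, lexer and tree grammars yield one file named after the grammar.
        if (words.length == 3 && words[1].equals("grammar")
                && Arrays.asList("parser", "lexer", "tree").contains(words[0])) {
            return Arrays.asList(words[2] + ".java");
        }
        throw new IllegalArgumentException("not a grammar header: " + header);
    }

    public static void main(String[] args) {
        String[] headers = {
            "grammar T;", "parser grammar TP;", "lexer grammar TL;", "tree grammar TWalker;"
        };
        for (String header : headers) {
            System.out.println(header + " -> " + generatedFiles(header));
        }
    }
}
```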
