What makes Java easier to parse than C?


Question

I'm acquainted with the fact that the grammars of C and C++ are context-sensitive, and in particular that you need a "lexer hack" in C. On the other hand, I'm under the impression that you can parse Java with only 2 tokens of look-ahead, despite considerable similarity between the two languages.

What would you have to change about C to make it more tractable to parse?

I ask because all of the examples I've seen of C's context-sensitivity are technically allowable but awfully weird. For example,

foo (a);

could be calling the void function foo with argument a. Or, it could be declaring a to be an object of type foo, but you could just as easily get rid of the parentheses. In part, this weirdness occurs because the "direct declarator" production rule for the C grammar fulfills the dual purpose of declaring both functions and variables.
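
To make the "lexer hack" concrete, here is a minimal sketch in Python, not how any real C front end is written. The names tokenize, classify and typedef_names are made up for illustration, and the toy only understands statements shaped like the example above. The point is that the lexer consults a table of typedef names, so the same spelling foo becomes a different token kind depending on earlier declarations.

```python
import re

# One identifier or one of the punctuation characters ( ) ;
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[();])")


def tokenize(src, typedef_names):
    """Split src into (kind, text) pairs, consulting the typedef table.

    Feeding symbol-table knowledge back into the lexer is the "lexer hack":
    an identifier is reported as TYPE_NAME or IDENT depending on context.
    """
    tokens = []
    pos = 0
    while pos < len(src):
        match = TOKEN_RE.match(src, pos)
        if match is None:
            if src[pos:].strip() == "":
                break  # only trailing whitespace is left
            raise SyntaxError(f"unexpected input at offset {pos}")
        text = match.group(1)
        if text in {"(", ")", ";"}:
            kind = text
        elif text in typedef_names:
            kind = "TYPE_NAME"
        else:
            kind = "IDENT"
        tokens.append((kind, text))
        pos = match.end()
    return tokens


def classify(src, typedef_names):
    """Decide how a statement shaped like `x (y);` reads (toy grammar only)."""
    kinds = [kind for kind, _ in tokenize(src, typedef_names)]
    if kinds == ["TYPE_NAME", "(", "IDENT", ")", ";"]:
        return "declaration"
    if kinds == ["IDENT", "(", "IDENT", ")", ";"]:
        return "call"
    return "unknown"
```

With this sketch, classify("foo (a);", {"foo"}) reads the statement as a declaration, while classify("foo (a);", set()) reads it as a call. The source text is identical; only the typedef table differs.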

On the other hand, the Java grammar has separate production rules for variable declaration and function declaration. If you write

foo a;

then you know it's a variable declaration and foo can unambiguously be parsed as a typename. This might not be valid code if the class foo hasn't been defined somewhere in the current scope, but that's a job for semantic analysis that can be performed in a later compiler pass.

I've seen it said that C is hard to parse because of typedef, but you can declare your own types in Java too. Which C grammar rules, besides direct_declarator, are at fault?

Recommended answer

Parsing C++ is getting harder. Parsing Java is getting to be just as hard.

See this SO answer (http://stackoverflow.com/questions/243383/why-cant-c-be-parsed-with-a-lr1-parser/1004737#1004737) discussing why C (and C++) is "hard" to parse. The short summary is that C and C++ grammars are inherently ambiguous; they will give you multiple parses and you must use context to resolve the ambiguities. People then make the mistake of assuming you have to resolve ambiguities as you parse; not so, see below. If you insist on resolving ambiguities as you parse, your parser gets more complicated and that much harder to build; but that complexity is a self-inflicted wound.

IIRC, Java 1.4's "obvious" LALR(1) grammar was not ambiguous, so it was "easy" to parse. I'm not so sure that modern Java hasn't got at least long distance local ambiguities; there's always the problem of deciding whether "...>>" closes off two templates or is a "right shift operator". I suspect modern Java does not parse with LALR(1) anymore.
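
The ">>" problem can be illustrated with a small sketch. This is one common workaround, not a claim about how javac or any particular compiler handles it: the lexer greedily emits ">>" as a single token, and the parser splits it when it is expecting a closing ">" of a type-argument list. The functions tokenize, parse_type and expect_close are hypothetical, and the toy grammar only covers type names of the form Name or Name<Type, ...>.

```python
import re

# ">>" is listed first so the lexer prefers the longest match (maximal munch).
TOKEN_RE = re.compile(r"\s*(>>|[A-Za-z_]\w*|[<>,])")


def tokenize(src):
    """Greedy lexer: '>>' always comes out as one token."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN_RE.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect_close(tokens):
    """Consume one '>' that closes a type-argument list."""
    if tokens and tokens[0] == ">>":
        tokens[0] = ">"  # use one half, leave the other for the enclosing list
    elif tokens and tokens[0] == ">":
        tokens.pop(0)
    else:
        raise SyntaxError("expected '>'")


def parse_type(tokens):
    """Parse Name or Name<Type, ...> and return (name, [argument types])."""
    if not tokens or not (tokens[0][0].isalpha() or tokens[0][0] == "_"):
        raise SyntaxError("expected a type name")
    name = tokens.pop(0)
    args = []
    if tokens and tokens[0] == "<":
        tokens.pop(0)
        args.append(parse_type(tokens))
        while tokens and tokens[0] == ",":
            tokens.pop(0)
            args.append(parse_type(tokens))
        expect_close(tokens)
    return (name, args)


def parse(src):
    """Parse a whole string as a single type."""
    tokens = tokenize(src)
    tree = parse_type(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree
```

The lexer alone cannot tell the two uses apart: tokenize("a >> b") and tokenize("List<List<String>>") both contain a ">>" token. Only the parser, which knows it is inside a type-argument list, can decide to split it.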

But one can get past the parsing problem by using strong parsers (or weak parsers and context collection hacks as C and C++ front ends mostly do now), for both languages. C and C++ have the additional complication of having a preprocessor; this is more complicated in practice than it looks. One claim is that the C and C++ parsers are so hard they have to be written by hand. It isn't true; you can build Java and C++ parsers just fine with GLR parser generators.
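
The idea of parsing first and resolving ambiguities afterwards can be sketched as follows. This is not a GLR parser; it only mimics what a GLR parser hands back for an ambiguous phrase, namely every syntactically valid reading, and then filters them in a later pass. The classes Decl and Call and the functions parse_ambiguous and resolve are assumptions made up for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Decl:
    type_name: str
    var_name: str


@dataclass(frozen=True)
class Call:
    func_name: str
    arg_name: str


# Matches statements shaped like `x (y);` and nothing else (toy grammar).
STMT_RE = re.compile(r"\s*([A-Za-z_]\w*)\s*\(\s*([A-Za-z_]\w*)\s*\)\s*;\s*$")


def parse_ambiguous(src):
    """Return every syntactic reading of `x (y);` without any symbol table."""
    match = STMT_RE.match(src)
    if match is None:
        raise SyntaxError("only statements shaped like `x (y);` are supported")
    head, inner = match.groups()
    return [Decl(head, inner), Call(head, inner)]


def resolve(candidates, type_names):
    """Later pass: keep only the readings consistent with the known type names."""
    kept = []
    for candidate in candidates:
        if isinstance(candidate, Decl) and candidate.type_name in type_names:
            kept.append(candidate)
        elif isinstance(candidate, Call) and candidate.func_name not in type_names:
            kept.append(candidate)
    return kept
```

Here the parser itself stays context-free and simple; the context (which names are types) is applied once, after parsing, instead of being threaded through the lexer.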

But parsing isn't really the problem.

Once you parse, you will want to do something with the AST/parse tree. In practice, you need to know, for every identifier, what its definition is and where it is used ("name and type resolution", sloppily, building symbol tables). This turns out to be a LOT more work than getting the parser right, compounded by inheritance, interfaces, overloading and templates, and then confounded by the fact that the semantics for all this is written in informal natural language spread across tens to hundreds of pages of the language standard. C++ is really bad here. Java 7 and 8 are getting to be pretty awful from this point of view. (And symbol tables aren't all you need; see my bio for a longer essay on "Life After Parsing").
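
As a rough picture of what "building symbol tables" means at its very simplest, here is a sketch of a nested-scope table that records where each name is defined and where it is used. The Scope class and its methods are hypothetical. It deliberately ignores everything that makes the real job hard (inheritance, interfaces, overloading, templates), so treat it as the starting point, not the work itself.

```python
class Scope:
    """One lexical scope; lookups fall back to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}  # name -> {"kind", "defined_at", "uses"}

    def define(self, name, kind, line):
        """Record a definition of name in this scope."""
        if name in self.symbols:
            raise NameError(f"{name} is already defined in this scope")
        self.symbols[name] = {"kind": kind, "defined_at": line, "uses": []}

    def lookup(self, name):
        """Find the innermost visible definition of name, or None."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None

    def use(self, name, line):
        """Resolve a use of name and record where it happened."""
        entry = self.lookup(name)
        if entry is None:
            raise NameError(f"{name} used at line {line} but never defined")
        entry["uses"].append(line)
        return entry
```

A use of foo inside a function body resolves outward to the enclosing scope unless the body defines its own foo, in which case the inner definition shadows the outer one.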

Most folks struggle with the pure parsing part (often never finishing; check SO itself for the many, many questions about how to build working parsers for real languages), so they don't ever see life after parsing. And then we get folk theorems about what is hard to parse and no signal about what happens after that stage.

Fixing the C++ syntax won't get you anywhere.

Regarding changing the C++ syntax: you'll find you need to patch a lot of places to take care of the variety of local and real ambiguities in any C++ grammar. If you insist, the following list might be a good starting place. I contend there is no point in doing this if you are not the C++ standards committee; if you did so, and built a compiler using that, nobody sane would use it. There's too much invested in existing C++ applications to switch for the convenience of the guys building parsers; besides, their pain is over and existing parsers work fine.

You may want to write your own parser. OK, that's fine; just don't expect the rest of the community to let you change the language they must use to make it easier for you. They all want it easier for them, and that means using the language as documented and implemented.
