使用C语言的非上下文无关语言示例? [英] Examples of Non Context free language in C language?

查看:112
本文介绍了使用C语言的非上下文无关语言示例?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

使用C语言的非上下文无关语言的例子有哪些?以下非CFL如何用C语言存在?

What are examples of non - context free languages in C language ? How the following non-CFL exists in C language ?


a)L1 = {wcw | w为{a,b} *}

a) L1 = {wcw|w is {a,b}*}

b)L2 = {a ^ nb ^ mc ^ nd ^ m | n,m> = 1}

b) L2 = {a^n b^m c^n d^m| n,m >=1}


推荐答案

问题措辞很笨拙,所以我字里行间,在这里阅读。

The question is clumsily worded, so I'm reading between the lines, here. Still, it's a common homework/study question.

C语法中通常出现的各种歧义[1]不会呈现语言。非上下文无关。 (实际上,它们甚至不会使语法具有非上下文无关的语法。)一般规则如果看起来像一个声明,则是一个声明,而不考虑其他可能的语法分析可能可以用非常复杂的上下文无关的语法进行整理。 (虽然并不是100%很明显是真的,因为CFG不会在交叉点或差异下闭合),但是使用更简单的CFG解析然后根据声明规则进行歧义比较容易。

The various ambiguities [1] in the C grammar as normally presented do not render the language non-context-free. (Indeed, they don't even render the grammars non-context-free.) The general rule "if it looks like a declaration, it's a declaration regardless of other possible parses" can probably be codified in a very complex context-free grammar (although it's not 100% obvious that that is true, since CFGs are not closed under intersection or difference), but it's easier to parse with a simpler CFG and then disambiguate according to the declaration rule.

现在,关于C(和大多数编程语言)的要点是,该语言的语法比用于解释目的的BNF复杂得多。例如,如果使用一个未定义的变量,则C程序格式不正确。这是语法错误,但CFG解析器未检测到。由于语言的语法复杂,定义这些情况所需的语法形式非常复杂,但是归根结底是要求id在有效程序中出现两次。因此, L1 = {wcw | w是{a,b} +} (此处 w 是标识符,而 c 太复杂了,无法说明。实际上,检查此要求通常是通过符号表来完成的,而形式语言规则虽然精确,却不是用逻辑形式主义编写的。由于 L1 不是上下文无关的语言,因此形式主义也可能不是上下文无关的,但是上下文敏感的语法可以识别 L1 ,因此并非完全不可能。 (例如,参见 Algol 68 。)

Now, the important point about C (and most programming languages) is that the syntax of the language is quite a bit more complex than the BNF used for explanatory purposes. For example, a C program is not well-formed if a variable is used without being defined. That's a syntax error, but it's not detected by the CFG parser. The grammatical productions needed to define these cases are quite complicated, due to the complicated syntax of the language, but they're going to boil down to requiring that ids appear twice in a valid program. Hence L1 = {wcw|w is {a,b}+} (here w is the identifier, and c is way too complicated to spell out). In practice, checking this requirement is normally done with a symbol table, and the formal language rules, while precise, are not written in a logical formalism. Since L1is not a context-free language, the formalism could not be context-free, but a context-sensitive grammar can recognize L1, so it's not totally impossible. (See, for example, Algol 68.)

符号表还用于确定是否将特定的标识符简化为 typedef-name [2]。这是解决语法中许多歧义所必需的。 (这也进一步限制了该语言的字符串集,因为在某些情况下,必须将 identifier 解析为 typedef-name 才能使程序有效。)

The symbol table is also used to decide whether a particular identifier is to be reduced to typedef-name [2]. This is required to resolve a number of ambiguities in the grammar. (It also further restricts the set of strings in the language, because there are some cases where an identifier must be resolved as a typedef-name in order for the program to be valid.)

对于另一种上下文相关类型,函数调用需要与中的函数声明匹配参数数量;这种需求可以用 L2 = {a ^ n b ^ m c ^ n d ^ m |来建模。 n,m> = 1} 其中, a c 表示定义和使用的功能,而 b d 代表不同功能的定义和使用。 (再次,以高度简化的形式。)

For another type of context-sensitivity, function calls need to match function declarations in the number of arguments; this sort of requirement is modelled by L2 = {a^n b^m c^n d^m| n,m >=1} where a and c represent the definition and use of some function, and b and d represent the definition and use of a different function. (Again, in a highly-simplified form.)

第二个条件可能不太清楚是一个语法要求。其他语言(例如Python)允许使用任意数量的参数进行函数调用,并将参数/参数计数匹配检测为仅在运行时检测到的语义错误。但是,对于C语言,不匹配显然是语法错误。

This second requirement is possibly less clearly a syntactic requirement. Other languages (Python, for example) allow function calls with any number of arguments, and detect a argument/parameter count match as a semantic error only detected at runtime. In the case of C, however, a mismatch is clearly a syntax error.

简而言之,构成C语言的语法有效字符串集是C语言的适当子集。由C语言定义中的CFG识别的一组字符串;有效解析集是CFG生成的派生集的适当子集,并且语言本身是(a)明确的,并且(b)不是上下文无关的。

In short, the set of grammatically valid strings which constitute the C language is a proper subset of the set of strings recognized by the CFG presented in the C language definition; the set of valid parses is a proper subset of the set of derivations generated by the CFG, and the language itself is (a) unambiguous, and (b) not context-free.

注1:其中大多数并不是真正的模棱两可,因为它们取决于给定的标识符的解析方式(typedef名称,函数标识符,声明的变量,... )。

Note 1: Most of these are not really ambiguities, because they depend upon how a given identifier is resolved (typedef name, function identifier, declared variable,...).

注2:不是将标识符解析为 typedef-name (如果恰好是一个);仅在可能减少的地方发生。对类型和变量使用相同的标识符,即使在相同的范围内也不是语法错误。 (这不是一个好主意,但这是有效的。)下面的示例改编自标准6.7.8节中的示例,展示了 t 的用法字段名称和typedef:

Note 2: It is not the case that identifier must be resolved as a typedef-name if it happens to be one; that only happens in places where the reduction is possible. It is not a syntax error to use the same identifier for both a type and a variable, even in the same scope. (It's not a good idea, but it's valid.) The following example, adapted from an example in section 6.7.8 of the standard, shows the use of t as both a field name and a typedef:

typedef signed int t;
struct tag {
    unsigned t:4;  // field named 't' of type unsigned int
    const t:5;     // unnamed field of type 't' (signed int)
};

这篇关于使用C语言的非上下文无关语言示例?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆