为什么需要端子?我的解决方案是否足够? [英] Why the need for terminals? Is my solution sufficient enough?

查看:69
本文介绍了为什么需要端子?我的解决方案是否足够?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在努力弄清上下文无关语法,我想我已经接近了。使我感到困惑的是这个问题(我一个月内要参加考试时正在做练习题):

I'm trying to get my head around context free grammars and I think I'm close. What is baffling me is this one question (I'm doing practise questions as I have an exam in a month's time):

我想出了这种语言,但我认为这是错误的。

I've come up with this language but I believe it's wrong.

S --> aSb | A | B
A --> aA | Σ
B --> bB | Σ

显然,这是正确的解决方案:

Apparently this is the correct solution:

S --> aSb | aA | bB
A --> aA | Σ
B --> bB | Σ

我不太明白为什么我们有 S->抗体aA | bB 而不仅仅是 S->抗体A | B 。终端需要什么?我不能只叫A而是那样抓住终端吗?

What I don't quite understand is why we have S --> aSb | aA | bB and not just S --> aSb | A | B. What is the need for the terminals? Can't I just call A instead and grab my terminals that way?

测试看看是否可以生成字符串:aaabbbb

S --> aSb --> aaSbb --> aaaSbbb --> aaaBbbb --> aaabbbb

我相信我可以正确生成字符串,但是我不确定。我告诉自己 S->的原因;抗体aA | bB 是如果我们从 aA 开始,然后将 A 替换为 a ,我们有两个 a 可以为我们提供正确的字符串,因为它们不相等,这可以用b来完成好。

I believe I generate the string correctly, but I'm not quite sure. I'm telling myself that the reason for S --> aSb | aA | bB is that if we start with aA and then replace A with a, we have two a's which gives us our correct string as they're not equal, this can be done with b as well. Any advice is greatly appreciated.

进入元组(G-4元组)

V (无端子)= {A,B}

Σ(端子)= {a,b}

P = {} //不太确定如何用R表达我的解决方案?我需要使用测试字符串吗?

S = A

V (None terminals) = {A, B}
Σ (Terminals) = {a, b}
P = { } // not quite sure how to express my solution in R? Would I have to use a test string to do so?
S = A

推荐答案

第一个:



  1. Σ表示语言符号。用您的语言Σ= {a,b}

  2. ^ 表示空符号(理论上, ^ 不是任何语言符号的成员)

  3. ε表示空字符串(理论上,ε 可以是是某种语言的成员)

  1. Σ means language symbols. in your language Σ = {a, b}
  2. ^ means null symbols (it is theoretical, ^ is not member of any language symbol)
  3. ε means empty string (it is theoretical, ε can be a member of some language)


请参见 ^ 符号没有任何意义,但我们仅将其用于理论目的,例如我们在数学中使用的无限符号(实际上没有数字是,但我们用它来理解,以证明一些定理)类似地, ^ 没什么,但我们使用它。

这点没有写在任何书中,我是为了解释/理解观点而写的。

See ^ symbol means nothing but we use it just for theoretical purpose, like infinity symbol we uses in mathematics(really no number is but we use it to understand, to proof some theorems) similarly ^ is nothing but we use it.
this point not written in any book, I am writing it to explain/for understanding point of view. The subject more belongs to theoretical and math and I am from computer science.

正如您所说,您的语法是 L = {a m b n | m!= n} 。假设产量如下:

第一:

As you says your grammar is L = {am bn | m != n}. Suppose if productions are as follows:
First:

S --> aSb | A | B
A --> aA | Σ
B --> bB | Σ

这意味着。(非常稀有的书可能会使用Σ

It means.(very rare book may use Σ in grammar rules)

S --> aSb | A | B
A --> aA | a | b
B --> bB | a | b

我用 a代替了Σ| b a b 语言符号)。

I replaced Σ by a | b (a, b language symbols).

此语法可以生成相等数量的符号 a 和符号 b a n b n )。如何生成 a n b n ?参见下面的示例推导:

This grammar can generates a string of equal numbers of symbols a and symbol b(an bn). How it can generate an bn? See below an example derivation:

S ---> aSb ---> aAb ---> aaAb ---> aabb  
   ^        ^        ^           ^
rule-1      S-->A    A--> aA     A --> b

但是这些字符串在语言L中是不可能的,因为m!= n。

But these kind of strings are not possible in language L because m != n.

第二个:

出于同样的原因,生产规则 S->抗体aA |如果 A-> bB 也不是正确的语法。 aA | Σ B-> bB | Σ在语法上。

Second:
For the same reason production rules S --> aSb | aA | bB is also not correct grammar if A --> aA | Σ or B --> bB | Σ are in grammar.

我认为在第二种语法中,您的意思是:

I think in second grammar you mean:

S --> aSb | aA | bB
A --> aA  | ^
B --> bB  | ^ 

这是语言 L = {a m b n | m!= n} 。因为使用:

Then this is correct grammar for language L = {am bn | m != n}. Because using:

S --> aSb  

您只能生成相等数量的 a ' b ,并替换为 S aA bB 的句子形式,其中 a个数不等 b 符号存在,并且不能转换回以生成 a n b n 。 (因为 A 不会生成 b ,而 B 会生成't生成 a )。

you can only generate equal numbers of a' and b and by replacing S either by aA or by bB you make a sentential form in which unequal numbers of a and b symbols are present and that can't convert back to generate a string of type an bn. (since A doesn't generates b and B doesn't generates a).

第三:

但是通常我们会写如下语法规则:

Third:
But usually we write grammar rules like:

S --> aSb | A | B
A --> aA | a
B --> bB | b

两种形式都是等效的(生成相同的语言 L = {a m b n | m!= n} ),因为一旦将 S 转换为任一 A B ,您必须生成至少一个 a b (或更多),因此约束m!= n成立。

Both forms are equivalent (generate same language L = {am bn | m != n}) because once you convert S into either A or B you have to generate at-least one a or b (or more) respectively and thus constraint m != n holds.

记住校对,无论两个语法是否相等,都是无法确定的问题。我们无法通过 算法来证明这一点(但在逻辑上是可行的,因为我们的大脑比处理器:P:更好)。

Remember proofing, whether two grammars are equivalent or not is undecidable problem. We can't prove it by algorithm (but logically possible, that works because we are human being having brain better then processor :P :) ).

第四:

最后,我还要添加语法:

Fourth:
At the end I would also like to add, Grammar:

S --> aSb | A | B
A --> aA | ^
B --> bB | ^

不产生 L = {a m b n | m!= n} ,因为我们可以生成 a n b n ,例如:

doesn't produces L = {am bn | m != n} because we can generate an bn for example:

S ---> aSb ---> aAb ---> ab
                     ^
                     A --> ^ 



正式语言语法



任何类形式语言可以由由四元组组成的形式语法表示。 code>(S,V,Σ,P)。 (请注意语法或自动机都是有限表示的,天气语言是有限或无限的:检查图形一个& 两个 )。

Grammar in formal languages

Any class of formal languages can be represented by a formal Grammar consisting of the four-tuple (S, V, Σ, P). (note a Grammar or an automata both are finite representation weather language is finite or infinite: Check figures one & two).


  1. Σ:一组有限的语言符号。

    在语法上,我们通常将其称为终端的有限集(与变量 V 形成对比)。语言符号或终端是事物,它使用哪些语言字符串(句子)来构造。在您的示例终端集中,Σ {a,b} 。用自然语言,您可以将终端与词汇或词典词相关联。

    自然语言是指我们说的印地语,英语

  1. Σ: Finite set of language symbols.
    In grammar we commonly call it finite set of terminals (in contrast of variables V). Language symbols or terminals are thing, using which language strings (sentences) are constructed. In your example set of terminals Σ is {a, b}. In natural language you can correlate terminals with vocabulary or dictionary words.
    Natural language means what we speak Hindi, English

V :有限的非终止词集合。

非终止词或说变量,应始终参与语法生成规则。 (否则该变量会计入无用的变量,即一个不会衍生出末尾或什么都不会得出的变量)。

请参阅:语法的最终目的是产生语言的字符串格式正确,因此每个变量都应该以某种方式有用。

在自然语言中,您可以将变量集与定义了语言特定语义属性的Noun / Verbs / Tens关联起来(像动词表示吃饭/睡觉,名词表示他/她/ Xiy等。)

注意:在某些书籍中可以找到 V∩ Σ=∅ ,表示变量不是终端。

V: Finite set of Non-terminals.
Non-terminal or say 'variable', should always participate in grammar production rules. (otherwise the variable counts in useless variables, that is a variable that doesn't derives terminals or nothing).
See: 'ultimate aim of grammar is to produce language's strings in correct form hence every variable should be useful in some way.
In natural language you can correlate variable set with Noun/Verbs/Tens that defined a specific semantical property of an language (like Verb means eating/sleeping, Noun means he/she/Xiy etc).
Note: One can find in some books V ∩ Σ = ∅ that means variables are not terminals.

S :开始变量。 ( S∈V

S 是一个特殊的变量符号,称为启动符号。如果只能从起始变量 S 派生出语法L(G),我们只能考虑它。如果不能从 S 派生一个字符串(即使它由语言符号Σ组成),则该字符串也不会考虑语法语言(实际上该字符串属于L(G)的补语,我们写补语 L ' * -L(G),检查:在使用常规语言的情况下的补充语言

S: Start Variable. (S ∈ V)
S is a special variable symbol, that is called 'Start Symbol'. We can only consider a string in language of grammar L(G) if it can be derived from Start variable S. If a string can not be derived from S (even if its consist of language symbols Σ) then string will not be consider in the language of grammar( actually that string belongs to 'complement language' of L(G), we writes complement language L' = Σ* - L(G) , Check: "the complement language in case of regular language")

P :有限的生产规则集

生产规则在α->中定义替换规则。 β,这意味着在从 S 导出字符串的过程中,随时可以从语法规则α(lhs)可以替换为β(rhs)。(这类似于Noun可以被他,她或Xiy替换,动词可以被替换为

生产规则定义了语言句子的形成规则,形式语言类似于自然语言,其模式是可以确定某种事物可以以某种形式发生-我们称其为语法

注意:在α->中;由于语法的这种能力,语法用于语法检查称为parse。 βαβ都由语言符号和终端符<$ c $组成c>(VUΣ)* 有一个约束,在α中,它们必须至少是一个变量。 (因为我们只能用rhs规则替换一个包含变量的字符串。一个终端不能被其他终端替代,或者我们可以说一个句子不能被其他句子替代

请记住:有两种形式句子形式和句子的字符串:

P: Finite set of Production Rules.
Production Rules defines replacement rules in the from α --> β, that means during the derivation of a string from S, from grammar rules at any time α (lhs) can be replaced by β (rhs).(this is similar to Noun can be replace by he,she or Xiy, and Verb can be replace by eating, sleeping etc in natural language.
Production rules defines formation rules of language sentences. Formal language are similar to Natural language having a pattern that is certain thing can occurs in certain form--that we call syntax in programming language. And because of this ability of grammar, grammar use for syntax checking called parse).
Note: In α --> β, α and β both are consists of language symbols and terminals (V U Σ)* with a constraint that in α their must be at-least one variable. (as we can replace only a string contain variable by rhs of rule. a terminal can't replace by other terminal or we can say a sentence can't be replaced by other sentence)
Remember: There is two form Sentential Form and Sentence of a string:


  • 句子: 如果所有符号都是结束符(句子可以是L(G)或补充语言 L ' * -L

  • 句子: (如果有任何符号变量) (不是语言字符串,而是派生字符串)

  • Sentence: if all symbols are terminals (sentence can be either in L(G) or in complement language L' = Σ* - L)
  • Sentential: if any symbol is variable (not a language string but derivation string)

来自@MAV(谢谢!):

表示上述语言的语法 L = {a m b n | m!= n} ,四元组是:

From @MAV (Thanks!!):
To represent grammar of above language L = {am bn | m != n}, 4-tuple are :


V = {S ,A,B}

Σ= {a,b}

P = {S->抗体A | B,A-> aA | a,B-> bB | a}

S = S

注意:通常,对于 P 归纳规则,通常使用 P ,您的书可能会使用 R 用于 r ules

note: Generally I use P for Production rules, your book may use R for rules



  • 大写字母用于变量,例如S,A,B的语法结构。

  • 终端使用的小写字母(语言符号),例如 a b
    使用某些时间数字,例如 0 1 。还 ^ 为空符号)。

  • 小写字母最后一次用于终端字符串 z y w x (例如,您可以在抽引引理中找到这些符号,
    符号用于语言字符串)或子字符串)。

  • α,β,γ for 句子形式

  • Σ符号。

  • Γ用于输入或输出抽头符号,然后是语言符号。

  • ^ 为空符号,图灵机和PDA中空白符号的符号( ^ 是其他语言符号。

  • ε用于空字符串(可以是语言字符串的一部分,例如 {} C 语言中为空体,您可以编写 while(1);
    while(1){} 均有效请参阅此处,我定义了一个有效的程序
    ,其中包含空句子
    )。

  • 表示在其中设置为空集合论

  • ΦΨ用于子字符串

  • Capital letters are uses for variables e.g. S, A, B in grammar construction.
  • Small letter from start uses for terminals(language symbols) for example a, b. (some time numbers like 0, 1 uses. Also ^ is null symbol).
  • Small letters form last uses for string of terminals z, y, w, x (for example you can find these notations in pumping lemma, symbols use for language string or sub strings).
  • α, β, γ for Sentential forms.
  • Σ for language symbols.
  • Γ for input or output tap symbol, other then language symbols.
  • ^ for null symbol, # or Symbol for blank symbol in Turing machine and PDA (^, #, are other then language symbols.
  • ε uses for empty string (can be a part of language string for example { } is empty body in C language, you can write while(1); or while(1){ } both are valid see here I have defined a valid program with empty sentences).
  • means empty set in set theory.
  • Φ, Ψ uses for substring in Sentential forms.

注意表示集合为空,ε表示字符串为空, ^ 表示没有符号(在理论上不要混用,语义上都不同

Note: means set is empty, ε means string is empty, ^ means none symbol (don't mix in theory, all are different in semantic)

我没有关于符号表示法的规则,但是一旦我在学习过程中观察到的大多数标准书籍中都能找到这些规则,它们就是常用的术语。

There is no rules I know about symbol notation, but these are commonly used terminology once can find in most standard books I observed during study.

下一篇文章:编写上下文无关语法的提示

这篇关于为什么需要端子?我的解决方案是否足够?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆