How can I determine if a given DTD is a subset of another?


Question

I need to verify that a "simplified" DTD is really a subset of a larger DTD, i.e., that every document that is valid according to the "simplified" DTD is also valid according to the larger (or "master") DTD.

The simplified DTD is being written now. It is derived from the master DTD (were it the other way around, one could simply include the smaller DTD in the larger one).

How can I determine whether the simplified DTD really is derived from the master DTD?

Recommended answer

DTDs are really just context-free grammars in disguise. A grammar G stands for the set of legal strings that make up the (unstated) language L(G) that the grammar defines.

What you are asking is tantamount to determining, given G1 and G2, whether L(G1) is a subset of L(G2). My language theory is getting rusty and I don't remember whether this is computable in general, but my guess is that it is really hard, because you have to demonstrate that an arbitrary derivation in G1 always has a derivation in G2.

You might be able to answer the narrower question of whether G1 is structured in such a way that you can demonstrate that L(G1) is a subset of L(G2) by showing that each element of G1 is compatible with the corresponding element of G2, essentially by showing that each grammar rule in G1 has a corresponding rule in G2 with elements dropped. Your idea of diffing DTDs seems to be along this line, with the proviso that if the diffs are large you are stuck with the general problem rather than the simpler one. At least the way you've characterized the problem (the simplified DTD is derived from the master DTD), I think you have a chance. The purpose of the diff would be to identify compatible rules by finding the least differences.
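The rule-by-rule diff described above can be sketched roughly as follows. This is only an illustration, not the tool discussed later in the answer: the function names `child_names` and `diff_rules` are made up for this example, declarations are pulled out with a regular expression rather than a real DTD parser, and only the names of child elements are compared (their order and the `?`, `*`, `+` operators are ignored). Note that a rule with a dropped child is only a candidate for compatibility; if the dropped child is required by the master rule, documents valid under the simplified rule are not valid under the master rule.

```python
import re

# Matches <!ELEMENT name content-model>; a real DTD parser would be more careful.
DECL = re.compile(r"<!ELEMENT\s+([\w.:-]+)\s*([^>]*)>")

def child_names(dtd_text):
    """Map each declared element to the child element names it mentions."""
    rules = {}
    for name, model in DECL.findall(dtd_text):
        rules[name] = [n for n in re.findall(r"#?[\w.:-]+", model)
                       # skip #PCDATA and the EMPTY/ANY keywords (assumes no
                       # element is literally named EMPTY or ANY)
                       if not n.startswith("#") and n not in ("EMPTY", "ANY")]
    return rules

def diff_rules(simple_text, master_text):
    """Report, per element of the simplified DTD, how its rule differs."""
    simple, master = child_names(simple_text), child_names(master_text)
    report = {}
    for name, kids in simple.items():
        if name not in master:
            report[name] = "no corresponding rule in master"
            continue
        dropped = [k for k in master[name] if k not in kids]
        added = [k for k in kids if k not in master[name]]
        if dropped or added:
            report[name] = {"dropped": dropped, "added": added}
    return report

master_dtd = """
<!ELEMENT order (name, address, items)>
<!ELEMENT items (item)+>
<!ELEMENT item (partnumber, quantity, unitprice)>
"""
simple_dtd = """
<!ELEMENT order (name, location, item)>
<!ELEMENT location (street)>
<!ELEMENT item (partnumber, unitprice)>
"""
for element, change in diff_rules(simple_dtd, master_dtd).items():
    print(element, change)
```

Small reports suggest the simple, rule-by-rule problem; large ones suggest you are back to the general one.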

If you have a grammar rule g2 = A ; and another g1 = A that you claim are related and that you'd like to check, you first have to demonstrate that the strings of tokens that A derives in G1 are a subset of the strings of tokens that A derives in G2. This looks just like the original unconstrained problem of comparing two languages; we're now just comparing the sublanguages for the two rules g1 and g2.
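For DTDs specifically, the right-hand side of a single element declaration is a regular expression over child element names, so the comparison of two individual rules (treating child names as opaque tokens) is decidable even though the general grammar problem is hard. The following is a minimal sketch of one way to do it, using Antimirov-style partial derivatives; it is not the approach of any tool mentioned in this answer. The names `parse_model`, `pderiv` and `model_included` are invented here, `ANY` and parameter entities are not handled, and `#PCDATA` is modeled as an optional token.

```python
import re

EPS = ("eps",)                      # matches the empty sequence

def seq(a, b):
    return b if a == EPS else ("seq", a, b)

def parse_model(text):
    """Parse a content model such as '(a, (b | c)*, d?)' into a small AST."""
    if text.strip() == "EMPTY":
        return EPS
    tokens = re.findall(r"[(),|?*+]|#?[\w.:-]+", text)
    pos = 0

    def item():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = group()
            pos += 1                               # consume the closing ")"
        elif tok == "#PCDATA":
            node = ("alt", ("sym", tok), EPS)      # text may be absent
        else:
            node = ("sym", tok)
        while pos < len(tokens) and tokens[pos] in ("?", "*", "+"):
            op = tokens[pos]
            pos += 1
            if op == "?":
                node = ("alt", node, EPS)
            elif op == "*":
                node = ("star", node)
            else:                                  # x+ is x followed by x*
                node = ("seq", node, ("star", node))
        return node

    def group():
        nonlocal pos
        node = item()
        while pos < len(tokens) and tokens[pos] in (",", "|"):
            kind = "seq" if tokens[pos] == "," else "alt"
            pos += 1
            node = (kind, node, item())
        return node

    return group()

def nullable(r):
    """True if r accepts the empty sequence."""
    if r[0] in ("eps", "star"):
        return True
    if r[0] == "sym":
        return False
    if r[0] == "seq":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])        # alt

def pderiv(r, x):
    """Set of partial derivatives of r with respect to the token x."""
    if r[0] == "eps":
        return set()
    if r[0] == "sym":
        return {EPS} if r[1] == x else set()
    if r[0] == "alt":
        return pderiv(r[1], x) | pderiv(r[2], x)
    if r[0] == "star":
        return {seq(p, r) for p in pderiv(r[1], x)}
    out = {seq(p, r[2]) for p in pderiv(r[1], x)}  # seq
    if nullable(r[1]):
        out |= pderiv(r[2], x)
    return out

def symbols(r):
    if r[0] == "sym":
        return {r[1]}
    return set().union(*(symbols(c) for c in r[1:] if isinstance(c, tuple)))

def model_included(simple_model, master_model):
    """True if every child sequence allowed by simple_model is allowed by master_model."""
    r1, r2 = parse_model(simple_model), parse_model(master_model)
    alphabet = symbols(r1)          # sequences of r1 only use r1's own tokens
    start = (r1, frozenset([r2]))
    seen, stack = {start}, [start]
    while stack:
        r, states = stack.pop()
        if nullable(r) and not any(nullable(s) for s in states):
            return False            # r1 accepts a sequence that r2 rejects
        for x in alphabet:
            nxt_states = frozenset(q for s in states for q in pderiv(s, x))
            for p in pderiv(r, x):
                if (p, nxt_states) not in seen:
                    seen.add((p, nxt_states))
                    stack.append((p, nxt_states))
    return True

print(model_included("(partnumber, unitprice)", "(partnumber, quantity, unitprice)"))
print(model_included("(partnumber, unitprice)", "(partnumber, quantity?, unitprice)"))
```

A check like this only settles one rule at a time; the child elements it mentions still have to be compared in turn.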

So now I think that, to make this practical, you have to insist that each subrule reachable from g1 is structurally compatible with a corresponding subrule in g2. You can probably write a recursive procedure to check this. What this procedure mostly needs as help is the collection of set operators (FirstOf, ...) that you tend to find in an LALR parser generator.
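A recursive procedure of this kind might look like the sketch below. It is an assumption-laden outline, not a finished checker: the names `parse_dtd`, `children`, `same_model` and `check` are invented for illustration, declarations are extracted with a regular expression, `ANY` is treated as having no children, and the per-rule test `same_model` is a deliberately conservative placeholder (textual equality of content models) that you would replace with a real compatibility test.

```python
import re

# Matches <!ELEMENT name content-model>; a real DTD parser would be more careful.
DECL = re.compile(r"<!ELEMENT\s+([\w.:-]+)\s*([^>]*)>")

def parse_dtd(text):
    """Map each declared element to its content model with whitespace removed."""
    return {name: re.sub(r"\s+", "", model) for name, model in DECL.findall(text)}

def children(model):
    """Child element names referenced by a content model."""
    if model in ("EMPTY", "ANY"):
        return []
    return [n for n in re.findall(r"#?[\w.:-]+", model) if not n.startswith("#")]

def same_model(simple_model, master_model):
    """Placeholder rule check: identical models. Safe but very conservative."""
    return simple_model == master_model

def check(name, simple, master, rule_ok=same_model, seen=None, problems=None):
    """Walk every element reachable from `name` in the simplified DTD and
    collect the rules that are not compatible with the master DTD."""
    seen = set() if seen is None else seen
    problems = [] if problems is None else problems
    if name in seen:                 # already visited; also guards against cycles
        return problems
    seen.add(name)
    if name not in simple:
        problems.append(f"{name}: referenced but not declared in simplified DTD")
    elif name not in master:
        problems.append(f"{name}: no corresponding rule in master DTD")
    elif not rule_ok(simple[name], master[name]):
        problems.append(f"{name}: {simple[name]} vs {master[name]}")
    for child in children(simple.get(name, "")):
        check(child, simple, master, rule_ok, seen, problems)
    return problems

master = parse_dtd("""
<!ELEMENT order (name, address, items)>
<!ELEMENT name ( #PCDATA )>
<!ELEMENT address ( #PCDATA )>
<!ELEMENT items (item)+>
<!ELEMENT item ( #PCDATA )>
""")
simple = parse_dtd("""
<!ELEMENT order (name, location, item)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT location (#PCDATA)>
<!ELEMENT item (#PCDATA)>
""")
for problem in check("order", simple, master):
    print(problem)
```

An empty problem list means every reachable rule passed `rule_ok`; with the placeholder check that only proves the subset property when the simplified DTD reuses the master's rules unchanged.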

On a different front, my company makes Smart Differencer tools, which compute deltas over language constructs in terms of language elements and editing operations on those elements. The tool is parameterized by language definitions. The Smart Differencer presently works for a variety of conventional languages (C, C++, C#, COBOL, Java, PHP, Python, ...). XML (and DTDs) are a language too, for which we have a language definition, and we've built an experimental XML Smart Differencer tool. It ought to work on DTDs just fine. Contact me offline (see bio) if you have further direct interest.

EDIT: Just for grins, I tried the following two DTDs, one derived from the other:

orderform.xml

<?xml version='1.0' ?>
<!DOCTYPE orderform [

<!ELEMENT orderform (name,company,address,items) >
<!ELEMENT name ( firstname, lastname )>
<!ELEMENT firstname ( #PCDATA )>
<!ELEMENT lastname ( #PCDATA )>
<!ELEMENT company ( #PCDATA )>
<!ELEMENT address ( street, city, country )>
<!ELEMENT street ( #PCDATA )>
<!ELEMENT city ( #PCDATA )>
<!ELEMENT country ( zipcode | nation )>
<!ELEMENT zipcode ( #PCDATA )>
<!ELEMENT nation ( #PCDATA )>
<!ELEMENT items (item)+ >
<!ELEMENT item ( partnumber, quantity, unitprice)>
<!ELEMENT partnumber ( #PCDATA )>
<!ELEMENT quantity ( #PCDATA )>
<!ELEMENT unitprice  ( #PCDATA )>
]>

<done/>

orderform2.xml

<?xml version='1.0' ?>
<!DOCTYPE orderform [

<!ELEMENT orderform (name,company,location,item) >
<!ELEMENT name ( firstname, lastname )>
<!ELEMENT firstname ( #PCDATA )>
<!ELEMENT lastname ( #PCDATA )>
<!ELEMENT company ( #PCDATA )>
<!ELEMENT location ( street, city, country )>
<!ELEMENT street ( #PCDATA )>
<!ELEMENT city ( #PCDATA )>
<!ELEMENT country ( zipcode | nation )>
<!ELEMENT zipcode ( #PCDATA )>
<!ELEMENT nation ( #PCDATA )>
<!ELEMENT item ( partnumber, unitprice)>
<!ELEMENT partnumber ( #PCDATA )>
<!ELEMENT quantity ( #PCDATA )>
<!ELEMENT unitprice  ( #PCDATA )>
]>

<done/>

[See if you can spot the differences yourself first :-)]

Then I ran the XML SmartDifferencer:

C:\DMS\Domains\XML\Analyzers\SmartDifferencer\Source>DMSSmartDifferencer XML -SuppressSourceCodeForRenamings C:\DMS\Domains\XML\Tools\DTD2COBOL\orderform.xml C:\DMS\Domains\XML\Tools\DTD2COBOL\orderform2.xml
Copyright (C) 2009 Semantic Designs; All Rights Reserved
XML SmartDifferencer Version 1.1.1
Copyright (C) 2009 Semantic Designs, Inc; All Rights Reserved; SD Confidential
Powered by DMS (R) Software Reengineering Toolkit
*** Unregistered SmartDifferencer Version 1.1
*** Operating with evaluation limits.

*** Parsing file C:/DMS/Domains/XML/Tools/DTD2COBOL/orderform.xml ...
*** Parsing file C:/DMS/Domains/XML/Tools/DTD2COBOL/orderform2.xml ...
*** Creating suffix tree ...
*** Determining maximal pairs ...
*** Sorting maximal pairs ...
*** Determining differences ...
*** Printing edits ...
Rename 4.1-9.44 to 4.1-9.45 with 'address'->'location' and 'items'~>'item'
Delete 15.1-15.25 merging 15.18-15.21 into 4.44-4.47
<<!ELEMENT items (item)+ >
Delete 16.30-16.38 merging 16.30-16.38 into 15.18-15.28 with 'quantity'~>'partnumber'
<                             quantity,

Yep, that's what I did to get the derived one. (The notation N.M means "line N, column M".)
