SGML to XML conversion


Question

I have the following sample SGML data from my .sgm file, and I want to convert it to XML:

<?dtd name="viewed">
<?XMLDOC>
<viewed >xyz
<cite>
<yr>2010
<pno cite="2010 abc 1188">10
<?/XMLDOC>

<?XMLDOC>
<viewed>abc.
<cite>
<yr>2010
<pno cite="2010 xyz 5133">9
<?/XMLDOC>

The output should look like this:

<index1>
    <num viewed="xyz"/>
    <heading>xyz</heading>
    <index-refs>
      <link caseno="2010 abc 1188"/>
    </index-refs>
</index1>
<index1>
    <num viewed="abc"/>
    <heading>abc</heading>
    <index-refs>
      <link caseno="2010 xyz 5133"/>
    </index-refs>
</index1>

Can this be done in C#, or can we use XSLT 2.0 for this kind of conversion?

Recommended answer

Others have already given some good advice. Here is one way of putting it all together: first convert the input SGML to well-formed XML, then use XSLT to transform that into the exact format you need.

Converting the SGML to well-formed XML

The osx tool from the OpenSP package (http://openjade.sourceforge.net/), suggested by mzjn, is a good tool for this. Because your SGML markup omits end tags, you need a DTD from which the correct nesting of elements can be determined. If you don't have a DTD, you need to create one. For your example input, it could be as simple as this:

<!ELEMENT toplevel o o (viewed)+>

<!ELEMENT viewed - o (#PCDATA,cite)>
<!ELEMENT cite - o (yr,pno)>
<!ELEMENT yr - o (#PCDATA)>
<!ELEMENT pno - o (#PCDATA)>

<!ATTLIST pno cite CDATA #REQUIRED>

You also need to add a proper doctype declaration to the beginning of your SGML file. Assuming your DTD is in the file viewed.dtd, the declaration is:

<!DOCTYPE toplevel SYSTEM "viewed.dtd" >

With this addition, you should now be able to use osx to convert the SGML to XML. (It cannot convert the processing instructions that start with a /, because those are not allowed in XML, and it will emit a warning about them.)

osx input.sgm > input.xml
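The two preparation steps above (prepend the doctype declaration, then run osx) could be scripted. The following is only a minimal sketch: the function names and the ".prepared.sgm" suffix are hypothetical, and it assumes osx is on the PATH and viewed.dtd sits next to the input file.

```python
import subprocess
from pathlib import Path

# Doctype declaration from the answer; viewed.dtd is assumed to be
# in the directory osx resolves system identifiers against.
DOCTYPE = '<!DOCTYPE toplevel SYSTEM "viewed.dtd" >'


def with_doctype(sgml_text: str, doctype: str = DOCTYPE) -> str:
    """Return sgml_text with the doctype prepended, unless it already starts with one."""
    if sgml_text.lstrip().upper().startswith("<!DOCTYPE"):
        return sgml_text
    return doctype + "\n" + sgml_text


def sgml_to_xml(src: Path, dst: Path) -> None:
    """Write a doctype-prefixed copy of src and run osx on it (hypothetical helper)."""
    prepared = src.with_name(src.stem + ".prepared.sgm")
    prepared.write_text(with_doctype(src.read_text()))
    with dst.open("w") as out:
        # check=False: this sketch leaves handling of osx's exit status
        # and warnings to the caller.
        subprocess.run(["osx", str(prepared)], stdout=out, check=False)
```

Calling sgml_to_xml(Path("input.sgm"), Path("input.xml")) would then play the role of the command line above, without editing the original file by hand.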

Transforming the resulting XML to the desired format

For the above case, you could use something like the following XSLT stylesheet:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="VIEWED">
    <index1>
      <num viewed="{normalize-space(text())}"/>
      <heading>
        <xsl:value-of select="normalize-space(text())"/>
      </heading>
      <index-refs>
        <xsl:apply-templates select="CITE"/>
      </index-refs>
    </index1>
  </xsl:template>

  <xsl:template match="CITE">
    <link caseno="{PNO/@CITE}"/>
  </xsl:template>

</xsl:stylesheet>
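If no XSLT processor is at hand, the same mapping can be written directly against the converted XML. The sketch below is not XSLT; it mirrors the stylesheet's logic with Python's standard xml.etree.ElementTree. The SAMPLE string is an assumed shape for the osx output (upper-case names, as the stylesheet expects), not the result of a real run, and the "indexes" wrapper element and the build_index name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the osx output; not taken from an actual osx run.
SAMPLE = """<TOPLEVEL>
<VIEWED>xyz
<CITE>
<YR>2010</YR>
<PNO CITE="2010 abc 1188">10</PNO>
</CITE>
</VIEWED>
<VIEWED>abc.
<CITE>
<YR>2010</YR>
<PNO CITE="2010 xyz 5133">9</PNO>
</CITE>
</VIEWED>
</TOPLEVEL>"""


def build_index(xml_text: str) -> ET.Element:
    """Build one <index1> per VIEWED element, like the stylesheet's templates."""
    source = ET.fromstring(xml_text)
    root = ET.Element("indexes")  # hypothetical wrapper so the result has one root
    for viewed in source.iter("VIEWED"):
        # Text before the first child, whitespace-collapsed,
        # comparable to normalize-space(text()) in the stylesheet.
        title = " ".join((viewed.text or "").split())
        index1 = ET.SubElement(root, "index1")
        ET.SubElement(index1, "num", {"viewed": title})
        ET.SubElement(index1, "heading").text = title
        refs = ET.SubElement(index1, "index-refs")
        for cite in viewed.findall("CITE"):
            pno = cite.find("PNO")
            caseno = pno.get("CITE", "") if pno is not None else ""
            ET.SubElement(refs, "link", {"caseno": caseno})
    return root


result = build_index(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```

Like the stylesheet, this keeps the trailing period of "abc." in the heading; stripping it to match the requested output would need an extra rule in either version.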
