How to use Saxon's built-in catalog feature

Question

I downloaded SaxonHE9-4-0-6J and want to process XHTML from the command line. However, Saxon tries to load the DTD from W3C, so even a simple command takes far too long.

I have an XML catalog, which I use successfully with xmllint by setting an environment variable that points to the catalog file, but I have no idea how to make Saxon use it. Googling turns up the whole history of changes to catalog support in Saxon (hence the confusion), and none of the approaches I found worked for me.

I downloaded resolver.jar and added it to my CLASSPATH, but I can't make Saxon use it. After trying various combinations, I followed http://www.saxonica.com/documentation/sourcedocs/xml-catalogs.xml and used only the -catalog option, like this:

-catalog:path-to-my-catalog

(I tried both a URI and a regular path), without setting the -r, -x, or -y switches, but Saxon doesn't see it. I get this error:


Query processing failed: Failed to load Apache catalog resolver library

resolver.jar is on my classpath, and I can run it from the command line:

C:\temp>java org.apache.xml.resolver.apps.resolver
Usage: resolver [options] keyword

Where:

-c catalogfile  Loads a particular catalog file.
-n name         Sets the name.
-p publicId     Sets the public identifier.
-s systemId     Sets the system identifier.
-a              Makes the system URI absolute before resolution
-u uri          Sets the URI.
-d integer      Set the debug level.
keyword         Identifies the type of resolution to perform:
                doctype, document, entity, notation, public, system,
                or uri.

On the other hand, the Saxon archive itself already includes the XHTML DTD and various other DTDs, so there must be a simple way out of this frustration.

How can I run Saxon from the command line and instruct it to use local DTDs?

Recommended answer

From the Saxonica link in your question:


When the -catalog option is used on the command line, this overrides the internal resolver used in Saxon (from 9.4) to redirect well-known W3C references (such as the XHTML DTD) to Saxon's local copies of these resources. Because both these features rely on setting the XML parser's EntityResolver, it is not possible to use them in conjunction.

This sounds to me like Saxon automatically uses local copies of the well-known W3C DTDs, but if you specify -catalog, it no longer uses the internal resolver, so you have to map those DTDs explicitly in your catalog.
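To make the EntityResolver mechanism mentioned in the quote more concrete, here is a minimal sketch using only the JDK's own JAXP/SAX classes. It is not Saxon's implementation and does not use resolver.jar; the class name LocalDtdResolverSketch, the method readFoo, and the base URI file:///so_test/test.xml are all hypothetical, and the DTD is held in a string rather than looked up in a catalog file. It only illustrates that a parser has a single EntityResolver slot, which is why one resolver replaces another instead of combining with it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class LocalDtdResolverSketch {

    // Assumption: one public identifier mapped to DTD text kept in memory.
    static final String PUBLIC_ID = "-//TEST//Dan Test//EN";
    static final String LOCAL_DTD =
        "<!ELEMENT test (foo)>\n<!ELEMENT foo (#PCDATA)>\n";

    // The system identifier deliberately points at a location that does not exist.
    static final String SAMPLE_XML =
        "<!DOCTYPE test PUBLIC \"-//TEST//Dan Test//EN\" \"dir_that_doesnt_exist/test.dtd\">\n"
        + "<test>\n    <foo>Success!</foo>\n</test>\n";

    static String readFoo(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation, comparable to Saxon's -dtd
        DocumentBuilder builder = factory.newDocumentBuilder();

        // The parser has exactly one resolver slot; setting it replaces any previous one.
        EntityResolver resolver = (publicId, systemId) -> {
            if (!PUBLIC_ID.equals(publicId)) {
                return null; // unknown entity: let the parser fetch it the default way
            }
            InputSource dtd = new InputSource(new StringReader(LOCAL_DTD));
            dtd.setPublicId(publicId);
            dtd.setSystemId(systemId);
            return dtd;
        };
        builder.setEntityResolver(resolver);

        // Turn validation errors into exceptions instead of printed warnings.
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId("file:///so_test/test.xml"); // hypothetical base URI
        Document doc = builder.parse(source);
        return doc.getElementsByTagName("foo").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFoo(SAMPLE_XML));
    }
}
```

A catalog-based resolver does the same job, except that the mapping from public identifier to local file comes from the catalog rather than from a constant in the code.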

Here's a working example of using a catalog with Saxon.

File/directory structure of my example

C:/so_test/lib
C:/so_test/lib/catalog.xml
C:/so_test/lib/resolver.jar
C:/so_test/lib/saxon9he.jar
C:/so_test/lib/test.dtd
C:/so_test/test.xml

XML DTD (so_test/lib/test.dtd)

<!ELEMENT test (foo)>
<!ELEMENT foo (#PCDATA)>

XML instance (so_test/test.xml)

Note that the system identifier points to a location that doesn't exist, to make sure the catalog is actually being used.

<!DOCTYPE test PUBLIC "-//TEST//Dan Test//EN" "dir_that_doesnt_exist/test.dtd">
<test>
    <foo>Success!</foo>
</test>

XML catalog (so_test/lib/catalog.xml)

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <group prefer="public" xml:base="file:///C:/so_test/lib">
        <public publicId="-//TEST//Dan Test//EN" uri="lib/test.dtd"/>
    </group>
</catalog>

Command line

Note the -dtd option to enable validation.

C:\so_test>java -cp lib/saxon9he.jar;lib/resolver.jar net.sf.saxon.Query -s:"test.xml" -qs:"<results>{data(/test/foo)}</results>" -catalog:"lib/catalog.xml" -dtd

Result

<results>Success!</results>

If I make the XML instance invalid:

<!DOCTYPE test PUBLIC "-//TEST//Dan Test//EN" "dir_that_doesnt_exist/test.dtd">
<test>
    <x/>
    <foo>Success!</foo>
</test>

and run the same command line as above, here is the result:

Recoverable error on line 4 column 6 of test.xml:
  SXXP0003: Error reported by XML parser: Element type "x" must be declared.
Recoverable error on line 6 column 8 of test.xml:
  SXXP0003: Error reported by XML parser: The content of element type "test" must match "(foo)".
Query processing failed: The XML parser reported two validation errors

Hopefully this example will help you figure out what to change in your setup.

Also, the -t option gives you additional information, such as which catalog was loaded and whether the public identifier was resolved:

Loading catalog: file:///C:/so_test/lib/catalog.xml
Saxon-HE 9.4.0.6J from Saxonica
Java version 1.6.0_35
Analyzing query from {<results>{data(/test/foo)}</results>}
Analysis time: 122.70132 milliseconds
Processing file:/C:/so_test/test.xml
Using parser org.apache.xml.resolver.tools.ResolvingXMLReader
Building tree for file:/C:/so_test/test.xml using class net.sf.saxon.tree.tiny.TinyBuilder
Resolved public: -//TEST//Dan Test//EN
        file:/C:/so_test/lib/test.dtd
Tree built in 0 milliseconds
Tree size: 5 nodes, 8 characters, 0 attributes
<?xml version="1.0" encoding="UTF-8"?><results>Success!</results>Execution time: 19.482079ms
Memory used: 20648808

Additional information

Saxon distributes the Apache version of Xerces, so use the resolver.jar found in the Apache Xerces distribution.
