An error has occurred opening external DTD (w3.org, xhtml1-transitional.dtd): 503 Server Unavailable


Question

I'm trying to run XPath queries over an XHTML document, using .NET 3.5.

The document looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
   ....
  </head>
  <body>
    ...
  </body>
</html>

Because the document includes various character entities (&nbsp; and so on), I need the DTD in order to load it with an XmlReader. So my code looks like this:

var s = File.OpenRead(fileToRead);
var reader = XmlReader.Create(s, new XmlReaderSettings{ ProhibitDtd=false });

But when I run this, it fails with:

An error has occurred while opening external DTD 'http://www.w3.org/TR/xhtml1-transitional.dtd': The remote server returned an error: (503) Server Unavailable.

Now, I know why I am getting the 503 error. W3C explained it very clearly.

I've seen "workarounds" where people just disable DTD processing. That is what ProhibitDtd=true does, and it eliminates the 503 error.

But in my case that leads to other problems: the app doesn't get the entity definitions, so the document isn't well-formed XML. How can I validate with the DTD, and get the entity definitions, without hitting the w3.org website?

I think .NET 4.0 has a nifty built-in capability to handle this situation: the XmlPreloadedResolver. But I need a solution for .NET 3.5.
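For readers who are not tied to .NET 3.5, here is a minimal sketch of what the XmlPreloadedResolver route might look like on .NET 4.0 or later. The class and method names (PreloadedResolverSketch, ReadAllText) are made up for illustration; XmlPreloadedResolver, XmlKnownDtds.Xhtml10 and DtdProcessing.Parse are the framework pieces it relies on.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Resolvers;

// Hypothetical helper, not part of the original question.
public static class PreloadedResolverSketch
{
    // Parses an XHTML 1.0 string using the DTDs that ship inside the
    // framework, so no request to w3.org should be needed.
    public static string ReadAllText(string xhtml)
    {
        var settings = new XmlReaderSettings
        {
            // .NET 4.0+ replacement for ProhibitDtd = false
            DtdProcessing = DtdProcessing.Parse,
            // Preloaded with the XHTML 1.0 DTDs and entity sets
            XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10)
        };

        using (var reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            var doc = new XmlDocument();
            doc.Load(reader);
            // Entities such as &nbsp; arrive already expanded.
            return doc.DocumentElement.InnerText;
        }
    }
}
```

This does not help on .NET 3.5, where the System.Xml.Resolvers namespace is not available.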

Related:

- java.io.IOException: Server returned HTTP response code: 503

Answer

The answer is that I have to provide my own XmlResolver. I don't think one suitable for this is built in to .NET 3.5. That's baffling. It's also baffling that it has taken me this long to stumble onto this problem, and that I couldn't find anyone else who had already solved it.

OK, so: the XmlResolver. I created a new class derived from XmlResolver and overrode three key members: Credentials (set), ResolveUri and GetEntity.

public sealed class XhtmlResolver : XmlResolver
{
    public override System.Net.ICredentials Credentials
    {
        set { throw new NotSupportedException();}
    }

    public override object GetEntity(Uri absoluteUri, string role, Type t)
    {
       ...
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
      ...
    }
}

The documentation on this stuff is pretty skimpy, so I'll tell you what I learned. The class operates like this: the XmlReader calls ResolveUri first and then, given a resolved Uri, calls GetEntity. That method is expected to return an object of Type t (passed as a parameter). I have only ever seen it request a System.IO.Stream.
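To watch that call order for yourself, a small tracing resolver can help. The following is only a sketch: TracingResolver, the example.invalid URI and the tiny in-memory DTD are all invented for illustration, and the caller is assumed to pass in settings that already allow DTD processing (ProhibitDtd = false on .NET 3.5).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical resolver that records every call the reader makes.
public sealed class TracingResolver : XmlResolver
{
    // Made-up system identifier and DTD body, served from memory.
    public const string DtdUri = "http://example.invalid/tiny.dtd";
    private const string DtdText =
        "<!ELEMENT doc (#PCDATA)><!ENTITY nbsp '&#160;'>";

    public readonly List<string> Calls = new List<string>();

    public override System.Net.ICredentials Credentials { set { } }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        Calls.Add("ResolveUri: " + relativeUri);
        if (relativeUri == DtdUri) return new Uri(relativeUri);
        throw new XmlException("Cannot resolve " + relativeUri);
    }

    public override object GetEntity(Uri absoluteUri, string role, Type t)
    {
        Calls.Add("GetEntity: " + absoluteUri);
        return new MemoryStream(Encoding.UTF8.GetBytes(DtdText));
    }

    // Reads a tiny document through the resolver and returns the call log.
    // The caller supplies settings with DTD processing enabled.
    public static List<string> Trace(XmlReaderSettings settings)
    {
        var resolver = new TracingResolver();
        settings.XmlResolver = resolver;
        string xml = "<!DOCTYPE doc SYSTEM \"" + DtdUri + "\"><doc>a&nbsp;b</doc>";
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return resolver.Calls;
    }
}
```

Printing the returned list should show a ResolveUri entry for the DTD ahead of the matching GetEntity entry.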

My idea is to embed local copies of the XHTML 1.0 DTD and its dependencies into the assembly, using the csc.exe /resource option, and then retrieve the stream for that resource.

private System.IO.Stream GetStreamForNamedResource(string resourceName)
{
    Assembly a = Assembly.GetExecutingAssembly();
    return  a.GetManifestResourceStream(resourceName);
}

Pretty simple. This gets called from GetEntity().
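One thing to watch: GetManifestResourceStream returns null, rather than throwing, when the name does not match an embedded resource. A slightly more defensive variant might look like the sketch below; the EmbeddedResources class and its Open method are names invented here, not part of the original code.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper that fails loudly when a resource name is wrong.
public static class EmbeddedResources
{
    public static Stream Open(Assembly assembly, string resourceName)
    {
        Stream stream = assembly.GetManifestResourceStream(resourceName);
        if (stream == null)
        {
            // List what is actually embedded, which makes naming
            // mistakes from the /resource option easy to spot.
            string available = string.Join(", ", assembly.GetManifestResourceNames());
            throw new FileNotFoundException(
                "No embedded resource named '" + resourceName +
                "'. Available: " + available);
        }
        return stream;
    }
}
```

With csc.exe, the resource name is the file name unless an identifier is given after a comma in the /resource option.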

But I can improve on that. Instead of embedding the DTDs in plaintext, I gzipped them first, and then modified the above method like so:

private System.IO.Stream GetStreamForNamedResource(string resourceName)
{
    Assembly a = Assembly.GetExecutingAssembly();
    return  new System.IO.Compression.GZipStream(a.GetManifestResourceStream(resourceName), System.IO.Compression.CompressionMode.Decompress);
}

That code opens the stream for an embedded resource and returns a GZipStream configured for decompression. The reader gets the plaintext DTD.
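As a quick check that the reader really sees plaintext, here is a round-trip sketch using an in-memory buffer in place of the embedded resource. GzipRoundTrip and its two methods are illustrative names only.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo: compress some DTD text, then read it back
// through a decompressing GZipStream, as the resolver would.
public static class GzipRoundTrip
{
    public static byte[] Compress(string text)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gz = new GZipStream(buffer, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gz.Write(raw, 0, raw.Length);
            } // disposing the GZipStream flushes the gzip trailer
            return buffer.ToArray();
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (var gz = new GZipStream(new MemoryStream(compressed),
                                       CompressionMode.Decompress))
        using (var reader = new StreamReader(gz, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the real resolver the compressed bytes come from the embedded resource stream instead of a MemoryStream.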

What I wanted was to resolve only the URIs for the XHTML 1.0 DTDs. So I wrote ResolveUri and GetEntity to look for those specific DTDs and respond affirmatively only for them.

For an XHTML document with the DTD statement, the flow is like this:


  1. XmlReader calls ResolveUri with the public identifier for the XHTML DTD, which is "-//W3C//DTD XHTML 1.0 Transitional//EN". If the XmlResolver can resolve it, it should return a valid URI. If it cannot, it should throw. My implementation just throws for the public identifier.

  2. XmlReader then calls ResolveUri with the system identifier for the DTD, which in this case is "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd". In this case, the XhtmlResolver returns a valid Uri.

  3. XmlReader then calls GetEntity with that URI. XhtmlResolver grabs the embedded resource stream and returns it.

  4. The same thing happens for the dependencies: xhtml-lat1.ent, and so on. In order for the resolver to work, all of those files need to be embedded.

And yes, if the resolver cannot resolve a URI, it is expected to throw an exception. As far as I could see, this isn't officially documented, and it seems an egregious violation of the principle of least astonishment. If ResolveUri returns null instead, the XmlReader will call GetEntity on the null URI, which is hopeless.
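Putting the pieces together, a resolver along these lines might look like the sketch below. It is not the author's actual source: the class name XhtmlResolverSketch, the ".gz" resource names and the choice of XmlException are assumptions made for illustration. The four file names follow the files published alongside the XHTML 1.0 Transitional DTD.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml;

// Illustrative allow-list resolver; resource names are hypothetical.
public sealed class XhtmlResolverSketch : XmlResolver
{
    private const string DtdBase = "http://www.w3.org/TR/xhtml1/DTD/";

    // Known system identifiers mapped to embedded, gzipped resources.
    private static readonly Dictionary<string, string> Known =
        new Dictionary<string, string>
        {
            { DtdBase + "xhtml1-transitional.dtd", "xhtml1-transitional.dtd.gz" },
            { DtdBase + "xhtml-lat1.ent", "xhtml-lat1.ent.gz" },
            { DtdBase + "xhtml-symbol.ent", "xhtml-symbol.ent.gz" },
            { DtdBase + "xhtml-special.ent", "xhtml-special.ent.gz" }
        };

    public override System.Net.ICredentials Credentials { set { } }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        Uri candidate;
        bool parsed;
        if (baseUri != null && baseUri.IsAbsoluteUri)
            parsed = Uri.TryCreate(baseUri, relativeUri, out candidate);
        else
            parsed = Uri.TryCreate(relativeUri, UriKind.Absolute, out candidate);

        if (parsed && Known.ContainsKey(candidate.AbsoluteUri))
            return candidate;

        // Throw rather than return null, including for public identifiers.
        throw new XmlException("Not an XHTML 1.0 DTD resource: " + relativeUri);
    }

    public override object GetEntity(Uri absoluteUri, string role, Type t)
    {
        string resourceName;
        if (absoluteUri == null ||
            !Known.TryGetValue(absoluteUri.AbsoluteUri, out resourceName))
            throw new XmlException("Unexpected URI: " + absoluteUri);

        Stream raw = typeof(XhtmlResolverSketch).Assembly
            .GetManifestResourceStream(resourceName);
        if (raw == null)
            throw new FileNotFoundException("Missing resource " + resourceName);

        return new GZipStream(raw, CompressionMode.Decompress);
    }
}
```

Resolving relative names against the base URI matters because the DTD refers to its entity files by relative system identifiers such as "xhtml-lat1.ent".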

This works for me. It should work for anyone who does XML processing on XHTML from .NET. If you want to use this in your own applications, grab the DLL. The zip includes full source code. Licensed under the MS Public License.

You can plug it into your XML apps that fiddle with XHTML. Use it like this:

// for an XmlDocument...
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
doc.XmlResolver = new Ionic.Xml.XhtmlResolver();
doc.Load(xhtmlFile);

// for an XmlReader...
var xmlReaderSettings = new XmlReaderSettings
    {
        ProhibitDtd = false,
        XmlResolver = new XhtmlResolver()
    };
using (var stream = File.OpenRead(fileToRead))
{
    XmlReader reader = XmlReader.Create(stream, xmlReaderSettings);
    while (reader.Read())
    {
     ...
    }
}
