HXT ignoring HTML DTD, replacing it with XML DTD


Problem description

I'm having a bit of trouble figuring out why HXT is replacing my DTDs. First, here is the input file being parsed:

<!DOCTYPE html>
<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <h1>foo</h1>
  </body>
</html>

And this is the output that I get:

<?xml version="1.0" encoding="US-ASCII"?>
<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <h1>foo</h1>
  </body>
</html>

Finally, here is a simplified version of the arrows I'm using:

start (App src dest) = runX $
                         readDocument [ withValidate no
                                      , withSubstDTDEntities no
                                      , withParseHTML yes
                                      --, withTagSoup
                                      ]
                                      src
                         >>>
                         this
                         >>>
                         writeDocument [ withIndent yes
                                       , withSubstDTDEntities no
                                       , withOutputHTML
                                       --, withOutputEncoding "UTF-8"
                                       ]
                                       dest

I apologize for the commented-out lines; I've been toying with different combinations of config options. I just can't seem to get HXT to leave the DTD alone, even with withSubstDTDEntities no, withValidate no, etc. I am getting a warning saying that HXT is ignoring my doctype declaration, but that's the only bit of insight I have. Can anyone please lend me a hand? Thank you in advance!

Solution

You have two problems.

First, HXT only accepts one of the following three HTML doctypes:

<!DOCTYPE html 
 PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
 "DTD/xhtml1-strict.dtd">

<!DOCTYPE html
 PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "DTD/xhtml1-transitional.dtd">

<!DOCTYPE html
 PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
 "DTD/xhtml1-frameset.dtd">

Using one of these will get rid of the warning about ignoring the DTD.

Second, add the following option to writeDocument:

withAddDefaultDTD yes
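
For context, here is a rough sketch of how the pipeline from the question might look with that option added. It assumes the hxt package and replaces the asker's App type with two plain path arguments; the names convert, input.html and output.html are made up for illustration. I have not checked the exact doctype HXT emits, so treat it as a starting point rather than a verified fix.

```haskell
import Text.XML.HXT.Core

-- Hypothetical helper: reads src, writes dest.
-- Both arguments are file paths or URIs (an assumption; the question
-- wraps them in its own App type).
convert :: String -> String -> IO ()
convert src dest = do
  _ <- runX $
         readDocument [ withValidate no
                      , withSubstDTDEntities no
                      , withParseHTML yes
                      ]
                      src
         >>>
         writeDocument [ withIndent yes
                       , withOutputHTML
                       , withAddDefaultDTD yes  -- the option suggested above
                       ]
                       dest
  return ()

main :: IO ()
main = convert "input.html" "output.html"  -- placeholder file names
```

The input file would still need one of the three XHTML doctypes listed earlier for the warning to go away.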
