How to replace &nbsp; with &#160; in an HTML file
Question
I want to replace all the &nbsp; with &#160; in my HTML file to support an XML parser.
I don't want to replace them directly, though; I'd like to declare an entity in the <!DOCTYPE>, like below:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"[<!ENTITY nbsp "&#160;">]>
<html><head></head><body><div>Hello&nbsp;World!</div></body></html>
But when I view the file, there is an extra ]> at the top of the document.
Does anyone know how to handle this?
Thanks!
Answer
What you have is a valid way to include an entity declaration in an internal subset. The document is not otherwise valid, though, as you can check with the W3C Markup Validator: the required xmlns attribute on the html element is missing, and so is the required title element.
When served as text/html, the document is processed the way browsers have traditionally processed HTML documents, which means, among other things, that internal subsets are not recognized. In fact, document type definitions are not read at all; doctype declarations are just treated as magic strings, so that some strings trigger "quirks mode" and some don't. The doctype declaration is parsed in a simplistic manner, so the first ">" terminates it, and whatever comes after it is taken as character data. That is where the stray ]> at the top of your page comes from.
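The stray ]> can be reproduced with a deliberately simplified model of that behavior. This is only a sketch: the function name split_doctype_naively is made up for illustration, and it mimics just the "first > ends the doctype" rule, not a real browser tokenizer.

```python
def split_doctype_naively(markup):
    """Return (doctype, rest), ending the doctype at the first '>'.

    Simplified model for illustration only, not a real HTML tokenizer.
    """
    start = markup.index("<!DOCTYPE")
    end = markup.index(">", start)  # first '>' wins, even inside [...]
    return markup[start:end + 1], markup[end + 1:]


page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"'
    '[<!ENTITY nbsp "&#160;">]>\n'
    "<html><head></head><body><div>Hello&nbsp;World!</div></body></html>"
)

doctype, rest = split_doctype_naively(page)
# The '>' that closes the ENTITY declaration ends the doctype,
# so the text left over begins with the closing bracket sequence.
print(repr(rest[:2]))  # ']>'
```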
The moral is that entity declarations just don't work with "HTML", in the internal subset or externally, when "HTML" means sending something to a browser and telling it (in HTTP headers) that the content is text/html, and that's what servers normally say when they send .html files.
Served as application/xhtml+xml and fixed to conform to XHTML syntax, your approach works on conforming browsers (online demo: http://www.cs.tut.fi/~jkorpela/test/nbsp.xhtml):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[<!ENTITY nbsp "&#160;">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Entity demo</title></head>
<body>
<div>Hello&nbsp;World!</div>
</body>
</html>
However, IE 8 and earlier don't process the document at all when it is served as application/xhtml+xml (the browser just launches a "Save As" dialog).
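If the goal is to feed the file to an XML parser rather than a browser, the internal-subset declaration can be checked outside any browser. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree as the parser (the question does not say which parser is actually used); it relies on the parser reading the internal subset and does not depend on the external DTD being fetched:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# XHTML document with the nbsp entity declared in the internal subset.
document = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[<!ENTITY nbsp "&#160;">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Entity demo</title></head>
<body>
<div>Hello&nbsp;World!</div>
</body>
</html>"""

root = ET.fromstring(document)

# Element names are namespace-qualified because of the xmlns attribute.
div = root.find(f".//{{{XHTML_NS}}}div")

# The entity reference should have been expanded to U+00A0.
print(repr(div.text))
```

Without the declaration in the internal subset, the same parser rejects &nbsp; as an undefined entity, which is presumably the problem the question started from.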
The conclusions depend on what you are doing and why (and in which sense) you need to "support XML parser". It's not really about parsing but about entity declarations. XHTML user agents are not required to understand predefined entities the way HTML user agents do (except for those defined in XML), but has this possibility actually been realized in the software you use? In general, it is better to convert &nbsp; to actual no-break space characters than to character references.
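Converting to the actual character can be sketched in a few lines of Python. The function name nbsp_to_character is hypothetical, and the plain string replacement assumes the file is saved in an encoding that can represent U+00A0 (such as UTF-8) and that &nbsp; never appears somewhere it should stay literal, such as a comment or CDATA section.

```python
NO_BREAK_SPACE = "\u00a0"  # the character that &nbsp; and &#160; both denote


def nbsp_to_character(markup):
    """Replace every &nbsp; reference with a literal no-break space."""
    return markup.replace("&nbsp;", NO_BREAK_SPACE)


converted = nbsp_to_character("<div>Hello&nbsp;World!</div>")

# In UTF-8 the no-break space is the two bytes C2 A0.
print(converted.encode("utf-8"))
```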