How to replace &nbsp; with &#160; in an HTML file


Question

I want to replace every &nbsp; with &#160; in my HTML file so that it can be read by an XML parser. I don't want to replace them directly, though; I'd like to declare the entity in the <!DOCTYPE> instead, like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"[<!ENTITY nbsp "&#160;">]> <html><head></head><body><div>Hello&nbsp;World!</div></body></html>

But when I view the file, an extra ]> appears at the top of the document.

Does anyone know how to handle this?

Thanks!

Recommended answer

What you have is a valid way to include an entity declaration in an internal subset. The document is not otherwise valid, though, as you can check with the W3C Markup Validator: the required xmlns attribute on the html element is missing, and so is the required title element.

When served as text/html, the document is processed the way browsers process HTML documents, which means, among other things, that internal subsets are not recognized. In fact, document type definitions are not read at all; instead, doctype declarations are just treated as magic strings, so that some strings trigger "quirks mode" and others do not. The doctype declaration is parsed in a simplistic manner, which makes the first ">" terminate it, so whatever comes after it is taken as character data. In your document the first ">" is the one that closes the entity declaration, which is why the leftover "]>" shows up as text at the top of the page.
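The effect can be reproduced outside a browser. The sketch below uses Python's standard `html.parser`, which is not a browser engine but, as far as its current implementation goes, also ends a doctype at the first ">". The `Collector` class and the `DOC` constant are hypothetical names introduced only for this illustration.

```python
from html.parser import HTMLParser

# The document from the question: a doctype with an internal subset.
DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"'
    '[<!ENTITY nbsp "&#160;">]> '
    '<html><head></head><body><div>Hello&nbsp;World!</div></body></html>'
)


class Collector(HTMLParser):
    """Hypothetical helper: records the doctype and all character data."""

    def __init__(self):
        super().__init__()
        self.decl = None
        self.chunks = []

    def handle_decl(self, decl):
        # Called with whatever the parser considers the doctype declaration.
        self.decl = decl

    def handle_data(self, data):
        # Called with text content, including anything left over
        # after the point where the parser decided the doctype ended.
        self.chunks.append(data)


collector = Collector()
collector.feed(DOC)
collector.close()

text = "".join(collector.chunks)
print(repr(collector.decl))  # the doctype as the HTML parser saw it
print(repr(text))            # the character data, including the stray "]>"
```

The recorded doctype stops inside the entity declaration, and the closing "]>" of the internal subset ends up in the character data, which mirrors what the browser displays. The &nbsp; reference is still resolved, but only because HTML parsers know that entity name natively, not because of the declaration.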

The moral is that entity declarations just don't work with "HTML", whether in an internal or an external subset, when "HTML" means sending something to a browser and telling it (in HTTP headers) that the content is text/html, and that is what servers normally do when they send .html files.

Served as application/xhtml+xml and fixed to conform to XHTML syntax, your approach works in conforming browsers (online demo: http://www.cs.tut.fi/~jkorpela/test/nbsp.xhtml):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
 [<!ENTITY nbsp "&#160;">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Entity demo</title></head>
<body>
  <div>Hello&nbsp;World!</div>
</body>
</html>

However, IE 8 and earlier do not process XHTML served as application/xhtml+xml (the browser just launches a "Save As" dialog).
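If the goal is a non-browser XML parser rather than a browser, the same XHTML document can be checked directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, assuming its default expat-based parser, which reads the internal subset but does not fetch the external DTD. The names `XHTML_DOC`, `div_text` and `undeclared_ok` are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# The corrected XHTML document, with nbsp declared in the internal subset.
XHTML_DOC = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
 [<!ENTITY nbsp "&#160;">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Entity demo</title></head>
<body>
  <div>Hello&nbsp;World!</div>
</body>
</html>"""

# With the declaration in place, the XML parser expands &nbsp; itself.
root = ET.fromstring(XHTML_DOC)
div_text = root.find(f".//{XHTML_NS}div").text
print(repr(div_text))

# Without any declaration, the same reference is a parse error in XML.
try:
    ET.fromstring("<div>Hello&nbsp;World!</div>")
    undeclared_ok = True
except ET.ParseError as exc:
    undeclared_ok = False
    print("undeclared entity rejected:", exc)
```

Other XML parsers may treat the external DTD reference differently (for example by trying to download it), so this only illustrates the entity declaration itself.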

The conclusions depend on what you are doing and why (and in which sense) you need to "support XML parser". It's not really about parsing but about entity declarations. XHTML user agents are not required to understand the predefined entities of HTML (other than the few defined in XML itself), but has that possibility actually materialized in the software you need to support? In general, it is better to convert &nbsp; to actual no-break space characters than to character references.
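Converting to the actual character could look like the sketch below. It assumes the file is stored in an encoding that can represent U+00A0 (UTF-8 here), and it uses a plain string replacement, which would also rewrite any "&nbsp;" that appears inside comments or CDATA sections. `replace_nbsp_entities` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # NO-BREAK SPACE, the character that &nbsp; and &#160; denote


def replace_nbsp_entities(markup: str) -> str:
    """Replace each &nbsp; reference with the no-break space character."""
    return markup.replace("&nbsp;", NBSP)


source = "<div>Hello&nbsp;World!</div>"
converted = replace_nbsp_entities(source)

# The result no longer needs any entity declaration to be well-formed XML.
element = ET.fromstring(converted)
print(repr(element.text))

# When writing the file back, encode it explicitly, for example as UTF-8.
encoded = converted.encode("utf-8")
print(encoded)
```

The converted markup parses the same way under both HTML and XML rules, since it contains no named entity at all.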
