Preventing the UTF-8 Parser from converting an entity?


Problem description

Hello all,

I'm having a little problem: the UTF-8 parser we are using converts the
newline entity within an attribute that we are using to palliate
CSS limitations.

After the parser has gone through the document, the entity is converted
to \n, which then effectively tosses out the window the behavior we are
getting by keeping the entity AS IS within the document.

Is there a clean and easy way around this?

Any help will be greatly appreciated.

Regards
Jean-Francois Michaud

Solutions

* Jean-François Michaud wrote in comp.text.xml:

> I'm having a little problem: the UTF-8 parser we are using converts the
> newline entity within an attribute that we are using to palliate
> CSS limitations.

I don't understand your question. First, what you describe is not an entity
but a numeric character reference. Second, processing those is independent of
character encodings like UTF-8. Third, I don't see what CSS limitation
you might be referring to here.

> After the parser has gone through the document, the entity is converted
> to \n, which then effectively tosses out the window the behavior we are
> getting by keeping the entity AS IS within the document.

What is "\n" here? What do you mean by "converted"? What do you mean by
keeping it? Processing white-space characters and character references
to them in attribute values is explained in the XML specification. XML
processors keep them to the extent that they are significant. If you
connect the processor to a serializer, the input and output documents
will be canonically equivalent unless one of them has a bug. So there
should be no issue here.
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
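
The two points made above (the reference is resolved by the parser, and this has nothing to do with the document encoding) can be checked with a small sketch. It assumes the reference in question was `&#10;` and uses Python's standard `xml.etree.ElementTree`; the element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; "item" and "label" are illustrative names only.
DOC = '<?xml version="1.0" encoding="{enc}"?><item label="first&#10;second"/>'


def parsed_label(encoding: str) -> str:
    """Encode the document in the given encoding, parse it, return the attribute."""
    data = DOC.format(enc=encoding).encode(encoding)
    return ET.fromstring(data).get("label")


# The character reference is resolved to a real newline in both cases,
# so the result does not depend on UTF-8 versus another encoding.
utf8_value = parsed_label("utf-8")
latin1_value = parsed_label("iso-8859-1")

# A literal newline typed inside the attribute is treated differently:
# attribute-value normalization turns it into a space.
literal_value = ET.fromstring('<item label="first\nsecond"/>').get("label")

print(repr(utf8_value), repr(latin1_value), repr(literal_value))
```

The last line shows why the reference matters in the source markup: it is the only way to get a real newline character into an attribute value, but after parsing only the character remains, not the reference.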




Jean-François Michaud wrote:

> I'm having a little problem: the UTF-8 parser we are using converts the
> newline entity within an attribute that we are using to palliate
> CSS limitations.

That is neither an entity nor an entity reference but a numeric
character reference.
What is a "UTF-8 parser"?

> After the parser has gone through the document, the entity is converted
> to \n, which then effectively tosses out the window the behavior we are
> getting by keeping the entity AS IS within the document.

It is not clear what kind of tool you use and what you ultimately produce,
but if you want to serialize a DOM or an XSLT result tree to XML markup
and want that newline character to be escaped as a numeric
character reference, then you need an XML serializer that does that. If
you want to serialize such a tree to HTML markup, then you need an HTML
serializer that does that.

--

Martin Honnen
http://JavaScript.FAQTs.com/
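
As a rough illustration of the serializer point, here is a sketch using Python's standard `xml.etree.ElementTree`, whose serializer in current CPython versions happens to write a newline inside an attribute value as `&#10;`. Other serializers may behave differently, and the element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Parse a hypothetical element; the parser resolves the reference to "\n".
elem = ET.fromstring('<item label="first&#10;second"/>')

# Serialize back to markup. Whether the newline comes out as a character
# reference is entirely up to the serializer, not the parser.
serialized = ET.tostring(elem, encoding="unicode")

# Parsing the output again yields the same attribute value, because the
# serializer escaped the newline instead of writing it literally.
reparsed = ET.fromstring(serialized)

print(serialized)
print(reparsed.get("label") == elem.get("label"))
```

The exact spelling of the reference in the output (decimal or hexadecimal) is also the serializer's choice; a consumer that depends on one particular spelling is relying on something XML does not guarantee.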


In article <11**********************@e3g2000cwe.googlegroups.com>,
Jean-François Michaud <co*****@comcast.net> wrote:

> After the parser has gone through the document, the entity is converted
> to \n, which then effectively tosses out the window the behavior we are
> getting by keeping the entity AS IS within the document.

> Is there a clean and easy way around this?

Not using XML. XML applications are effectively required to treat
character references in content the same way that they treat the
characters referred to. A conforming XML parser will convert it in
the way you describe.

If you want to have something that's like a newline but is treated
differently, then a character reference is not the right approach.
That's not what they're for. Using an element such as <nl/> might be
a better solution.

-- Richard
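
A minimal sketch of the element-based approach suggested above, assuming the content can be moved out of the attribute and into element content. The `<para>` wrapper, the `render_lines` helper and the `break_marker` parameter are hypothetical names for illustration; only `<nl/>` comes from the suggestion itself.

```python
import xml.etree.ElementTree as ET


def render_lines(elem, break_marker="\n"):
    """Flatten an element's text, emitting break_marker for each <nl/> child.

    Only flat content is handled: the text inside other child elements is
    skipped, although the text that follows them is kept.
    """
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "nl":
            parts.append(break_marker)
        parts.append(child.tail or "")
    return "".join(parts)


# The line break is now explicit markup, so it survives parsing unchanged
# and the application decides what to emit for it at output time.
para = ET.fromstring("<para>first line<nl/>second line</para>")
rendered = render_lines(para, break_marker=" | ")
print(rendered)
```

Because `<nl/>` is an element rather than a character, no conforming parser will fold it into ordinary whitespace, which is the distinction the original poster was trying to preserve.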

