Is there a way to escape a CDATA end token in XML?


Question


I was wondering if there is any way to escape a CDATA end token (]]>) within a CDATA section in an XML document. Or, more generally, is there any escape sequence that can be used within CDATA? (If one exists, I guess it would probably only make sense for escaping the begin or end tokens anyway.)

Basically, can you embed a begin or end token in a CDATA section and tell the parser not to interpret it, but to treat it as just another character sequence?

You should probably just refactor your XML structure or your code if you find yourself trying to do that. I've been working with XML on a daily basis for the last 3 years or so and have never had this problem, but I was wondering whether it is possible. Just out of curiosity.

Edit:

Other than using HTML encoding...

Solution

Clearly, this question is purely academic. Fortunately, it has a very definite answer.

You cannot escape a CDATA end sequence. Production rule 20 of the XML specification is quite clear:

[20]    CData      ::=      (Char* - (Char* ']]>' Char*))

EDIT: This production rule literally means "A CData section may contain anything you want BUT the sequence ']]>'. No exception."
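As a quick illustration of that rule, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` parser. The helper name `is_well_formed` and the sample documents are made up for this example. The first `]]>` closes the section, so the naive document is left with a stray `]]>` in ordinary character data and is rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Naive attempt: the CDATA section ends at the first "]]>", and the
# leftover " b]]>" is plain character data containing a forbidden "]]>".
naive = "<doc><![CDATA[a ]]> b]]></doc>"

# Control case: the content never contains the exact sequence "]]>".
control = "<doc><![CDATA[a ]] > b]]></doc>"

print(is_well_formed(naive))
print(is_well_formed(control))
```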

EDIT2: The same section also reads:

Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using "&lt;" and "&amp;". CDATA sections cannot nest.

In other words, it's not possible to use entity references, markup or any other form of interpreted syntax. The only parsed text inside a CDATA section is ]]>, and it terminates the section.

Hence, it is not possible to escape ]]> within a CDATA section.
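To see that nothing inside a CDATA section is interpreted, the following sketch (again with Python's standard-library `xml.etree.ElementTree`; the helper `element_text` and the sample strings are assumptions for illustration) compares the same entity references inside and outside a CDATA section.

```python
import xml.etree.ElementTree as ET


def element_text(xml_text: str) -> str:
    """Parse a one-element document and return the root element's text."""
    return ET.fromstring(xml_text).text


# Inside CDATA, "&lt;", "&amp;" and "<b>" are kept as literal characters.
inside = element_text("<doc><![CDATA[1 &lt; 2 &amp; <b>]]></doc>")

# Outside CDATA, the same entity references are expanded by the parser.
outside = element_text("<doc>1 &lt; 2 &amp;</doc>")

print(repr(inside))
print(repr(outside))
```

Since entity references are passed through untouched, something like `]]&gt;` would not work as an escape either: the reader would get the literal text `]]&gt;`.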

EDIT3: The same section also reads:

2.7 CDATA Sections

[Definition: CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string "<![CDATA[" and end with the string "]]>":]

So a CDATA section may appear anywhere character data may occur, which means several adjacent CDATA sections can stand in place of a single one. That makes it possible to split the ]]> token and put its two parts in adjacent CDATA sections.

For example:

<![CDATA[Certain tokens like ]]> can be difficult and <invalid>]]> 

should be written as

<![CDATA[Certain tokens like ]]]]><![CDATA[> can be difficult and <valid>]]> 
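The splitting trick is easy to automate. The sketch below is one possible way to do it in Python; the function name `to_cdata` is hypothetical and not part of any library. Each `]]>` in the text is replaced by `]]`, a section end, a new section start, and `>`, and the result is then parsed with the standard-library `xml.etree.ElementTree` to check that it round-trips.

```python
import xml.etree.ElementTree as ET


def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" across two adjacent sections."""
    # "]]>" becomes "]]" + "]]>" (end section) + "<![CDATA[" (new section) + ">"
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


original = "Certain tokens like ]]> can be difficult and <valid>"
wrapped = to_cdata(original)

# The parser concatenates the adjacent sections back into one text node.
round_trip = ET.fromstring("<doc>" + wrapped + "</doc>").text

print(wrapped)
print(round_trip == original)
```

Note that this is a serialization trick rather than an escape: the parser still sees two separate CDATA sections, but the application receives the same character data it would get from one.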
