Is the at-sign (@) a valid HTML/XML tag character?
Question
I'm doing some HTML stripping using regular expressions (yes, I know, never parse HTML with regexes, but I'm just stripping it, and I also unfortunately cannot use any external libraries). I'm using a regex from the Regular Expressions Cookbook, and it has worked great, except I just ran into this problem:
In the string Bob Saget <bobs@aol.com>, my regex is matching the email as a tag.
So my question is, is the @ sign a valid XML or HTML tag character? (I'm not asking whether it is valid within an attribute; I know that it is.) If it is not, I will be able to successfully exclude it in my regex.
I'm not sure where to look this up. I looked here and I think that says that in XML, the at-sign is not allowed in a tag; however, I would appreciate some concrete proof.
Recommended answer
Looking at the XML specification again:
A tag consists of:
'<' Name (S Attribute)* S? '>'
A Name consists of:
NameStartChar (NameChar)*
A NameStartChar consists of:
":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
A NameChar consists of:
NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
The @ sign is U+0040, which does not appear in either list.
So the @ sign is not valid in a NameChar or a NameStartChar, and thus not valid in a Name.
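As a rough illustration of how those productions could be turned into a regex, here is a minimal Python sketch. The names is_xml_name, strip_tags and TAG_RE are made up for this example, and the tag pattern is a simplified stand-in rather than the regex from the Regular Expressions Cookbook: it ignores comments, CDATA sections and ">" characters inside attribute values. It also covers only the XML productions quoted above, not the HTML rules.

```python
import re

# Character ranges transcribed from the NameStartChar production quoted above.
NAME_START_CHAR = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, U+00B7 and two further ranges.
NAME_CHAR = NAME_START_CHAR + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

NAME = f"[{NAME_START_CHAR}][{NAME_CHAR}]*"
NAME_RE = re.compile(NAME)

# Simplified assumption: a tag is "<" or "</", a Name, then either
# whitespace plus anything up to ">", or the closing ">" / "/>" directly.
TAG_RE = re.compile(f"</?{NAME}(?:\\s[^<>]*)?/?>")


def is_xml_name(text: str) -> bool:
    """Return True if the whole string matches the Name production."""
    return NAME_RE.fullmatch(text) is not None


def strip_tags(text: str) -> str:
    """Remove anything that looks like a tag under the simplified pattern."""
    return TAG_RE.sub("", text)


print(is_xml_name("div"))           # True
print(is_xml_name("bobs@aol.com"))  # False
print(strip_tags("<b>Bob</b> Saget <bobs@aol.com>"))  # Bob Saget <bobs@aol.com>
```

Because "@" is in neither character class, the name match stops at "bobs", the pattern then finds neither whitespace nor ">", and the email address is left alone.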