How To Parse XML With Invalid Characters in Node Name?

Question

So I'm trying to parse some XML, the creation of which is not under my control. The trouble is, they've somehow got nodes that look like this:

<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(MORNINGSTAR) />
<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(QUARTERSTAFF) />
<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(SCYTHE) />
<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(TRATNYR) />
<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(TRIPLE-HEADED_FLAIL) />
<ID_INTERNAL_FEAT_FOCUSED_EXPERTISE_(WARAXE) />

Visual Studio and .NET both feel that the '(' and ')' characters, as used above, are totally invalid. Unfortunately, I need to process these files! Is there any way to get the XmlReader classes to not freak out at seeing these characters, or to dynamically escape them or something? I could do some sort of pre-processing on the whole file, but I DO want the '(' and ')' characters if they appear inside the node in some valid way, so I don't want to just remove them all...

Recommended answer

That simply isn't valid XML. Pre-processing is your best bet, perhaps with a regex, something like:

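// Needs: using System.Text.RegularExpressions;
// Strips the parentheses from tag names, e.g. <FOO_(BAR) /> becomes <FOO_BAR />.
// Only opening and self-closing tags match; a closing tag such as </FOO_(BAR)> does not.
string output = Regex.Replace(input, @"(<\w+)\((\w+)\)([ >/])", "$1$2$3");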

Edit: a bit more complex, to also replace the "-" inside the brackets:

string output = Regex.Replace(input, @"(<\w+)\(([-\w]+)\)([ >/])",
    delegate(Match match) {
        return match.Groups[1].Value + match.Groups[2].Value.Replace('-', '_')
             + match.Groups[3].Value;
    });
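
If the original names matter later, a reversible variant of the same pre-processing idea may be worth considering. The sketch below is not the answerer's method: it encodes the invalid characters with XmlConvert.EncodeName (so '(' becomes _x0028_) instead of deleting them, parses the result, and recovers the original name with XmlConvert.DecodeName. The class and method names (TagNameSanitizer, EncodeTagNames) and the sample input are made up for illustration. It assumes that a '<' followed by a letter or underscore always starts a tag, which would not hold inside CDATA sections or comments. The '-' character is already legal in XML names after the first character, so it is left alone here.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class TagNameSanitizer
{
    // "<" or "</" followed by a tag name; the name runs up to the first
    // whitespace, "/", "<" or ">". Text content is never touched.
    private static readonly Regex TagName =
        new Regex(@"(</?)([A-Za-z_][^\s/<>]*)");

    // Rewrites every tag name into a valid XML name, escaping characters
    // such as '(' and ')' as _xHHHH_ sequences.
    public static string EncodeTagNames(string rawXml)
    {
        return TagName.Replace(rawXml,
            m => m.Groups[1].Value + XmlConvert.EncodeName(m.Groups[2].Value));
    }

    public static void Main()
    {
        // Assumed sample input, modelled on the nodes in the question.
        string raw =
            "<root>" +
            "<ID_FEAT_(SCYTHE) />" +
            "<ID_FEAT_(TRIPLE-HEADED_FLAIL)>x (y)</ID_FEAT_(TRIPLE-HEADED_FLAIL)>" +
            "</root>";

        XDocument doc = XDocument.Parse(EncodeTagNames(raw));

        foreach (XElement el in doc.Root.Elements())
        {
            // DecodeName turns the _xHHHH_ escapes back into the original characters.
            string originalName = XmlConvert.DecodeName(el.Name.LocalName);
            Console.WriteLine(originalName + " = \"" + el.Value + "\"");
        }
    }
}
```

Because only tag names are rewritten, parentheses that appear in element text (the "x (y)" above) pass through unchanged, which is the behaviour the question asks for. Closing tags are handled by the optional "/" in the pattern.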
