C# XPathDocument: parsing a string with a BOM to XML


Question

In my C# code, I am parsing a string into XML using XPathDocument.

The string is retrieved from SDL Trados Studio. Depending on the XML being worked on (how it was originally created and loaded for translation), the string sometimes has a BOM and sometimes does not.

The 'xml' is actually parsed from the segments of the source and target text and the structure elements. The textual elements are escaped for XML, and the markup and text are joined into one string. So if the markup has a BOM in the XLIFF, the string will have a BOM as well.

I am trying to parse any of these XML documents, independent of encoding. So at this point my solution is to remove the BOM with Substring.

Here is my code:

//Recreate XML files (extractor returns two string arrays)
string strSourceXML = String.Join("", extractor.TextSrc);
string strTargetXML = String.Join("", extractor.TextTgt);

//strip BOM
strSourceXML = strSourceXML.Substring(strSourceXML.IndexOf("<?"));
strTargetXML = strTargetXML.Substring(strTargetXML.IndexOf("<?"));

//Transform XML with the preview XSL
var xSourceDoc = new XPathDocument(strSourceXML);
var xTargetDoc = new XPathDocument(strTargetXML);

I have searched several articles for a better solution, such as the ones below, but have not found one yet:

Parsing XML using C#

Parsing complex XML with C#

Parse: convert string to XML

XmlReader breaks on UTF-8 BOM

Any advice on how to solve this more elegantly?

Recommended answer

The XPathDocument constructor that takes a String argument (https://msdn.microsoft.com/en-us/library/te0h7f95%28v=vs.110%29.aspx) expects a URI pointing to the location of the XML file, not the XML markup itself. If you have a string containing XML markup, use a StringReader over that string, e.g.

XPathDocument xSourceDoc;
using (TextReader tr = new StringReader(strSourceXML))
{
  xSourceDoc = new XPathDocument(tr);
}
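
To tie this back to the BOM in the question, here is a minimal, self-contained sketch of how the StringReader approach could be wrapped in a helper. The names XmlStringLoader and Load are hypothetical and not part of the original answer. It also assumes that the BOM, if present, survives in the string as a leading U+FEFF character; the question does not show the actual characters, so treat that as an assumption. In that case the parser may reject the string, so the sketch trims the character before parsing.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class XmlStringLoader
{
    // Hypothetical helper (not from the original answer): loads XML markup
    // held in a string, rather than treating the string as a file URI.
    public static XPathDocument Load(string xml)
    {
        // Assumption: a BOM, if present, shows up as a leading U+FEFF character.
        string cleaned = xml.TrimStart('\uFEFF');

        // XPathDocument reads the whole input in its constructor,
        // so the reader can be disposed right afterwards.
        using (TextReader reader = new StringReader(cleaned))
        {
            return new XPathDocument(reader);
        }
    }
}

static class Program
{
    static void Main()
    {
        // Sample input invented for illustration only.
        string sample = "\uFEFF<?xml version=\"1.0\"?><root><item>text</item></root>";

        XPathDocument doc = XmlStringLoader.Load(sample);
        XPathNavigator nav = doc.CreateNavigator();

        // Print the value of the single <item> element.
        Console.WriteLine(nav.SelectSingleNode("/root/item").Value);
    }
}
```

Unlike the Substring(IndexOf("<?")) approach, this does not depend on the string containing an XML declaration. If the BOM arrives in some other form (for example, as mis-decoded bytes), this trim will not remove it and the decoding step itself would need to be fixed.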
