C#: Detect XML encoding from a byte array?


Question

I have a byte array that I know contains an XML-serialized object. Is there any way to get the encoding from it?

I'm not going to deserialize it, but I'm saving it in an XML field on a SQL Server, so I need to convert it to a string?

Recommended answer

You could look at the first 40-ish bytes [1]. They should contain the XML declaration (assuming the document has one), which should either state the encoding or let you assume UTF-8 or UTF-16. Which of the two it is should be obvious from how the bytes of the <?xml part are laid out. (Just check for both patterns.)
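The pattern check described above could look roughly like the sketch below. The class and method names (XmlEncodingSniffer, DetectUtf8OrUtf16) are made up for illustration, and the sketch assumes the data is either UTF-8 or UTF-16. It does not read an encoding="..." attribute, and it does not tell a UTF-32 byte order mark apart from a UTF-16 one.

```csharp
using System;
using System.Text;

public static class XmlEncodingSniffer
{
    // Hypothetical helper: guesses UTF-8 vs UTF-16 from the leading bytes only.
    // Throws if none of the expected patterns is found.
    public static Encoding DetectUtf8OrUtf16(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        // Byte order marks, if present, settle the question first.
        if (StartsWith(data, 0xEF, 0xBB, 0xBF)) return Encoding.UTF8;
        if (StartsWith(data, 0xFF, 0xFE)) return Encoding.Unicode;          // UTF-16 LE
        if (StartsWith(data, 0xFE, 0xFF)) return Encoding.BigEndianUnicode; // UTF-16 BE

        // No BOM: look at how the "<?" of "<?xml" is laid out in bytes.
        if (StartsWith(data, 0x3C, 0x00, 0x3F, 0x00)) return Encoding.Unicode;
        if (StartsWith(data, 0x00, 0x3C, 0x00, 0x3F)) return Encoding.BigEndianUnicode;
        if (StartsWith(data, 0x3C, 0x3F, 0x78, 0x6D)) return Encoding.UTF8; // "<?xm"

        throw new InvalidOperationException(
            "Data does not start with a recognizable UTF-8 or UTF-16 XML declaration.");
    }

    // True when data begins with exactly the given bytes.
    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Once an encoding is returned, something like XmlEncodingSniffer.DetectUtf8OrUtf16(bytes).GetString(bytes) gives the string. Note that GetString keeps a leading byte order mark as the character U+FEFF, so it may need trimming before the string is stored.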

Realistically, do you expect you'll ever get anything other than UTF-8 or UTF-16? If not, you could check for the patterns you get at the start of both of those and throw an exception if it doesn't follow either pattern. Alternatively, if you want to make another attempt, you could always try to decode the document as UTF-8, re-encode it and see if you get the same bytes back. It's not ideal, but it might just work.
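The decode-and-re-encode idea could be sketched as follows. The names Utf8RoundTrip and LooksLikeUtf8 are invented for this example. The sketch relies on Encoding.UTF8 substituting U+FFFD for invalid byte sequences, so that invalid input no longer matches after the round trip.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Hypothetical helper: true when decoding as UTF-8 and re-encoding
    // reproduces exactly the same bytes.
    public static bool LooksLikeUtf8(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        // Invalid sequences are decoded as U+FFFD, which re-encodes to
        // EF BF BD and therefore no longer matches the input.
        string decoded = Encoding.UTF8.GetString(data);
        byte[] reencoded = Encoding.UTF8.GetBytes(decoded);

        if (reencoded.Length != data.Length) return false;
        for (int i = 0; i < data.Length; i++)
        {
            if (reencoded[i] != data[i]) return false;
        }
        return true;
    }
}
```

This shows why the approach is not ideal. UTF-16 text without a byte order mark that contains only ASCII characters is also a valid UTF-8 byte sequence (the 0x00 bytes decode as NUL characters), so it passes this test. The round trip is probably best used together with a check of the leading bytes rather than on its own.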

I'm sure there are more rigorous ways of doing this, but they're likely to be finicky :)

[1] Quite possibly fewer than this. I figure 20 characters should be enough, which is 40 bytes in UTF-16.
