C# big-endian UCS-2

Question

The project I'm currently working on needs to interface with a client system that we don't make, so we have no control over how data is sent either way. The problem is that we're working in C#, which doesn't seem to have any support for UCS-2 and has very little support for big-endian data (as far as I can tell).

What I would like to know is whether there's anything I overlooked in .NET, or something that someone else has made and released that we can use. If not, I will take a crack at encoding/decoding it in a custom method, if that's even possible.

But thanks for your time either way.

EDIT: BigEndianUnicode does work to correctly decode the string; the problem was in receiving other data as big-endian. So far, using IPAddress.HostToNetworkOrder() as suggested elsewhere has allowed me to decode half of the string ("Merli?" is what comes up, and it should be "Merlin33069").

I'm combing the short code to see if there's another length variable I missed.

RESOLUTION: After working out that the big-endian variables were the main problem, I went back through and reviewed the details, and it seems that the length of each string was sent as a character count, not a byte count (in UTF-16 it would seem a char is two bytes). All I needed to do was double it, and it worked out. Thank you all for your help.
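
A minimal sketch of the fix described in the resolution. It assumes a hypothetical wire layout in which each string is preceded by a 16-bit big-endian character count; the real protocol layout is not given in this thread, and the names `WireStrings` and `ReadString` are made up for illustration.

```csharp
using System;
using System.Text;

static class WireStrings
{
    // Reads a string that is prefixed by a big-endian 16-bit CHARACTER count.
    // The 16-bit prefix is an assumption; adjust it to the actual protocol.
    public static string ReadString(byte[] buffer, int offset)
    {
        // Assemble the big-endian length by hand: high byte first.
        int charCount = (buffer[offset] << 8) | buffer[offset + 1];

        // Each UCS-2 code unit occupies two bytes, so double the count.
        int byteCount = charCount * 2;

        // Decode the payload that follows the two-byte prefix.
        return Encoding.BigEndianUnicode.GetString(buffer, offset + 2, byteCount);
    }
}
```

For "Merlin33069" the count is 11. Treating that as a byte count would cover only five and a half characters, which is consistent with the truncated "Merli?" result described above.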

Recommended answer

EDIT: Now we know that the problem isn't in the encoding of the text data but in the encoding of the length. There are a few options:

  • Reverse the bytes and then use the built-in BitConverter code (which I assume is what you're using now; that or BinaryReader)
  • Perform the conversion yourself using repeated "add and shift" operations
  • Use my EndianBitConverter or EndianBinaryReader classes from MiscUtil, which are like BitConverter and BinaryReader, but let you specify the endianness.
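
As a rough illustration of the first two options, the sketch below reads a 32-bit big-endian integer both ways. The class and method names are hypothetical, and the 32-bit width is only an example; the same pattern applies to 16-bit and 64-bit values.

```csharp
using System;

static class BigEndianInts
{
    // Option 1: reverse a copy of the bytes on little-endian hosts,
    // then let the built-in BitConverter do the conversion.
    public static int ViaReverse(byte[] data, int offset)
    {
        byte[] copy = new byte[4];
        Array.Copy(data, offset, copy, 0, 4);
        if (BitConverter.IsLittleEndian)
        {
            Array.Reverse(copy);
        }
        return BitConverter.ToInt32(copy, 0);
    }

    // Option 2: build the value manually with shifts, most significant byte first.
    public static int ViaShifts(byte[] data, int offset)
    {
        return (data[offset] << 24)
             | (data[offset + 1] << 16)
             | (data[offset + 2] << 8)
             | data[offset + 3];
    }
}
```

The shift version does not depend on the host's byte order, which makes it the easier of the two to reason about. Newer .NET versions also ship `System.Buffers.Binary.BinaryPrimitives` (for example `ReadInt32BigEndian`), which may be worth checking if it is available on the target framework.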

You may be looking for Encoding.BigEndianUnicode. That's the big-endian UTF-16 encoding, which isn't strictly speaking the same as UCS-2 (as pointed out by Marc) but should be fine unless you give it strings including characters outside the BMP (i.e. above U+FFFF), which can't be represented in UCS-2 but are represented in UTF-16.
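
A small sketch of what that looks like in practice, using only `Encoding.BigEndianUnicode` from `System.Text`. The wrapper class `BigEndianText` is a hypothetical name used here for illustration.

```csharp
using System;
using System.Text;

static class BigEndianText
{
    // Encodes text as big-endian UTF-16 (two bytes per BMP character, high byte first).
    public static byte[] Encode(string text)
    {
        return Encoding.BigEndianUnicode.GetBytes(text);
    }

    // Decodes big-endian UTF-16 bytes back into a .NET string.
    public static string Decode(byte[] bytes)
    {
        return Encoding.BigEndianUnicode.GetString(bytes);
    }
}
```

For example, `BigEndianText.Encode("A")` yields the two bytes 0x00 0x41, while a character outside the BMP such as U+1F600 yields four bytes (a surrogate pair), which is where UTF-16 and UCS-2 diverge.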

From the Wikipedia page:

The older UCS-2 (2-byte Universal Character Set) is a similar character encoding that was superseded by UTF-16 in version 2.0 of the Unicode standard in July 1996. It produces a fixed-length format by simply using the code point as the 16-bit code unit and produces exactly the same result as UTF-16 for 96.9% of all the code points in the range 0-0xFFFF, including all characters that had been assigned a value at that time.

I find it highly unlikely that the client system is sending you characters where there's a difference (which is basically the surrogate pairs, which are permanently reserved for that use anyway).
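
If you want to verify that assumption rather than rely on it, one possible check is to look for surrogate code units before sending or after decoding. This is only a sketch; `Ucs2Check` and `IsUcs2Safe` are hypothetical names.

```csharp
using System;

static class Ucs2Check
{
    // Returns true when the text contains no UTF-16 surrogate code units,
    // meaning every character fits in a single 16-bit UCS-2 unit.
    public static bool IsUcs2Safe(string text)
    {
        foreach (char c in text)
        {
            if (char.IsSurrogate(c))
            {
                return false;
            }
        }
        return true;
    }
}
```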
