Add control characters to a string


Question

I am trying to add 3 control characters to a string. The following is the code:

String ackMessage = splitted[0];
String[] msh = ackMessage.Split('|');
String controlID = msh[9];
StringBuilder sb = new StringBuilder();
sb.Append((byte) 0x0b);
sb.AppendLine(ackMessage);
sb.Append("MSA|AA|" + controlID);
sb.Append((byte) 0x1c);
sb.Append((byte) 0x0d);
string myMessage = sb.ToString();




I want the sb.Append byte calls to add the actual ASCII control characters. Can someone tell me how to do this in C#?

Thank you.

Recommended answer

Truly, if you want to deal with ASCII you are better off not using a string at all - use a byte array instead. That way there are no awkward conversions either to or from Unicode. If you start including ASCII values in your string, they will be converted to the Unicode equivalent first, and that may cause more problems than it fixes.
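
As a rough illustration of the byte-array approach, the sketch below frames a payload with the three control bytes from the question without ever placing them in a string. The class name ControlFrame, the method Wrap, and the sample payload are hypothetical names chosen for this example; it also assumes the payload contains only ASCII characters.

```csharp
using System;
using System.IO;
using System.Text;

static class ControlFrame
{
    // The three control bytes used in the question's code.
    const byte StartByte = 0x0b;
    const byte EndByte = 0x1c;
    const byte CarriageReturn = 0x0d;

    // Encodes the payload as ASCII and surrounds it with the control bytes.
    // Assumption: the payload is pure ASCII; Encoding.ASCII replaces
    // anything else with '?'.
    public static byte[] Wrap(string payload)
    {
        byte[] body = Encoding.ASCII.GetBytes(payload);
        using (var ms = new MemoryStream(body.Length + 3))
        {
            ms.WriteByte(StartByte);
            ms.Write(body, 0, body.Length);
            ms.WriteByte(EndByte);
            ms.WriteByte(CarriageReturn);
            return ms.ToArray();
        }
    }

    static void Main()
    {
        byte[] frame = Wrap("MSA|AA|123");       // sample payload, made up
        Console.WriteLine(BitConverter.ToString(frame));
    }
}
```

The resulting array can be written directly to a stream or socket, so no string-to-byte conversion of the control values is ever needed.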


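I don't see a problem with the overall approach, with one catch: sb.Append((byte) 0x0b) picks the Append(byte) overload, which appends the number as decimal text ("11"), not the control character; cast to char instead. Beyond that, I doubt that you need any byte values in a string at all. If they are bytes, keep them in an array of byte, not a string. Also, you should understand what is going on, to avoid surprises.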
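
One detail is worth checking before relying on the question's code as written: StringBuilder.Append has separate overloads for byte and char, and the byte overload appends the number's decimal text. The sketch below compares the two casts; the class and method names are made up for the example.

```csharp
using System;
using System.Text;

static class AppendDemo
{
    // Append(byte) formats the number as text, so 0x0b becomes "11".
    public static string WithByteCast()
    {
        var sb = new StringBuilder();
        sb.Append((byte)0x0b);
        return sb.ToString();
    }

    // Append(char) appends the single character U+000B itself.
    public static string WithCharCast()
    {
        var sb = new StringBuilder();
        sb.Append((char)0x0b);
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WithByteCast().Length);   // prints 2
        Console.WriteLine(WithCharCast().Length);   // prints 1
        Console.WriteLine((int)WithCharCast()[0]);  // prints 11
    }
}
```

If the control characters do have to live in a string, casting to char (or writing '\u000b', '\u001c' and '\r') gives the intended result.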

Characters and strings are not bytes. Internally, they are Unicode characters encoded as UTF-16. This has nothing to do with ASCII as an encoding, but because ASCII is a subset of Unicode, 0x1C is actually represented as the UTF-16 code unit 0x001C. UTF-16 represents all characters in the BMP (Basic Multilingual Plane) as single 16-bit words, and characters beyond the BMP as pairs of 16-bit words called surrogate pairs; the words of each pair come from special ranges of Unicode code points reserved for this purpose. The other UTFs come into play only when you serialize text into arrays of byte using the classes based on System.Text.Encoding; please see:
http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx
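
The points about UTF-16 can be checked with a few lines of C#. This is only a sketch: the class name, the helper CodeUnits, and the choice of U+1D11E as a sample character outside the BMP are assumptions made for the example.

```csharp
using System;

static class Utf16Demo
{
    // Number of 16-bit code units a string uses for one code point.
    public static int CodeUnits(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint).Length;
    }

    static void Main()
    {
        char fs = (char)0x1C;
        Console.WriteLine(sizeof(char));                 // prints 2 (bytes per code unit)
        Console.WriteLine(((int)fs).ToString("X4"));     // prints 001C

        // U+1D11E lies beyond the BMP, so it needs a surrogate pair.
        string beyondBmp = char.ConvertFromUtf32(0x1D11E);
        Console.WriteLine(beyondBmp.Length);                    // prints 2
        Console.WriteLine(char.IsHighSurrogate(beyondBmp[0]));  // prints True
        Console.WriteLine(char.IsLowSurrogate(beyondBmp[1]));   // prints True
    }
}
```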

In particular, UTF-8 is a byte-oriented encoding which takes a variable number of bytes to represent a single character. However, if all of your text is composed of characters from the ASCII subset of Unicode, UTF-8 encoding will produce an array of bytes identical to the ASCII one, strictly one byte per character. You can use it in your code.
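
A minimal sketch of that equivalence, assuming a made-up sample message and hypothetical names (EncodingDemo, SameBytes): for ASCII-only text the UTF-8 and ASCII encoders yield the same bytes, while a single non-ASCII character makes them differ.

```csharp
using System;
using System.Linq;
using System.Text;

static class EncodingDemo
{
    // True when UTF-8 and ASCII serialization of the text give identical bytes.
    public static bool SameBytes(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        return utf8.SequenceEqual(ascii);
    }

    static void Main()
    {
        // Control characters added as chars, then serialized to bytes.
        string framed = (char)0x0b + "MSA|AA|123" + (char)0x1c + (char)0x0d;

        Console.WriteLine(Encoding.UTF8.GetBytes(framed).Length); // prints 13
        Console.WriteLine(SameBytes(framed));                     // prints True
        Console.WriteLine(SameBytes("caf\u00e9"));                // prints False
    }
}
```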

You should clearly understand that Unicode is not a 16-bit or 32-bit code. It standardizes a one-to-one correspondence between characters, understood as cultural entities regardless of their graphical representations, fonts, and so on, and integers, understood in their abstract mathematical meaning regardless of their computer representation, bit size, endianness, or anything like that. All the details of computer representation are defined by the UTFs.

Please see:
http://unicode.org/
http://unicode.org/faq/utf_bom.html

—SA

