What happens when a string is converted to a byte array

Question

I think this is a newbie-type question, but I have not quite understood this.

I can find many posts on how to convert a string to a byte array in various languages.

What I do not understand is what is happening on a character-by-character basis. I understand that each character displayed on the screen is represented by a number, such as its ASCII code. (Can we stick with ASCII for the moment so I get this conceptually? :-))

Does this mean that when I want to represent a character or a string (which is a list of characters), the following occurs:

Convert character to ASCII value > represent ASCII value as binary?

I have seen code that creates byte arrays by defining the byte array as 1/2 the length of the input string, yet surely a byte array would be the same length as the string?

So I am a little confused. Basically, I am trying to store a string value in a byte array in ColdFusion, which, as far as I can see, has no explicit string-to-byte-array function.

However, I can get to the underlying Java, but I need to know what's happening at the theoretical level.

Thanks in advance, and please tell me nicely if you think I am barking mad!

Gus

Recommended answer

In Java, strings are stored as an array of 16-bit char values. Each Unicode character in the string is stored as one or (rarely) two char values in the array.
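
To make the "one or (rarely) two char values" point concrete, here is a minimal sketch. The class and method names (CharUnits, charCount, codePointCount) are made up for illustration; String.length(), String.codePointCount() and Character.toChars() are standard library calls.

```java
public class CharUnits {
    // Number of 16-bit char values the string exposes.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode characters (code points) in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "Gus";
        // U+1F600 lies outside the 16-bit range, so it needs two char values.
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(charCount(ascii) + " chars, " + codePointCount(ascii) + " code points");
        System.out.println(charCount(emoji) + " chars, " + codePointCount(emoji) + " code points");
    }
}
```

For the ASCII string the two counts agree (3 and 3); for the single emoji character they differ (2 char values, 1 code point).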

If you want to store some string data in a byte array, you will need to be able to convert the string's Unicode characters into a sequence of bytes. This process is called encoding and there are several ways to do it, each with different rules and results. If two pieces of code want to share string data using byte arrays, they need to agree on which encoding is being used.
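
As a rough illustration of "different rules and results", the sketch below encodes the same four-character string with three encodings and compares the byte counts. The class and method names (EncodingSizes, encodedLength) are hypothetical, and the sample string is my own choice; the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Length in bytes of s once encoded with the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9 (e with acute accent): four characters.
        String s = "caf\u00e9";

        // ISO-8859-1 uses one byte per character here.
        System.out.println("ISO-8859-1: " + encodedLength(s, StandardCharsets.ISO_8859_1));
        // UTF-8 uses one byte for each ASCII letter and two for U+00E9.
        System.out.println("UTF-8:      " + encodedLength(s, StandardCharsets.UTF_8));
        // UTF-16BE uses two bytes per character here (no byte order mark).
        System.out.println("UTF-16BE:   " + encodedLength(s, StandardCharsets.UTF_16BE));
    }
}
```

The three lengths come out as 4, 5 and 8 bytes, which is why the byte array is not necessarily the same length as the string.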

For example, suppose we have a string s that we want to encode using the UTF-8 encoding. UTF-8 has the convenient property that if you use it to encode a string that contains only ASCII characters, every character in the input gets converted to a single byte with that character's ASCII value. We might convert our Java string to a Java byte array as follows:

byte[] bytes = s.getBytes("UTF-8");

The byte array bytes now contains the string data from s, encoded into bytes using the UTF-8 encoding.
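
To see the "character > ASCII value > binary" steps from the question for an ASCII-only string, something like the following sketch may help. The names AsciiBytes, encode and toBinary are hypothetical, and it passes StandardCharsets.UTF_8 instead of the name "UTF-8", which selects the same encoding but avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiBytes {
    // Encode the string to bytes using UTF-8.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Render one byte as eight binary digits, padded with leading zeros.
    static String toBinary(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte[] bytes = encode("Gus");
        System.out.println(Arrays.toString(bytes)); // prints [71, 117, 115]
        for (byte b : bytes) {
            System.out.println(b + " = " + toBinary(b));
        }
    }
}
```

Each of the three characters becomes exactly one byte holding its ASCII value (G is 71, u is 117, s is 115), so for pure ASCII input the byte array has the same length as the string.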

Now, we store or transmit the bytes somewhere, and the code on the other end wants to decode the bytes back into a Java String. It will do something like the following:

String t = new String(bytes, "UTF-8");

Assuming nothing went wrong, the string t now contains the same string data as the original string s.
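
Putting the encode and decode steps together, a small self-contained round trip might look like the sketch below. RoundTrip and roundTrip are illustrative names only, and the sample strings are my own.

```java
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    // Encode to UTF-8 bytes, then decode those bytes back into a String.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "Hello, Gus";
        String t = roundTrip(s);
        System.out.println(s.equals(t)); // prints true

        // The same holds for text beyond ASCII, since UTF-8 covers all of Unicode.
        String accented = "caf\u00e9";
        System.out.println(accented.equals(roundTrip(accented))); // prints true
    }
}
```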

Note that both pieces of code had to agree on what encoding was being used. If they disagreed, the resulting string might end up containing garbage, or might even fail to decode at all.
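
Here is a hedged sketch of both failure modes, using a sample string of my own. The names EncodingMismatch, decodeWrongly and strictUtf8DecodeFails are hypothetical. One caveat: new String(bytes, charset) does not throw on malformed input but substitutes replacement characters, so the sketch uses a CharsetDecoder (which reports errors by default) to show an outright decoding failure.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Encode with UTF-8 but decode with ISO-8859-1: succeeds, but yields garbage.
    static String decodeWrongly(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Returns true if the bytes are not valid UTF-8 when decoded strictly.
    static boolean strictUtf8DecodeFails(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";

        // The two UTF-8 bytes of U+00E9 are read as two separate characters.
        String garbled = decodeWrongly(s);
        System.out.println(garbled.length()); // prints 5, not 4

        // The ISO-8859-1 byte 0xE9 is not a complete UTF-8 sequence.
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(strictUtf8DecodeFails(latin1)); // prints true
    }
}
```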
