Why can a byte in Java I/O represent a character?

Question

Why can a byte in Java I/O represent a character?

And I see that the characters are only ASCII. So it's not dynamic, right?

Is there an explanation for this?

What is the difference between byte streams and character streams?

Recommended answer

Bytes are not characters.

In computing terms, a "character" is a pairing of a numeric code (or sequence of codes) with an encoding or character set that defines how those codes map to real-world characters (or to whitespace, or to control codes).

Bytes can represent characters only once they are paired with an encoding. For some encodings (such as ASCII or ISO-8859-1), one byte can represent one character, and many encodings are ASCII-compatible (meaning that the codes 0 through 127 have the same meaning as in ASCII). But without the original mapping, you don't know what you have.
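As a rough illustration of ASCII compatibility, the sketch below encodes the same text with three standard charsets and checks whether the resulting bytes agree. The class name AsciiCompatibility and the helper sameBytes are made up for this example; only String.getBytes, Arrays.equals, and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiCompatibility {
    // Hypothetical helper: encodes the text with US-ASCII, ISO-8859-1 and UTF-8
    // and reports whether all three encodings produce identical bytes.
    static boolean sameBytes(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, latin1) && Arrays.equals(latin1, utf8);
    }

    public static void main(String[] args) {
        // Every character here has a code below 128, so the encodings agree.
        System.out.println(sameBytes("Hello."));
        // U+00E9 (e with acute accent) is outside ASCII, so the encodings diverge.
        System.out.println(sameBytes("caf\u00e9"));
    }
}
```

For text that stays within codes 0 to 127 the three encodings are interchangeable; as soon as a character falls outside that range, the byte sequences differ in content and even in length.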

Without an encoding, bytes are just 8-bit integers.

You can interpret them any way you like, and you might even get something usable, but without knowing the encoding, you can't be sure what they represent.

They might not even be text.

For example, consider the byte sequence 0x48 0x65 0x6c 0x6c 0x6f 0x2e. It can be interpreted as:


  • Hello. in ASCII and compatible 8-bit encodings;
  • dinner in some 8-bit encoding I made up just to prove this point;
  • 䡥汬漮 in big-endian UTF-16*;
  • a steel-blue pixel followed by a greyish-yellowish one, in RGB;
  • load r101, [0x6c6c6f2e] in some unknown processor's assembly language;

or any of a million other things. Those six bytes alone can't tell you which interpretation is correct.
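A few of those interpretations can be reproduced directly in Java. This is only a sketch: the class SameBytesDifferentMeanings and its methods are hypothetical names, and the RGB reading simply assumes three consecutive unsigned bytes per pixel.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SameBytesDifferentMeanings {
    // The six bytes from the example above.
    static final byte[] DATA = {0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2e};

    // Decodes the bytes as ASCII text: one byte per character.
    static String asAscii() {
        return new String(DATA, StandardCharsets.US_ASCII);
    }

    // Decodes the same bytes as big-endian UTF-16: two bytes per code unit.
    static String asUtf16BigEndian() {
        return new String(DATA, StandardCharsets.UTF_16BE);
    }

    // Assumption: a pixel is three consecutive unsigned bytes (red, green, blue).
    static int[] asRgbPixel(int offset) {
        return new int[] {
            DATA[offset] & 0xff, DATA[offset + 1] & 0xff, DATA[offset + 2] & 0xff
        };
    }

    public static void main(String[] args) {
        System.out.println(asAscii());                          // prints "Hello."
        System.out.println(asUtf16BigEndian().length());        // prints 3
        System.out.println(Arrays.toString(asRgbPixel(0)));     // prints [72, 101, 108]
        System.out.println(Arrays.toString(asRgbPixel(3)));     // prints [108, 111, 46]
    }
}
```

Nothing in DATA changes between calls; only the rule used to read it does.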

With text, at least, that's what encodings are for.

But if you want the interpretation to be right, you need to decode those bytes with the same encoding that was used to generate them. That's why it's so important to know how your text was encoded.
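To see what goes wrong when the encodings don't match, here is a minimal sketch that encodes a string with one charset and decodes it with another. EncodingRoundTrip and roundTrip are illustrative names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Hypothetical helper: turns text into bytes with one charset,
    // then turns those bytes back into text with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "na\u00efve caf\u00e9"; // contains two non-ASCII letters

        // Same charset on both sides: the text survives intact.
        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(matched.equals(original)); // prints true

        // UTF-8 bytes read as ISO-8859-1: each two-byte sequence becomes two wrong characters.
        String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.equals(original)); // prints false
        System.out.println(mismatched.length());         // prints 12 (the original has 10 chars)
    }
}
```

The ASCII letters come through unchanged in both cases, which is why this kind of mistake often goes unnoticed until non-ASCII text shows up.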

The difference between a byte stream and a character stream is that a character stream attempts to work with characters rather than bytes. (It actually works with UTF-16 code units, but since we know the encoding, that's good enough for most purposes.) If it's wrapped around a byte stream, the character stream uses an encoding to convert the bytes read from the underlying byte stream into chars (or the chars written to the stream into bytes).
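The sketch below reads the same data twice: once through a plain InputStream, and once through an InputStreamReader wrapped around it with an explicit charset. The class ByteVsCharStream and its counting helpers are hypothetical, and the in-memory ByteArrayInputStream stands in for whatever file or socket you would really read from.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteVsCharStream {
    // Byte stream: each read() returns one raw byte (0..255), or -1 at end of stream.
    static int countBytes(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Character stream: the reader decodes bytes with the given charset and
    // each read() returns one char (a UTF-16 code unit), or -1 at end of stream.
    static int countChars(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int count = 0;
            while (reader.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "héllo " followed by U+1F600, which needs a surrogate pair in UTF-16.
        byte[] data = "h\u00e9llo \uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data));                          // prints 11
        System.out.println(countChars(data, StandardCharsets.UTF_8));  // prints 8
    }
}
```

The text has seven characters as a person would count them, yet the reader reports eight chars because the last one takes two UTF-16 code units, which is the caveat mentioned in the parenthetical above.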

* Note: I have no idea whether 䡥汬漮 is profanity or even makes any sense... but neither does a computer, unless you program it to read Chinese.