What is character encoding and why should I bother with it?

Question

I am quite confused about the concept of character encoding.

What are Unicode, GBK, etc.? How does a programming language use them?

Do I need to bother knowing about them? Is there a simpler or faster way of programming without having to deal with them?

Recommended answer

A byte is 8 bits, so it can hold only 256 distinct values.

Since some character sets contain more than 256 characters, one cannot in general say that each character is a single byte.

Therefore, there must be mappings that describe how to turn each character in a character set into a sequence of bytes. Some characters might be mapped to a single byte, but others will have to be mapped to multiple bytes.

Those mappings are encodings, because they tell you how to encode characters into sequences of bytes.
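To make the idea of a mapping concrete, here is a minimal Java sketch that counts how many bytes the same character occupies under different encodings. The class name EncodingLengths and the helper encodedLength are made up for illustration. GBK is not one of the charsets every Java runtime is required to support, so the sketch checks for it before using it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: same character, different encodings, different byte counts.
class EncodingLengths {

    // Returns how many bytes the text occupies once encoded with the given charset.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latin = "A";      // U+0041
        String han = "\u4e2d";   // U+4E2D, the Chinese character "zhong"

        System.out.println(encodedLength(latin, StandardCharsets.UTF_8));    // 1
        System.out.println(encodedLength(latin, StandardCharsets.UTF_16BE)); // 2
        System.out.println(encodedLength(han, StandardCharsets.UTF_8));      // 3
        System.out.println(encodedLength(han, StandardCharsets.UTF_16BE));   // 2

        // Assumption: GBK is usually present in a full JDK, but it is optional.
        if (Charset.isSupported("GBK")) {
            System.out.println(encodedLength(han, Charset.forName("GBK")));  // 2
        }
    }
}
```

The character itself never changes here; only the rule for turning it into bytes does, which is why the byte counts differ.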

As for Unicode, at a very high level, Unicode is an attempt to assign a single, unique number (a code point) to every character. Obviously that number has to be something wider than a byte since there are more than 256 characters :) Java represents text internally as UTF-16, which is why a Java char is 16 bits wide and has integer values from 0 to 65535. Most commonly used characters fit in a single char, but Unicode defines more than 65536 code points, so characters beyond U+FFFF are stored as a pair of chars (a surrogate pair). When you get the byte representation of a Java character or string, you have to tell the JVM the encoding you want to use so it will know how to choose the byte sequence for each character.
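As a rough illustration of the Java side, the sketch below encodes one string with several explicitly named charsets and shows what happens when bytes are decoded with the wrong one. The class name CharsAndBytes and the helper hex are hypothetical; the calls themselves (String.getBytes(Charset), new String(byte[], Charset), codePointCount) are standard JDK methods.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: chars, code points, and explicit encodings in Java.
class CharsAndBytes {

    // Formats bytes as space-separated uppercase hex, e.g. "C3 A9".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // U+00E9, e with acute accent

        // Same character, three encodings, three different byte sequences.
        System.out.println(hex(eAcute.getBytes(StandardCharsets.UTF_8)));      // C3 A9
        System.out.println(hex(eAcute.getBytes(StandardCharsets.ISO_8859_1))); // E9
        System.out.println(hex(eAcute.getBytes(StandardCharsets.UTF_16BE)));   // 00 E9

        // A character beyond U+FFFF needs two Java chars (a surrogate pair).
        String emoji = "\uD83D\uDE00"; // U+1F600
        System.out.println(emoji.length());                           // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));  // 1

        // Decoding with a different charset than the one used for encoding
        // silently produces the wrong text: two chars, U+00C3 and U+00A9.
        byte[] utf8 = eAcute.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 2
    }
}
```

Passing the charset explicitly, rather than relying on the platform default, is what keeps the bytes you write and the bytes you later read in agreement.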
