In what encoding is a Java char stored?
Question
Is the Java char type guaranteed to be stored in any particular encoding?
Edit: I phrased this question incorrectly. What I meant to ask is: are char literals guaranteed to use any particular encoding?
Recommended answer
"Stored" where? All Strings in Java are represented in UTF-16. When a String is written to a file, sent across a network, or whatever else, it is encoded using whatever character encoding you specify.
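As a minimal sketch of the difference between the in-memory UTF-16 view and the bytes that get written out, the example below encodes the same String with several charsets. The class name EncodingDemo and the helper encodedLength are made up for illustration; String.getBytes(Charset) and StandardCharsets are standard library APIs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helper: encodes the text with the given charset
    // and returns how many bytes the encoded form occupies.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // 'h' followed by U+00E9 (e with acute accent), written as an
        // escape so the source file's own encoding does not matter.
        String text = "h\u00e9";

        // In memory the String exposes two UTF-16 code units.
        System.out.println(text.length()); // 2

        // The byte count depends entirely on the charset you choose.
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 3
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE));   // 4
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 2
    }
}
```

Passing the charset explicitly, rather than relying on the platform default, keeps the output the same across machines.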
Specifically for the char type, see the Character docs: "The char data type ... are based on the original Unicode specification, which defined characters as fixed-width 16-bit entities." Therefore, casting a char to int always gives you a UTF-16 code unit, provided the char actually holds character data. If you just poked some arbitrary value into the char, it won't necessarily be a valid UTF-16 character (a lone surrogate, for instance, is not), and the same goes if you read the character in using the wrong encoding. The docs go on to explain that supplementary characters (code points above U+FFFF) do not fit in a single char: in UTF-16 they take a surrogate pair of two char values, or a single int code point. If you're operating at this level, it is worth getting familiar with those semantics.
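To make the char versus int distinction concrete, here is a small sketch that casts a char to int and then inspects a supplementary character. The class name CodePointDemo and the helper codeUnits are illustrative only; Character.toChars, String.codePointAt, String.codePointCount and Character.isHighSurrogate are standard library methods.

```java
class CodePointDemo {
    // Hypothetical helper: returns each UTF-16 code unit of the
    // string as an int, in order.
    static int[] codeUnits(String s) {
        int[] units = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            units[i] = s.charAt(i); // char widens to int implicitly
        }
        return units;
    }

    public static void main(String[] args) {
        // A BMP character fits in one char; the cast yields its code unit.
        char a = 'A';
        System.out.println((int) a); // 65

        // U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so it needs
        // a surrogate pair: two char values for a single code point.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());                             // 2
        System.out.println(clef.codePointCount(0, clef.length()));     // 1
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Integer.toHexString(clef.codePointAt(0)));  // 1d11e

        int[] units = codeUnits(clef);
        System.out.println(Integer.toHexString(units[0])); // d834
        System.out.println(Integer.toHexString(units[1])); // dd1e
    }
}
```

Code that walks a String char by char sees the two surrogates separately, so the codePoint methods are the safer choice when supplementary characters may appear.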