Difference between String.length() and String.getBytes().length

Question

I am a beginner teaching myself Java programming, and I want to know the difference between String.length() and String.getBytes().length in Java.

Which is more suitable for checking the length of a string?

Recommended answer

String.length()

String.length() is the number of 16-bit UTF-16 code units needed to represent the string. That is, it is the number of char values used to represent the string, and thus also equal to toCharArray().length. For most characters used in Western languages this is the same as the number of Unicode characters (code points) in the string, but it will be greater than the number of code points if any UTF-16 surrogate pairs are used, since each pair takes two code units to encode one code point. Such pairs are needed only to encode characters outside the BMP (Basic Multilingual Plane) and are rare in most writing (emoji are a common exception).
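
As a minimal sketch of the point above, the following compares the code-unit count with the code-point count. The class and helper names (CodeUnitsVsCodePoints, codeUnits, codePoints) are made up for illustration; only String.length(), String.codePointCount() and String.toCharArray() come from the JDK.

```java
final class CodeUnitsVsCodePoints {
    /** Number of UTF-16 code units (char values); same as s.toCharArray().length. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points; a surrogate pair counts once. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String plain = "hello";
        // "hi" followed by U+1F600 (an emoji outside the BMP), written as a surrogate pair
        String emoji = "hi\uD83D\uDE00";

        System.out.println(codeUnits(plain) + " " + codePoints(plain)); // 5 5
        System.out.println(codeUnits(emoji) + " " + codePoints(emoji)); // 4 3
        System.out.println(emoji.toCharArray().length);                 // 4
    }
}
```

If you need the number of characters as a reader would count them, codePointCount() is usually closer than length(), although neither accounts for combining sequences.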

String.getBytes().length, on the other hand, is the number of bytes needed to represent your string in the platform's default encoding. For example, if the default encoding were UTF-16 (rare), it would be 2x the value returned by String.length(), since each 16-bit code unit takes 2 bytes to represent (Java's UTF-16 charset also prepends a 2-byte byte-order mark when encoding, so the array is 2 bytes longer still). More commonly, your platform encoding will be a variable-width encoding like UTF-8, which is the default on JDK 18 and later.
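
To see how the byte count depends on the encoding, one option is to pass the charset explicitly instead of relying on the platform default. This is only a sketch: the class EncodedLength and the helper byteLength are hypothetical names, while getBytes(Charset), Charset.defaultCharset() and StandardCharsets are standard JDK APIs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedLength {
    /** Bytes needed to encode s in the given charset. */
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "abc";

        // What the no-argument getBytes() uses; the result varies by platform and JDK version.
        System.out.println(Charset.defaultCharset());

        System.out.println(s.length());                                // 3
        System.out.println(byteLength(s, StandardCharsets.UTF_8));     // 3
        // UTF-16LE has no byte-order mark, so this is exactly 2 x length()
        System.out.println(byteLength(s, StandardCharsets.UTF_16LE));  // 6
        // Plain "UTF-16" prepends a 2-byte byte-order mark when encoding
        System.out.println(byteLength(s, StandardCharsets.UTF_16));    // 8
    }
}
```

Naming the charset makes the result reproducible across machines, which the no-argument getBytes() does not guarantee.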

This means the relationship between the two lengths is more complex. For ASCII strings, the two calls will almost always produce the same result (except under unusual default encodings that don't encode the ASCII subset in 1 byte per character). For strings containing non-ASCII characters, String.getBytes().length is likely to be larger, because it counts the bytes needed to represent the string, while length() counts 2-byte code units.
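
A small illustration of that relationship, assuming UTF-8 as the encoding (passed explicitly so the numbers do not depend on the platform). The class AsciiVsNonAscii and the helper utf8Bytes are illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;

final class AsciiVsNonAscii {
    /** Bytes needed to encode s as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";               // ASCII only
        String accented = "\u00E9";           // e with acute accent, 2 bytes in UTF-8
        String cjk = "\u65E5\u672C\u8A9E";    // three CJK characters, 3 bytes each in UTF-8
        String emoji = "\uD83D\uDE00";        // U+1F600 as a surrogate pair, 4 bytes in UTF-8

        System.out.println(ascii.length() + " " + utf8Bytes(ascii));       // 5 5
        System.out.println(accented.length() + " " + utf8Bytes(accented)); // 1 2
        System.out.println(cjk.length() + " " + utf8Bytes(cjk));           // 3 9
        System.out.println(emoji.length() + " " + utf8Bytes(emoji));       // 2 4
    }
}
```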

Usually you'll use String.length() together with other String methods that take offsets into the string. For example, to get the last character, you'd use str.charAt(str.length() - 1). You'd use getBytes().length only if for some reason you were dealing with the encoded byte array returned by getBytes().
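
The two use cases might look like this in practice. Both helpers are hypothetical: lastChar mirrors the charAt example above, and fitsInBytes stands in for a case where the encoded size matters, such as a field limited to a fixed number of bytes (UTF-8 is assumed here).

```java
import java.nio.charset.StandardCharsets;

final class LengthUsage {
    /** Last char of a non-empty string, using length() as an offset. */
    static char lastChar(String s) {
        return s.charAt(s.length() - 1);
    }

    /** Hypothetical check: does s fit in maxBytes once encoded as UTF-8? */
    static boolean fitsInBytes(String s, int maxBytes) {
        return s.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }

    public static void main(String[] args) {
        String cafe = "caf\u00E9"; // 4 chars, 5 bytes in UTF-8

        System.out.println(lastChar("hello") == 'o');  // true
        System.out.println(cafe.length());             // 4
        System.out.println(fitsInBytes("cafe", 4));    // true
        System.out.println(fitsInBytes(cafe, 4));      // false
    }
}
```

Note that lastChar returns a single char, so for a string ending in a surrogate pair it returns only the low surrogate rather than the full character.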
