Unknown bytes are returned by the getBytes() method

Published: 2021/5/18 20:30:14 | Tags: java string unicode


Problem description

import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class Main {
 public static void main(String[] args)
 {
  try 
  {
   String s = "s";
   System.out.println( Arrays.toString( s.getBytes("utf8") ) );
   System.out.println( Arrays.toString( s.getBytes("utf16") ) );
   System.out.println( Arrays.toString( s.getBytes("utf32") ) );
  }  
  catch (UnsupportedEncodingException e) 
  {
   e.printStackTrace();
  }
 }
}

Console:


[115]
[-2, -1, 0, 115]
[0, 0, 0, 115]

What are these bytes?

[-2, -1] - ???

Also, I noticed that if I do this:


String s = new String(new char[]{'\u1251'});
System.out.println( Arrays.toString( s.getBytes("utf8") ) );
System.out.println( Arrays.toString( s.getBytes("utf16") ) );
System.out.println( Arrays.toString( s.getBytes("utf32") ) );

Console:


[-31, -119, -111]
[-2, -1, 18, 81]
[0, 0, 18, 81]

Solution

The -2, -1 bytes are a byte order mark (BOM, U+FEFF). Java's byte type is signed, so 0xFE and 0xFF print as -2 and -1. The BOM tells a decoder which byte order the UTF-16 text that follows uses.
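
The mapping between -2, -1 and 0xFE 0xFF can be checked directly. The following is only a small sketch: the class name BomHex and the helper toHex are made up for illustration, and it masks each byte with 0xFF to print its unsigned value in hex.

```java
import java.nio.charset.StandardCharsets;

public class BomHex {
    // Hypothetical helper: formats each byte as two-digit unsigned hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // b & 0xFF widens the signed byte to an int in the range 0..255.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf16 = "s".getBytes(StandardCharsets.UTF_16);
        System.out.println(toHex(utf16)); // FE FF 00 73
    }
}
```

The first two bytes are the BOM, and 00 73 is the letter s (115 decimal) as a big-endian 16-bit value.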

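You get it because UTF-16 comes in two byte orders, UTF-16LE and UTF-16BE, which store the 2 bytes of each 16-bit value in little-endian or big-endian order. When you ask for plain UTF-16, Java writes a BOM so that a decoder can tell which order was used. UTF-8 has no byte-order variants. UTF-32 has BE and LE variants too, but as your output shows, Java's UTF-32 encoder writes big-endian bytes without a BOM.

As the values that come back are 0xFE 0xFF, the bytes that follow are in big-endian order (the same layout as UTF-16BE).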
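
If the BOM is unwanted, one option is to name the byte order explicitly. The sketch below uses the standard StandardCharsets constants (the class name Utf16Variants is made up for illustration): UTF_16 prepends the BOM when encoding, while UTF_16BE and UTF_16LE do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf16Variants {
    public static void main(String[] args) {
        String s = "s";
        // Plain UTF-16: big-endian BOM (0xFE 0xFF) followed by big-endian data.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16)));   // [-2, -1, 0, 115]
        // Explicit byte order: no BOM is written.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16BE))); // [0, 115]
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16LE))); // [115, 0]
    }
}
```

Passing a Charset constant instead of a charset name also removes the need to catch UnsupportedEncodingException.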
