Why does "STRING".getBytes() work differently depending on the operating system?


Problem description

I am running the code below and getting different results from "some_string".getBytes() depending on whether I am on Windows or Unix. The issue happens with any string (I tried a very simple "ABC" and got the same problem).

See the differences printed in the console below.

The code below was tested with Java 7. If you copy it in full, it will run.

Additionally, see the difference in hexadecimal in the images below. The first two images show the file created on Windows; you can see the hexadecimal values interpreted as ANSI and EBCDIC respectively. The third image, the black one, is from Unix. You can see the hexadecimal dump (-c option) and the readable characters, which I believe are EBCDIC.

So my direct question is: why does this code behave differently when I am using Java 7 in both cases? Is there a specific property I should check somewhere? Maybe Java on Windows picks up one default format and on Unix another. If so, which property must I check or set?

Unix Console:

$ ./java -cp /usr/test.jar test.mainframe.read.test.TestGetBytes
H = 76 - L
< wasn't found

Windows Console:

H = 60 - <
H1 = 69 - E
H2 = 79 - O
H3 = 77 - M
H4 = 62 - >
End of Message found

The entire code:

package test.mainframe.read.test;

import java.util.ArrayList;

public class TestGetBytes {

       public static void main(String[] args) {
              try {
                     ArrayList ipmMessage = new ArrayList();
                     ipmMessage.add(newLine());

                     //Windows Path
                     writeMessage("C:/temp/test_bytes.ipm", ipmMessage);
                     reformatFile("C:/temp/test_bytes.ipm");
                     //Unix Path
                     //writeMessage("/usr/temp/test_bytes.ipm", ipmMessage);
                     //reformatFile("/usr/temp/test_bytes.ipm");
              } catch (Exception e) {

                     System.out.println(e.getMessage());
              }
       }

       public static byte[] newLine() {
              return "<EOM>".getBytes();
       }

       public static void writeMessage(String fileName, ArrayList ipmMessage)
                     throws java.io.FileNotFoundException, java.io.IOException {

              java.io.DataOutputStream dos = new java.io.DataOutputStream(
                           new java.io.FileOutputStream(fileName, true));
              for (int i = 0; i < ipmMessage.size(); i++) {
                     try {
                           int[] intValues = (int[]) ipmMessage.get(i);
                           for (int j = 0; j < intValues.length; j++) {
                                  dos.write(intValues[j]);
                           }
                     } catch (ClassCastException e) {
                           byte[] byteValues = (byte[]) ipmMessage.get(i);
                           dos.write(byteValues);
                     }
              }
              dos.flush();
              dos.close();

       }

       // reformat to U1014
       public static void reformatFile(String filename)
                     throws java.io.FileNotFoundException, java.io.IOException {
              java.io.FileInputStream fis = new java.io.FileInputStream(filename);
              java.io.DataInputStream br = new java.io.DataInputStream(fis);

              int h = br.read();
              System.out.println("H = " + h + " - " + (char)h);

              if ((char) h == '<') {// Check for <EOM>

                     int h1 = br.read();
                     System.out.println("H1 = " + h1 + " - " + (char)h1);
                     int h2 = br.read();
                     System.out.println("H2 = " + h2 + " - " + (char)h2);
                     int h3 = br.read();
                     System.out.println("H3 = " + h3 + " - " + (char)h3);
                     int h4 = br.read();
                     System.out.println("H4 = " + h4 + " - " + (char)h4);
                     if ((char) h1 == 'E' && (char) h2 == 'O' && (char) h3 == 'M'
                                  && (char) h4 == '>') {
                           System.out.println("End of Message found");
                     }
                     else{
                           System.out.println("EOM not found but < was found");
                     }
              }
              else{
                     System.out.println("< wasn't found");
              }
       }
}

Solution

You are not specifying a charset when calling getBytes(), so it uses the default charset of the underlying platform (or of Java itself if specified when Java is started). This is stated in the String documentation:

public byte[] getBytes()

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.
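
To see which charset the no-argument getBytes() will pick on a given machine, you can print the JVM's default charset. The following is a minimal sketch; the class name ShowDefaultCharset and the helper defaultCharsetName() are made up for illustration. It also prints the file.encoding system property, which is commonly set at startup (for example java -Dfile.encoding=ISO-8859-1 ...) to override the platform default; how strictly that property is honored can vary by JDK version, so treat it as something to verify on your setup.

```java
import java.nio.charset.Charset;

public class ShowDefaultCharset {

    // Hypothetical helper: name of the charset used by String.getBytes()
    // when no charset argument is given.
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        // May be null or differ from the line above on some JDKs.
        System.out.println("file.encoding  : " + System.getProperty("file.encoding"));
    }
}
```

Running this on both the Windows and the Unix machine should show two different charset names if the default charset is indeed the cause of the difference.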

getBytes() has an overloaded version that lets you specify a charset in your code.

public byte[] getBytes(Charset charset)

Encodes this String into a sequence of bytes using the given charset, storing the result into a new byte array.
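
As a sketch of how the newLine() method from the question could be made platform-independent, the example below encodes "<EOM>" with an explicit charset. The class name ExplicitCharsetDemo and the method endOfMessage() are illustrative, not part of the original code. StandardCharsets is available from Java 7. The second part assumes that the Unix machine defaults to an EBCDIC code page (the value 76 printed for '<' suggests this); "IBM037" is used only as an example name and is not guaranteed to be present on every Java runtime, hence the isSupported check.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharsetDemo {

    // Encodes the marker with a fixed charset, so the bytes do not depend
    // on the platform default.
    public static byte[] endOfMessage() {
        return "<EOM>".getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // prints US-ASCII: [60, 69, 79, 77, 62]
        System.out.println("US-ASCII: " + Arrays.toString(endOfMessage()));

        // Assumption: an EBCDIC code page, shown for comparison only.
        String ebcdic = "IBM037";
        if (Charset.isSupported(ebcdic)) {
            byte[] bytes = "<EOM>".getBytes(Charset.forName(ebcdic));
            System.out.println(ebcdic + ": " + Arrays.toString(bytes));
        }
    }
}
```

The US-ASCII values match the Windows console output in the question (60, 69, 79, 77, 62). If the file must be written in EBCDIC instead, pass that charset to getBytes() and make sure the reading code compares against bytes in the same encoding rather than against ASCII characters.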
