Failure encoding files in base64 java


Question

I have this class to encode and decode a file. When I run it with .txt files, the result is correct. But when I run it with .jpg or .doc files, the output file either cannot be opened or is not equal to the original. I don't know why this is happening. I adapted the class from http://myjeeva.com/convert-image-to-string-and-string-to-image-in-java.html, but I want to change this line

byte imageData[] = new byte[(int) file.length()];

to

byte example[] = new byte[1024];

and read the file as many times as needed. Thanks.

import java.io.*;
import java.util.*;

  public class Encode {

input = input file path, output = output file path, imageDataString = the encoded string

  String input;
  String output;
  String imageDataString;


  public void setFileInput(String input){
    this.input=input;
  }

  public void setFileOutput(String output){
    this.output=output;
  }

  public String getFileInput(){
    return input;
  }

  public String getFileOutput(){
    return output;
  }

  public String getEncodeString(){
    return  imageDataString;
  }

  public String processCode(){
    StringBuilder sb= new StringBuilder();

    try{
        File fileInput= new File( getFileInput() );
        FileInputStream imageInFile = new FileInputStream(fileInput);

I have seen examples where people create a byte[] with the same length as the file. I don't want that, because I will not know in advance how long the file will be.

        byte buff[] = new byte[1024];

        int r = 0;

        while ( ( r = imageInFile.read( buff)) > 0 ) {

          String imageData = encodeImage(buff);

          sb.append( imageData);

          if ( imageInFile.available() <= 0 ) {
            break;
          }
        }



       } catch (FileNotFoundException e) {
        System.out.println("File not found" + e);
      } catch (IOException ioe) {
        System.out.println("Exception while reading the file " + ioe);

    } 

        imageDataString = sb.toString();

       return imageDataString;
}  


  public  void processDecode(String str) throws IOException{

      byte[] imageByteArray = decodeImage(str);
      File fileOutput= new File( getFileOutput());
      FileOutputStream imageOutFile = new FileOutputStream( fileOutput);

      imageOutFile.write(imageByteArray);
      imageOutFile.close();

}

 public static String encodeImage(byte[] imageByteArray) {

      return  Base64.getEncoder().withoutPadding().encodeToString( imageByteArray);

    }

    public static byte[] decodeImage(String imageDataString) {
      return  Base64.getDecoder().decode(  imageDataString);  

    }


  public static void main(String[] args) throws IOException {

    Encode a = new Encode();

    a.setFileInput( "C://Users//xxx//Desktop//original.doc");
    a.setFileOutput("C://Users//xxx//Desktop//original-copied.doc");

    a.processCode( );

    a.processDecode( a.getEncodeString());

    System.out.println("C O P I E D");
  }
}

I tried changing

String imageData = encodeImage(buff);

to

String imageData = encodeImage(buff,r);

and the method encodeImage to

public static String encodeImage(byte[] imageByteArray, int r) {

     byte[] aux = new byte[r];

     for ( int i = 0; i < aux.length; i++) {
       aux[i] = imageByteArray[i];

       if ( aux[i] <= 0 ) {
         break;
       }
     }
return  Base64.getDecoder().decode(  aux);
}

But I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits   

Recommended answer

There are two problems in your program.

The first, as mentioned by @Joop Eggen, is that you are not handling your input correctly.

In fact, Java does not promise that a read will fill the entire 1024-byte buffer, even in the middle of the file. It could read just 50 bytes and tell you it read 50 bytes, and then read 50 more the next time.

Suppose you read 1024 bytes in the previous round, and in the current round you read only 50. Your byte array now contains the 50 new bytes, and the rest are old bytes from the previous read!

So you always need to copy the exact number of bytes read into a new array, and pass that on to your encoding function.

So, to fix this particular problem, you'll need to do something like:

 while ( ( r = imageInFile.read( buff)) > 0 ) {

      byte[] realBuff = Arrays.copyOf( buff, r );

      String imageData = encodeImage(realBuff);

      ...
 }
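
To make the stale-bytes effect concrete, here is a minimal sketch. The class name StaleBufferDemo and its two helper methods are hypothetical, made up for this illustration, and a 4-byte buffer over a 5-byte in-memory stream stands in for the 1024-byte buffer and the file. The first helper uses the whole buffer after every read, as the loop in the question does; the second uses only the r bytes that were actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class StaleBufferDemo {

    // Buggy variant: ignores how many bytes read() returned and
    // uses the whole buffer every time.
    static byte[] copyWholeBuffer(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buff = new byte[bufferSize];
        while (in.read(buff) > 0) {
            out.write(buff, 0, buff.length);
        }
        return out.toByteArray();
    }

    // Correct variant: uses only the first r bytes of the buffer.
    static byte[] copyBytesRead(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buff = new byte[bufferSize];
        int r;
        while ((r = in.read(buff)) > 0) {
            out.write(buff, 0, r);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};

        // The second read returns only 1 byte, so positions 1..3 still hold
        // the old values 2, 3, 4 from the first read.
        System.out.println(Arrays.toString(
                copyWholeBuffer(new ByteArrayInputStream(data), 4)));
        // prints [1, 2, 3, 4, 5, 2, 3, 4]

        System.out.println(Arrays.toString(
                copyBytesRead(new ByteArrayInputStream(data), 4)));
        // prints [1, 2, 3, 4, 5]
    }
}
```

The three extra bytes in the first result are exactly the leftovers described above: they would be encoded along with the real data.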


However, this is not the only problem here. Your real problem is with the Base64 encoding itself.

What Base64 does is take your bytes, break them into 6-bit chunks, and treat each chunk as a number N between 0 and 63. It then takes the Nth character from its character table to represent that chunk.

But this means it can't cleanly encode a single byte or two bytes on their own. A byte contains 8 bits, which is one chunk of 6 bits plus 2 leftover bits. Two bytes have 16 bits: that's 2 chunks of 6 bits plus 4 leftover bits.

To solve this problem, Base64 always encodes groups of 3 consecutive bytes (24 bits, exactly four 6-bit chunks). If the input length is not divisible by three, it pads the final group with additional zero bits.
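
As a small illustration of the 3-byte grouping, the sketch below encodes one, two and three bytes with the standard java.util.Base64 encoder, with and without padding. The class name PaddingDemo and the helper names padded and unpadded are hypothetical, chosen only for this example.

```java
import java.util.Base64;

public class PaddingDemo {

    // Standard encoder: marks an incomplete final group with '='.
    static String padded(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Same encoding, but the trailing '=' characters are left out.
    static String unpadded(byte[] bytes) {
        return Base64.getEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        byte[] one = {0x55};
        byte[] two = {0x55, (byte) 0xF0};
        byte[] three = {0x55, (byte) 0xF0, (byte) 0xAA};

        // 8 bits -> 2 characters (the second one carries 4 zero filler bits)
        System.out.println(padded(one) + " / " + unpadded(one));     // VQ== / VQ
        // 16 bits -> 3 characters (the third one carries 2 zero filler bits)
        System.out.println(padded(two) + " / " + unpadded(two));     // VfA= / VfA
        // 24 bits -> exactly 4 characters, no filler needed
        System.out.println(padded(three) + " / " + unpadded(three)); // VfCq / VfCq
    }
}
```

Only the 3-byte input fits the 6-bit chunks exactly. For the shorter inputs the encoder has to fill the last character with zero bits, which is harmless at the very end of the data but not in the middle of it.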

Here is a little program that demonstrates the problem:

package testing;

import java.util.Base64;

public class SimpleTest {

    public static void main(String[] args) {

        // An array containing six bytes to encode and decode.
        byte[] fullArray = { 0b01010101, (byte) 0b11110000, (byte)0b10101010, 0b00001111, (byte)0b11001100, 0b00110011 };

        // The same array broken into three chunks of two bytes.

        byte[][] threeTwoByteArrays = {
            {       0b01010101, (byte) 0b11110000 },
            { (byte)0b10101010,        0b00001111 },
            { (byte)0b11001100,        0b00110011 }
        };
        Base64.Encoder encoder = Base64.getEncoder().withoutPadding();

        // Encode the full array

        String encodedFullArray = encoder.encodeToString(fullArray);

        // Encode the three chunks consecutively 

        StringBuilder encodedStringBuilder = new StringBuilder();
        for ( byte [] twoByteArray : threeTwoByteArrays ) {
            encodedStringBuilder.append(encoder.encodeToString(twoByteArray));
        }
        String encodedInChunks = encodedStringBuilder.toString();

        System.out.println("Encoded full array: " + encodedFullArray);
        System.out.println("Encoded in chunks of two bytes: " + encodedInChunks);

        // Now  decode the two resulting strings

        Base64.Decoder decoder = Base64.getDecoder();

        byte[] decodedFromFull = decoder.decode(encodedFullArray);   
        System.out.println("Byte array decoded from full: " + byteArrayBinaryString(decodedFromFull));

        byte[] decodedFromChunked = decoder.decode(encodedInChunks);
        System.out.println("Byte array decoded from chunks: " + byteArrayBinaryString(decodedFromChunked));
    }

    /**
     * Convert a byte array to a string representation in binary
     */
    public static String byteArrayBinaryString( byte[] bytes ) {
        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for ( byte b : bytes ) {
            sb.append(Integer.toBinaryString(Byte.toUnsignedInt(b))).append(',');
        }
        if ( sb.length() > 1) {
            sb.setCharAt(sb.length() - 1, ']');
        } else {
            sb.append(']');
        }
        return sb.toString();
    }
}

So, imagine my 6-byte array is your image file, and imagine that your buffer reads 2 bytes each time instead of 1024. This is going to be the output of the encoding:

Encoded full array: VfCqD8wz
Encoded in chunks of two bytes: VfAqg8zDM

As you can see, encoding the full array gave us 8 characters. Each group of three bytes is converted into four chunks of 6 bits, which in turn are converted into four characters.

But encoding the three two-byte arrays gave you a string of 9 characters. It's a completely different string! Each group of two bytes was extended to three chunks of 6 bits by padding with zeros. And since you asked for no padding, it produces only 3 characters per group, without the extra = that usually marks an input whose length is not divisible by 3.

The output from the part of the program that decodes the correct, 8-character encoded string is fine:

Byte array decoded from full: [1010101,11110000,10101010,1111,11001100,110011]

But the result of attempting to decode the incorrect, 9-character encoded string is:

Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits
    at java.util.Base64$Decoder.decode0(Base64.java:734)
    at java.util.Base64$Decoder.decode(Base64.java:526)
    at java.util.Base64$Decoder.decode(Base64.java:549)
    at testing.SimpleTest.main(SimpleTest.java:34)

Not good! A padded Base64 string always has a length that is a multiple of 4 characters, and even an unpadded one can never have a length of the form 4k + 1. We have 9.

Since you chose a buffer size of 1024, which is not a multiple of 3, that problem will happen. You need to encode a multiple of 3 bytes each time to produce the proper string. So in fact, you need a buffer of size 3072 or something like that.

But because of the first problem, be very careful about what you pass to the encoder. It can always happen that you read fewer than 3072 bytes, and if that number is not divisible by three, the same problem will occur.
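
One way to combine both fixes is sketched below. It is only an illustration, not the code from the question: the class ChunkedBase64 and its methods fill and encode are hypothetical names. The idea is to keep reading until the 3072-byte buffer is completely full or the stream has ended, so that only the very last chunk can have a length that is not a multiple of 3, and to encode only the bytes that were actually read. The sketch keeps the default padding, so the final chunk is marked with = where needed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64 {

    // Any multiple of 3 works here; 3072 is the size suggested above.
    private static final int CHUNK_SIZE = 3072;

    // Keeps reading until the buffer is full or the stream has ended.
    // Returns the number of bytes stored, which is less than buff.length
    // only when the end of the stream was reached.
    static int fill(InputStream in, byte[] buff) throws IOException {
        int total = 0;
        while (total < buff.length) {
            int r = in.read(buff, total, buff.length - total);
            if (r < 0) {
                break; // end of stream
            }
            total += r;
        }
        return total;
    }

    // Encodes the stream chunk by chunk. Every chunk except possibly the
    // last one has a length divisible by 3, so the pieces concatenate
    // into one valid Base64 string.
    static String encode(InputStream in) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] buff = new byte[CHUNK_SIZE];
        int n;
        while ((n = fill(in, buff)) > 0) {
            // Pass on only the bytes that were really read.
            sb.append(encoder.encodeToString(Arrays.copyOf(buff, n)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a file; 10000 is not a multiple of 3.
        byte[] original = new byte[10000];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) (i * 31);
        }

        String encoded = encode(new ByteArrayInputStream(original));
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println(Arrays.equals(original, decoded)); // prints true
    }
}
```

For a real file, a FileInputStream would be passed to encode in place of the ByteArrayInputStream. The encoded string still holds the whole file in memory, as in the question's design.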
