Is there any way to upload extracted zip file using "java.util.zip" to AWS-S3 using multipart upload (Java high level API)


Question


I need to upload a large file to AWS S3 with multipart upload, streaming it instead of using the Lambda /tmp directory. The file is uploaded, but not completely.


In my case the size of each file in the zip cannot be predicted; a single file may be up to 1 GiB. So I used ZipInputStream to read from S3, and I want to upload the entries back to S3. Since I am working on Lambda, I cannot save the file in Lambda's /tmp because of the file size. So I tried to read and upload directly to S3 with S3 multipart upload, without saving to /tmp. But the file is not written completely; I suspect it is overwritten every time. Please review my code and help.

public void zipAndUpload() {
    byte[] buffer = new byte[1024];
    try {
        File folder = new File(outputFolder);
        if (!folder.exists()) {
            folder.mkdir();
        }

        AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();
        S3Object object = s3Client.getObject("mybucket.s3.com", "MyFilePath/MyZip.zip");

        TransferManager tm = TransferManagerBuilder.standard()
                .withS3Client(s3Client)
                .build();

        ZipInputStream zis = new ZipInputStream(object.getObjectContent());
        ZipEntry ze = zis.getNextEntry();

        while (ze != null) {
            String fileName = ze.getName();
            System.out.println("ZE " + ze + " : " + fileName);

            File newFile = new File(outputFolder + File.separator + fileName);
            if (ze.isDirectory()) {
                System.out.println("DIRECTORY" + newFile.mkdirs());
            } else {
                filePaths.add(newFile);
                int len;
                while ((len = zis.read(buffer)) > 0) {
                    ObjectMetadata meta = new ObjectMetadata();
                    meta.setContentLength(len);
                    InputStream targetStream = new ByteArrayInputStream(buffer);

                    PutObjectRequest request = new PutObjectRequest("mybucket.s3.com", fileName, targetStream, meta);
                    request.setGeneralProgressListener(new ProgressListener() {
                        public void progressChanged(ProgressEvent progressEvent) {
                            System.out.println("Transferred bytes: " + progressEvent.getBytesTransferred());
                        }
                    });
                    Upload upload = tm.upload(request);
                }
            }
            ze = zis.getNextEntry();
        }

        zis.closeEntry();
        zis.close();
        System.out.println("Done");
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}

Answer


The problem is your inner while loop. You are reading 1024 bytes at a time from the ZipInputStream and uploading each chunk as a separate PutObjectRequest. Instead of streaming into S3, you overwrite the target key again and again, so only the last chunk survives.


The solution is a bit more complex because you don't have one stream per file, only one stream for the whole zip container. This means you can't do something like the following, because the stream will be closed by AWS after the first upload is done:

// Not possible
PutObjectRequest request = new PutObjectRequest(targetBucket, name,
        zipInputStream, meta);
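The fix relies on the java.io pipe pair: one thread writes into a PipedOutputStream while another thread reads the same bytes from the connected PipedInputStream. That is what lets TransferManager consume an entry while the main thread is still copying it. Here is a minimal, S3-free sketch of the pattern (the class and method names are made up for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

public class PipeDemo {
    public static String roundTrip(String payload) throws IOException, InterruptedException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        // consumer thread: plays the role TransferManager plays below,
        // reading from `in` while the producer is still writing
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        Thread consumer = new Thread(() -> {
            try {
                byte[] buffer = new byte[16];
                int n;
                while (-1 != (n = in.read(buffer))) {
                    received.write(buffer, 0, n);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        consumer.start();

        // producer: the main thread writes into the pipe, then closes it
        // so the consumer sees end-of-stream
        out.write(payload.getBytes());
        out.close();
        consumer.join();
        return received.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("streamed through a pipe"));
    }
}
```

With S3 the consumer side is the TransferManager's worker thread and the producer side is the loop copying one ZipEntry into the pipe.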


You have to write the ZipInputStream into a PipedOutputStream object, one pipe per ZipEntry. Below is a working example:

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;

import java.io.*;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class Pipes {
    public static void main(String[] args) throws IOException {

        Regions clientRegion = Regions.DEFAULT;
        String sourceBucket = "<sourceBucket>";
        String key = "<sourceArchive.zip>";
        String targetBucket = "<targetBucket>";

        PipedOutputStream out = null;
        PipedInputStream in = null;
        S3Object s3Object = null;
        ZipInputStream zipInputStream = null;

        try {
            AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                    .withRegion(clientRegion)
                    .withCredentials(new ProfileCredentialsProvider())
                    .build();

            TransferManager transferManager = TransferManagerBuilder.standard()
                    .withS3Client(s3Client)
                    .build();

            System.out.println("Downloading an object");
            s3Object = s3Client.getObject(new GetObjectRequest(sourceBucket, key));
            zipInputStream = new ZipInputStream(s3Object.getObjectContent());

            ZipEntry zipEntry;
            while (null != (zipEntry = zipInputStream.getNextEntry())) {

                long size = zipEntry.getSize();
                String name = zipEntry.getName();
                if (zipEntry.isDirectory()) {
                    System.out.println("Skipping directory " + name);
                    continue;
                }

                System.out.printf("Processing ZipEntry %s : %d bytes\n", name, size);

                // bridge the current entry into an InputStream via a pipe
                out = new PipedOutputStream();
                in = new PipedInputStream(out);

                ObjectMetadata metadata = new ObjectMetadata();
                metadata.setContentLength(size);

                PutObjectRequest request = new PutObjectRequest(targetBucket, name, in, metadata);

                // upload() is asynchronous: it returns immediately and reads
                // from `in` on the TransferManager's worker threads while the
                // copy() call below writes the entry into `out`
                transferManager.upload(request);

                long actualSize = copy(zipInputStream, out, 1024);
                if (actualSize != size) {
                    throw new RuntimeException("Filesize of ZipEntry " + name + " is wrong");
                }

                out.flush();
                out.close();
            }
        } finally {
            if (out != null) {
                out.close();
            }
            if (in != null) {
                in.close();
            }
            if (s3Object != null) {
                s3Object.close();
            }
            if (zipInputStream != null) {
                zipInputStream.close();
            }
            // exit explicitly; the TransferManager's worker threads are
            // non-daemon and would otherwise keep the JVM alive
            System.exit(0);
        }
    }

    // streams `input` into `output` in `buffersize` chunks and returns the
    // number of bytes copied
    private static long copy(final InputStream input, final OutputStream output, final int buffersize) throws IOException {
        if (buffersize < 1) {
            throw new IllegalArgumentException("buffersize must be bigger than 0");
        }
        final byte[] buffer = new byte[buffersize];
        int n = 0;
        long count=0;
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
            count += n;
        }
        return count;
    }
}
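The copy helper is plain java.io, so it can be sanity-checked without touching S3. A standalone sketch with the same logic (the class name is made up for illustration), using a payload that is deliberately not a multiple of the buffer size:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class CopyCheck {
    // same logic as copy() above: stream input to output in fixed-size chunks
    static long copy(InputStream input, OutputStream output, int buffersize) throws IOException {
        if (buffersize < 1) {
            throw new IllegalArgumentException("buffersize must be bigger than 0");
        }
        byte[] buffer = new byte[buffersize];
        int n;
        long count = 0;
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
            count += n;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[3000]; // not a multiple of 1024, exercises the short final chunk
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, 1024);
        System.out.println("copied=" + copied + " matches=" + Arrays.equals(data, out.toByteArray()));
    }
}
```

The returned count is what the example compares against ZipEntry.getSize() to detect a truncated entry.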

