Read and extract huge zip file from AWS S3 with AWS Lambda


Problem Description

I am working on a data management application where clients can upload a zip file (approx. 250 MB) containing multiple text files (approx. 1500 MB uncompressed) to AWS S3.

But due to the limited memory of AWS Lambda (max 1536 MB), I am only able to handle a zip file of about 50 MB whose extracted files total about 500 MB.

I need to run some validation on the extracted files while extracting, and after that I have to store all of the file contents in a database.

For now I am storing the file contents in the AWS Lambda tmp location, which is itself limited to a maximum of 500 MB.

Any streaming concept that would let me do the above task, including the validation, would be helpful.

I could go with EC2 or ECS, but right now I want to do it with AWS Lambda only.

With this code I am extracting the zip file and uploading the result to another S3 bucket.

Any other concept like streaming would also help; I am not very familiar with streaming, so I am posting here to get some ideas for resolving my issue (a streaming sketch follows the code below).

// Setup assumed by the snippet (not shown in the original): AWS SDK v2,
// JSZip and the async library; params, key and bucket come from the
// Lambda event.
const AWS = require('aws-sdk');
const fs = require('fs');
const JSZip = require('jszip');
const async = require('async');

const s3 = new AWS.S3();

s3.getObject(params, (err, data) => {
    if (err) {
        console.log('Error', err);
        var message = `Error getting object ${key} from bucket ${bucket}. Make sure they exist and your bucket is in the same region as this function.`;
        console.log(message);
        //  callback(message);
    } else {
        console.log('Started to buffer data');
        // Load the whole archive into memory -- this is what hits the
        // Lambda memory limit for large zips.
        JSZip.loadAsync(data.Body).then(function(zip) {
            // /tmp is the only writable path in Lambda (the original used a
            // relative 'temp/' path, which is read-only there); the original
            // test write also lacked the required callback.
            fs.writeFile('/tmp/hello.txt', 'New file added for testing', function(err) {
                if (err) console.log('Test write failed', err);
            });
            // Extract every matching entry to /tmp.
            async.each(zip.files, function(item, cb1) {
                if (!item.dir && item.name.includes('nightly')) {
                    zip.file(item.name).async("text").then(function(content) {
                        fs.writeFile('/tmp/' + item.name.replace(/^.*[\\\/]/, ''), content, function(err) {
                            if (err) return cb1(err); // pass errors to async.each instead of throwing
                            cb1();
                        });
                    });
                } else {
                    cb1();
                }
            }, function(err) {
                if (err) return console.log('Extraction error', err);
                // Re-zip everything that was written to /tmp.
                var zipObj = new JSZip();
                fs.readdir('/tmp', function(err, files) {
                    console.log(files);
                    async.each(files, function(file, cb2) {
                        fs.readFile('/tmp/' + file, 'utf-8', function(err, content) {
                            if (err) return cb2(err); // propagate the error instead of swallowing it
                            zipObj.file(file, content);
                            cb2();
                        });
                    }, function(err) {
                        if (err) return console.log('Repack error', err);
                        zipObj.generateAsync({
                                type: "nodebuffer"
                            })
                            .then(function(content) {
                                console.log(content);
                                // deleteFiles is a glob-based cleanup helper defined
                                // elsewhere in my code (not shown in this snippet).
                                deleteFiles(['/tmp/*'], function(err, paths) {
                                    console.log('Deleted files/folders:\n', paths.join('\n'));
                                });

                                s3.putObject({
                                    Bucket: 'abtempb',
                                    Key: 'temp/records.zip',
                                    Body: content
                                }, function(err, result) {
                                    if (err) {
                                        console.log('Error ', err);
                                    } else if (result && result.ETag) {
                                        console.log('uploaded file: ', result.ETag);
                                    }
                                });
                            });
                    });
                });
            });
        });
    }
});
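
For illustration, here is a minimal sketch of the streaming idea asked about above. It assumes the third-party unzipper npm package (not used in the code above), and the bucket and key names are placeholders. The archive is read from S3 as a stream and entries are processed one at a time, so neither the whole zip nor its extracted contents have to fit in memory or /tmp.

// Sketch only: stream-unzip straight from S3 with the "unzipper" package.
// Assumptions: AWS SDK v2; bucket and key names are placeholders.
const AWS = require('aws-sdk');
const unzipper = require('unzipper');

const s3 = new AWS.S3();

exports.handler = async () => {
    const zipStream = s3.getObject({
        Bucket: 'source-bucket',       // placeholder
        Key: 'uploads/records.zip'     // placeholder
    }).createReadStream();

    // unzipper.Parse() emits one 'entry' event per file in the archive.
    await zipStream
        .pipe(unzipper.Parse())
        .on('entry', async (entry) => {
            if (entry.type === 'Directory' || !entry.path.includes('nightly')) {
                entry.autodrain();             // skip entries we don't need
                return;
            }
            const content = await entry.buffer(); // one file in memory at a time
            // ...validate `content` and write it to the database here...
        })
        .promise();                            // resolves when parsing finishes
};

Because the parser only moves to the next entry once the current one is consumed, peak memory stays near the size of the largest single file rather than the whole archive.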

Thank You

Answer

You can now mount EFS volumes on Lambda. Details can be found here.
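
As a minimal sketch of what that changes, assuming an EFS access point is attached to the function with the local mount path /mnt/data (the path is configured on the function, not in code): extraction simply targets the mount instead of /tmp, so the 500 MB tmp limit no longer applies.

// Sketch only: writing extracted files to an assumed EFS mount at /mnt/data
// (the mount path is set in the Lambda function configuration).
const fs = require('fs');
const path = require('path');

const EFS_MOUNT = '/mnt/data';   // assumption: access point mounted here

function saveExtractedFile(name, content, done) {
    // Plain fs calls work on the mount; only the target directory changes
    // compared with writing to /tmp.
    fs.writeFile(path.join(EFS_MOUNT, path.basename(name)), content, done);
}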

