Write to a specific folder in S3 bucket using AWS Kinesis Firehose


Problem description

I would like to route data sent to Kinesis Firehose based on the content inside the data. For example, if I sent this JSON data:

{
   "name": "John",
   "id": 345
}

I would like to filter the data based on id and send it to a subfolder of my S3 bucket, like S3://myS3Bucket/345_2018_03_05. Is this at all possible with Kinesis Firehose or AWS Lambda?

The only way I can think of right now is to create a Kinesis stream for every one of my possible IDs, point them all at the same bucket, and then send my events to those streams from my application, but I would like to avoid that since there are many possible IDs.

Recommended answer

You probably want to use an S3 event notification that fires each time Firehose places a new file in your S3 bucket (a PUT). The event notification should invoke a custom Lambda function you write that reads the contents of the S3 file, splits out the individual records, and writes them to the appropriate per-ID prefixes, keeping in mind that each S3 file is likely to contain many records, not just one.

https://aws.amazon.com/blogs/aws/s3-event-notification/
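
Below is a minimal sketch of such a Lambda handler, assuming Firehose delivers newline-delimited JSON records (one record per line); the myS3Bucket name comes from the question, and the id_YYYY_MM_DD prefix layout is inferred from the 345_2018_03_05 example:

import json
import urllib.parse
from datetime import datetime, timezone

import boto3  # included in the AWS Lambda Python runtime

s3 = boto3.client("s3")

# Destination bucket taken from the question; adjust to your setup.
DEST_BUCKET = "myS3Bucket"

def lambda_handler(event, context):
    # Invoked by an S3 event notification for each file Firehose delivers.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

        # A Firehose file usually holds many records; this assumes one
        # JSON record per line.
        groups = {}
        for line in filter(None, body.splitlines()):
            item = json.loads(line)
            groups.setdefault(item["id"], []).append(line)

        # Write each group under an id_YYYY_MM_DD prefix, e.g.
        # s3://myS3Bucket/345_2018_03_05/<original file name>.
        date_part = datetime.now(timezone.utc).strftime("%Y_%m_%d")
        for item_id, lines in groups.items():
            dest_key = f"{item_id}_{date_part}/{key.rsplit('/', 1)[-1]}"
            s3.put_object(
                Bucket=DEST_BUCKET,
                Key=dest_key,
                Body=("\n".join(lines) + "\n").encode("utf-8"),
            )

The function's role needs s3:GetObject on the Firehose delivery bucket and s3:PutObject on the destination. If Firehose delivers into the same bucket the function writes to, scope the event notification to the Firehose delivery prefix (or use a separate destination bucket) so the function does not retrigger on its own output.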
