How to filter S3 objects by last modified date with Boto3
Question
Is there a way to filter S3 objects by last modified date in boto3? I've built a large text file listing all the contents of a bucket. Some time has passed, and I'd like to list only the objects that were added since the last time I looped through the entire bucket.
I know I can use the Marker property to start from a certain object name, so I could give it the last object I processed in the text file, but that does not guarantee that no new object was added before that object name. For example, if the last file in the text file was oak.txt and a new file called apple.txt was added, the listing would not pick it up.
import boto3

s3_resource = boto3.resource('s3')
client = boto3.client('s3')

def list_rasters(bucket):
    # Print the key and last-modified timestamp of every object under the prefix
    bucket = s3_resource.Bucket(bucket)
    for bucket_obj in bucket.objects.filter(Prefix="testing_folder/"):
        print(bucket_obj.key)
        print(bucket_obj.last_modified)
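The Marker limitation described in the question can be illustrated without touching S3 at all. The sketch below assumes only that a listing returns keys in sorted order and starts strictly after the marker; `list_after_marker` and the sample key names other than oak.txt and apple.txt are hypothetical stand-ins, not boto3 calls.

```python
def list_after_marker(keys, marker):
    """Mimic a key-ordered listing that starts strictly after `marker`."""
    return [k for k in sorted(keys) if k > marker]


# Keys already recorded in the text file; oak.txt was the last one processed.
existing = ["maple.txt", "oak.txt"]
last_processed = "oak.txt"

# Two objects are uploaded afterwards: one sorts before the marker, one after.
current = existing + ["apple.txt", "pine.txt"]

picked_up = list_after_marker(current, last_processed)
print(picked_up)  # ['pine.txt']
```

apple.txt is newer than oak.txt but sorts before it, so a marker-based listing skips it. This is why a comparison on the last-modified timestamp is needed instead.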
Recommended answer
The following code snippet gets all objects under a specific folder (key prefix) and prints those whose last-modified time is later than the date you specify:
Replace YEAR, MONTH, DAY with your values.
import boto3
import datetime

# Bucket name
bucket_name = 'BUCKET NAME'
# Folder name (used as the key prefix)
folder_name = 'FOLDER NAME'
# Bucket resource
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

def lambda_handler(event, context):
    for file in bucket.objects.filter(Prefix=folder_name):
        # Compare dates (both sides made timezone-naive before comparing)
        if file.last_modified.replace(tzinfo=None) > datetime.datetime(YEAR, MONTH, DAY, tzinfo=None):
            # Print results
            print('File Name: %s ---- Date: %s' % (file.key, file.last_modified))
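A variant of the same idea, as a minimal sketch: the date comparison is pulled into a small function so it can be tried on sample data, and the listing uses the client's `list_objects_v2` paginator. The helper names `modified_since` and `iter_objects`, the sample keys, and the dates are assumptions made for illustration. As far as I know the listing call has no server-side date filter, so the comparison still happens client-side after every key has been listed.

```python
import datetime


def modified_since(objects, cutoff):
    """Yield keys of objects whose LastModified is strictly after `cutoff`.

    `objects` is an iterable of dicts shaped like the entries in the
    "Contents" list of a list_objects_v2 response.
    """
    for obj in objects:
        if obj["LastModified"] > cutoff:
            yield obj["Key"]


def iter_objects(bucket_name, prefix=""):
    """Yield every object entry under `prefix`, following pagination."""
    import boto3  # imported lazily so modified_since() works without AWS access

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is absent when a page has no matching keys
        yield from page.get("Contents", [])


# Hypothetical sample data standing in for a real listing.
utc = datetime.timezone.utc
sample = [
    {"Key": "testing_folder/oak.txt",
     "LastModified": datetime.datetime(2020, 1, 1, tzinfo=utc)},
    {"Key": "testing_folder/apple.txt",
     "LastModified": datetime.datetime(2020, 3, 1, tzinfo=utc)},
]
cutoff = datetime.datetime(2020, 2, 1, tzinfo=utc)

new_keys = list(modified_since(sample, cutoff))
print(new_keys)  # ['testing_folder/apple.txt']
```

Against a real bucket, the call would look like `modified_since(iter_objects('BUCKET NAME', 'FOLDER NAME'), cutoff)`. The cutoff is timezone-aware here because boto3 returns timezone-aware timestamps, which avoids stripping `tzinfo` on every object.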