Read a file line by line from S3 using boto?


Problem description

I have a CSV file in S3 and I'm trying to read the header line to get the size (these files are created by our users, so they could be almost any size). Is there a way to do this using boto? I thought maybe I could use a Python BufferedReader, but I can't figure out how to open a stream from an S3 key. Any suggestions would be great. Thanks!

Solution

It appears that boto has a read() function that can do this. Here's some code that works for me:

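>>> import boto.s3
>>> from boto.s3.key import Key
>>> # connect_s3() takes credentials as positional arguments, so pick the region with connect_to_region()
>>> conn = boto.s3.connect_to_region('ap-southeast-2')
>>> bucket = conn.get_bucket('bucket-name')
>>> k = Key(bucket)
>>> k.key = 'filename.txt'
>>> k.open()
>>> k.read(10)  # read the first 10 bytes of the object
'This text '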

The call to read(n) returns the next n bytes from the object.

Of course, this won't automatically return "the header line", but you could call it with a large enough number to return the header line at a minimum.
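
One way to avoid guessing a "large enough number" is to keep calling read(n) in small chunks until a newline shows up. The sketch below is only an illustration: read_header_line, chunk_size and max_bytes are hypothetical names, not part of boto. It assumes the object passed in has a read(n) method that returns bytes and returns an empty value at the end of the data, as an ordinary file-like object does.

```python
import io


def read_header_line(stream, chunk_size=1024, max_bytes=1024 * 1024):
    """Return the first line of `stream` as bytes, without its line ending.

    `stream` is any object with a read(n) method that returns bytes and
    returns an empty value at end of data (assumption, not a boto guarantee).
    Reading stops at the first newline, at end of data, or once roughly
    `max_bytes` have been buffered; in the last two cases the buffered
    data is returned as-is.
    """
    buf = b""
    while len(buf) < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of data reached before any newline was seen.
            break
        buf += chunk
        newline_at = buf.find(b"\n")
        if newline_at != -1:
            # Drop the newline, and a preceding carriage return if present.
            return buf[:newline_at].rstrip(b"\r")
    return buf.rstrip(b"\r")


# Stand-in for an opened S3 key: any file-like object works for a local check.
sample = io.BytesIO(b"id,name,size\r\n1,report.csv,2048\n")
header = read_header_line(sample, chunk_size=4)
print(header.decode("utf-8"))
```

With the session above, the opened key k would be passed in place of sample, for example read_header_line(k). Only as many chunks as are needed to reach the first newline are requested, so a very large file is not downloaded in full.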
