How to get the first 100 lines of a file on S3?

Question

I have a huge (~6 GB) file on Amazon S3 and want to get the first 100 lines of it without having to download the whole thing. Is this possible?

Here's what I'm doing now:

aws s3 cp s3://foo/bar - | head -n 100

But this takes a while to execute. I'm confused -- shouldn't head close the pipe once it has read enough lines, causing aws s3 cp to crash with a BrokenPipeError before it has time to download the entire file?

Recommended answer

Using the Range HTTP header in a GET request, you can retrieve a specific range of bytes in an object stored in Amazon S3. (see http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html)
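As a rough illustration of what such a request looks like, here is a minimal Python sketch that prepares (but does not send) a GET request carrying a Range header. The helper names range_header and build_ranged_get and the example URL are hypothetical, and the sketch assumes the object is reachable over plain HTTPS (publicly readable or via a presigned URL); an ordinary private object would also need signed authentication headers, which are not shown.

```python
import urllib.request


def range_header(first_byte: int, last_byte: int) -> str:
    """Build an HTTP Range header value; both offsets are inclusive."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("expected 0 <= first_byte <= last_byte")
    return f"bytes={first_byte}-{last_byte}"


def build_ranged_get(url: str, first_byte: int, last_byte: int) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for one byte range of `url`."""
    return urllib.request.Request(
        url,
        headers={"Range": range_header(first_byte, last_byte)},
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical URL: assumes a publicly readable or presigned object.
    req = build_ranged_get("https://example-bucket.s3.amazonaws.com/bar", 0, 65535)
    print(req.get_header("Range"))
    # urllib.request.urlopen(req).read() would actually send the request; a
    # server that honours the header answers 206 Partial Content with only
    # the requested bytes.
```

Byte offsets in a Range header are inclusive, so bytes=0-65535 asks for the first 64 KiB of the object.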

If you use the AWS CLI, you can use aws s3api get-object --range bytes=0-xxx; see http://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html

This selects a range of bytes rather than a number of lines, but it lets you retrieve only part of the file and so avoid downloading the full object.
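One way to bridge the gap between bytes and lines is to request fixed-size byte ranges one after another until enough newlines have arrived. The sketch below is only an illustration of that idea, not part of the original answer: first_lines, s3_fetcher, the 64 KiB chunk size and the fetch_range callback are all assumptions introduced here. The optional s3_fetcher adapter assumes boto3 is installed and credentials are configured, and relies on get_object accepting a Range argument.

```python
from typing import Callable, List


def first_lines(fetch_range: Callable[[int, int], bytes],
                n_lines: int = 100,
                chunk_size: int = 64 * 1024) -> List[bytes]:
    """Return up to `n_lines` lines, fetching only as many chunks as needed.

    `fetch_range(start, end)` must return the bytes at offsets start..end
    (inclusive), or fewer bytes / b"" when the range runs past the end.
    """
    if n_lines <= 0:
        return []
    buf = b""
    offset = 0
    while buf.count(b"\n") < n_lines:
        chunk = fetch_range(offset, offset + chunk_size - 1)
        if not chunk:
            break  # nothing left to read
        buf += chunk
        offset += len(chunk)
    lines = buf.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # a trailing newline does not start another line
    return lines[:n_lines]


def s3_fetcher(bucket: str, key: str) -> Callable[[int, int], bytes]:
    """Hypothetical adapter: ranged reads of one S3 object via boto3."""
    import boto3  # third-party; imported lazily so the demo runs without it
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    def fetch(start: int, end: int) -> bytes:
        try:
            resp = s3.get_object(Bucket=bucket, Key=key,
                                 Range=f"bytes={start}-{end}")
        except ClientError as err:
            # Assumption: a range starting past the end yields InvalidRange.
            if err.response["Error"]["Code"] == "InvalidRange":
                return b""
            raise
        return resp["Body"].read()

    return fetch


if __name__ == "__main__":
    # In-memory stand-in for the remote object, so no network is needed.
    data = b"".join(b"line %d\n" % i for i in range(1000))
    print(first_lines(lambda s, e: data[s:e + 1], 3, chunk_size=16))
```

With a real object, something like first_lines(s3_fetcher("foo", "bar")) would download only as many 64 KiB chunks as it takes to see 100 newlines. The chunk size is a trade-off: larger chunks mean fewer requests but more bytes transferred beyond the 100th line.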
