Can I upload files in bzip2 to storage and then use them in BigQuery?


Question

I have a bunch of largish files (10GB each) in bz2 format. I would like to upload them and then run some queries on them. Does BigQuery "understand" bzip2 as it does gzip? Should I convert them? What would be the best way to upload them?

Recommended answer

I assume the files are in CSV or JSON format. Per the BigQuery documentation (https://cloud.google.com/bigquery/preparing-data-for-loading), only gzip compression is supported. But even if bz2 were supported, working with 10GB compressed files would not be a good idea. Unlike uncompressed files, compressed files cannot be split into pieces by BigQuery, so it would have to process each 10GB file as a whole, which would be very slow.
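One way to act on this is to convert each bz2 file locally into several smaller gzip files before uploading. The following is a minimal sketch using only the Python standard library. It is not part of the original answer: the function name `split_bz2_to_gzip`, the `part-NNNNN.gz` naming scheme, and the 1 GiB default part size are all assumptions chosen for illustration. It also assumes one record per line (newline-delimited JSON, or CSV without a header row and without quoted line breaks), because it splits only at line boundaries.

```python
import bz2
import gzip
import os


def split_bz2_to_gzip(src_path, out_dir, max_bytes_per_part=1 << 30):
    """Stream a .bz2 file and rewrite it as several smaller .gz parts.

    Hypothetical helper: splits only at line boundaries, so it assumes
    one record per line. max_bytes_per_part counts uncompressed bytes.
    Returns the list of part paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    out = None
    written = 0
    try:
        # bz2.open decompresses lazily, so the 10GB input is never
        # held in memory all at once.
        with bz2.open(src_path, "rb") as src:
            for line in src:
                # Start a new part for the first line, or once the
                # current part has reached the size limit.
                if out is None or written >= max_bytes_per_part:
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        out_dir, "part-%05d.gz" % len(part_paths)
                    )
                    out = gzip.open(path, "wb")
                    part_paths.append(path)
                    written = 0
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return part_paths
```

The resulting parts can then be uploaded to Cloud Storage and loaded as gzip files. If load speed matters more than upload size, the same loop could write plain uncompressed files instead, which, per the answer above, BigQuery is able to split.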
