Max file count using BigQuery data transfer job


Problem description

I have about 54,000 files in my GCP bucket. When I try to schedule a BigQuery data transfer job to move the files from the GCP bucket to BigQuery, I get the following error:

Error code 9 : Transfer Run limits exceeded. Max size: 15.00 TB. Max file count: 10000. Found: size = 267065994 B (0.00 TB) ; file count = 54824.

I thought the max file count was 10 million.

Recommended answer

I think the BigQuery transfer service lists all the files matching the wildcard and then uses that list to load them. So it is the same as providing the full list to bq load ..., and therefore hits the 10,000-URI limit. This is probably necessary because the BigQuery transfer service skips already-loaded files, so it needs to look at them one by one to decide which to actually load.

I think your only option is to schedule a job yourself and load the files directly into BigQuery, for example using Cloud Composer or writing a small Cloud Run service that can be invoked by Cloud Scheduler; a sketch of the load step follows below.
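
As an illustration, here is a minimal sketch of that load step using the google-cloud-bigquery Python client. A single wildcard URI counts as one source URI in a load job, which sidesteps the 10,000-URI list the transfer service builds. The project, bucket, table name, and CSV source format below are hypothetical placeholders, not taken from the question.

```python
# Minimal sketch: load all files matching a wildcard from GCS straight into
# BigQuery, bypassing the transfer service. Names below are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

table_id = "my-project.my_dataset.my_table"  # hypothetical target table
uri = "gs://my-bucket/data/*.csv"            # wildcard counts as one URI

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # assumes CSV source files
    skip_leading_rows=1,                      # assumes a header row
    autodetect=True,                          # let BigQuery infer the schema
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # wait for the load job to finish

print(f"Loaded {client.get_table(table_id).num_rows} rows into {table_id}")
```

A script like this can be wrapped in a Cloud Run service or a Cloud Composer task and triggered on a schedule; note, though, that unlike the transfer service it does not track which files were already loaded, so you would need to handle deduplication yourself (for example by moving processed files to another prefix).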
