Streaming data into Google BigQuery template table with date partition


Question

I am trying to stream data into BigQuery using templateSuffix together with a date partition appended to the table name, through the Java API. However, I am getting the exception below:

com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"location" : "suffix",
"locationType" : "other",
"message" : "Table name should only contain _, a-z, A-Z, or 0-9.",
"reason" : "invalid"
} ],
"message" : "Table name should only contain _, a-z, A-Z, or 0-9."
}

I am calling the API like this:

String tableName = "testTable$201701"; // 201701 is partition_id
TableDataInsertAllRequest request = new TableDataInsertAllRequest()
    .setIgnoreUnknownValues(true)
    .setRows(rows);

// add a template suffix
request.setTemplateSuffix(templateSuffix);

return bigquery
    .tabledata()
    .insertAll(projectId, datasetId, tableName, request)
    .execute();

Using only templateSuffix, or only the date partition on the table name, works fine, but the two together do not. Any ideas on how to resolve this?

Solution

There are two separate use cases: streaming data to daily sharded tables and streaming data to daily partitioned tables.

Daily date-sharded tables are a set of separate tables, one for each day, whereas a day-partitioned table is a single table that is partitioned "internally".

You can stream into either of them.
For streaming to work, the destination table must exist. To avoid creating a new table by hand for each new day, a template table is used (this is not needed if you have a partitioned table).
So either you stream to a specific partition of a day-partitioned table, or you stream to separate daily sharded tables using a template table.
Not both at the same time!
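
The either/or rule above can be sketched as a small helper that decides, for one day, what to pass as the table name and what to pass to setTemplateSuffix. This is only an illustration: the class StreamTarget and its methods are hypothetical names, not part of the BigQuery client, and it assumes the day is written as YYYYMMDD and that a sharded table is named by appending "_" plus the day to the template table name.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not a BigQuery API): picks the insertAll target for one day. */
final class StreamTarget {
    // Characters the error message in the question allows in a table name.
    private static final Pattern PLAIN_NAME = Pattern.compile("[A-Za-z0-9_]+");
    // Assumption for this sketch: days are written as YYYYMMDD.
    private static final Pattern DAY = Pattern.compile("\\d{8}");

    final String tableId;        // pass as the table argument of insertAll(...)
    final String templateSuffix; // pass to setTemplateSuffix(...), or null to leave it unset

    private StreamTarget(String tableId, String templateSuffix) {
        this.tableId = tableId;
        this.templateSuffix = templateSuffix;
    }

    /** Option 1: one day-partitioned table, addressed as table$day, with no template suffix. */
    static StreamTarget partition(String table, String day) {
        check(table, day);
        return new StreamTarget(table + "$" + day, null);
    }

    /** Option 2: daily sharded tables created from a template table, with no "$" in the name. */
    static StreamTarget shard(String templateTable, String day) {
        check(templateTable, day);
        return new StreamTarget(templateTable, "_" + day);
    }

    // Rejects base names that already carry a "$" decorator or other disallowed characters.
    private static void check(String table, String day) {
        if (!PLAIN_NAME.matcher(table).matches()) {
            throw new IllegalArgumentException("Base table name must match [A-Za-z0-9_]+: " + table);
        }
        if (!DAY.matcher(day).matches()) {
            throw new IllegalArgumentException("Day must be YYYYMMDD: " + day);
        }
    }
}
```

With partition("testTable", "20170101") the request is sent to testTable$20170101 and setTemplateSuffix is not called. With shard("testTable", "20170101") the table name stays testTable and the suffix _20170101 is set on the request. Passing a name such as testTable$201701 to shard is rejected, which corresponds to the combination that fails in the question.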
