How do we set maximum_bad_records when loading a BigQuery table from Dataflow?


Problem description


Is there a way to set the maximum number of bad records when writing to BigqueryIO? It seems to keep the default at 0.

Solution

At this time, unfortunately, we don't provide a way to directly set the value of configuration.load.maxBadRecords in relation to BigQueryIO in Cloud Dataflow.

As a workaround, you should be able to apply a custom ParDo transform that filters "bad records" before they are passed to BigQueryIO.Write. As a result, BigQuery shouldn't get any "bad records". Hopefully, this helps.
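As a rough illustration of that workaround, here is a minimal sketch using the Apache Beam Java SDK (assumed here; the classic Cloud Dataflow SDK has an equivalent DoFn/ParDo API). The `DropBadRecordsFn` name and the `required_field` validity check are hypothetical placeholders — substitute whatever makes a record "bad" for your schema.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class FilterBadRecords {

  /** Emits only rows that pass the validity check; invalid rows are dropped. */
  static class DropBadRecordsFn extends DoFn<TableRow, TableRow> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      TableRow row = c.element();
      if (isValid(row)) {
        c.output(row);
      }
      // Rows that fail the check are simply not emitted,
      // so BigQueryIO.Write never sees them.
    }

    // Hypothetical check: adapt to whatever makes a record "bad" in your data.
    private static boolean isValid(TableRow row) {
      return row != null && row.get("required_field") != null;
    }
  }

  /** Filters out bad records, then writes the remaining rows to BigQuery. */
  public static PCollection<TableRow> filterAndWrite(
      PCollection<TableRow> rows, String tableSpec) {
    PCollection<TableRow> goodRows =
        rows.apply("DropBadRecords", ParDo.of(new DropBadRecordsFn()));
    goodRows.apply(
        "WriteToBigQuery",
        BigQueryIO.writeTableRows()
            .to(tableSpec)
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
    return goodRows;
  }
}
```

A common variation on this pattern is to route the rejected rows to a second output (a side/dead-letter collection) instead of dropping them silently, so they can be inspected or written to a separate table for later repair.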

If the ability to control configuration.load.maxBadRecords is important to you, you are welcome to file a feature request in the issue tracker of our GitHub repository.

