How do we set maximum_bad_records when loading a Bigquery table from dataflow?
Question
Is there a way to set the maximum number of bad records when writing with BigQueryIO? It seems to keep the default of 0.
At this time, unfortunately, we don't provide a way to directly set the value of configuration.load.maxBadRecords for BigQueryIO in Cloud Dataflow.
As a workaround, you should be able to apply a custom ParDo transform that filters "bad records" before they are passed to BigQueryIO.Write. As a result, BigQuery shouldn't receive any "bad records". Hopefully, this helps.
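To illustrate the workaround, here is a minimal sketch of the validation logic such a filtering ParDo might apply. The schema (a `name` string and a non-negative `count` integer) is purely hypothetical, and the predicate is shown as a plain Python function so it stands alone; in a real pipeline you would wrap it in a ParDo/Filter step ahead of the BigQuery write.

```python
def is_valid_record(record):
    """Return True if the record matches the (hypothetical) table schema:
    'name' must be a string and 'count' a non-negative integer.
    Adapt these checks to your own BigQuery schema.
    """
    if not isinstance(record, dict):
        return False
    if not isinstance(record.get("name"), str):
        return False
    count = record.get("count")
    return isinstance(count, int) and count >= 0


# Records that fail validation are dropped before the BigQuery write,
# so BigQuery itself never sees a bad row.
records = [
    {"name": "a", "count": 1},    # valid
    {"name": "b", "count": -5},   # bad: negative count
    {"count": 2},                 # bad: missing name
]
good_records = [r for r in records if is_valid_record(r)]
```

In a Beam/Dataflow pipeline the same predicate would drive the filtering step (e.g. a ParDo that only emits records for which `is_valid_record` returns True), placed immediately before BigQueryIO.Write.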
If the ability to control configuration.load.maxBadRecords is important to you, you are welcome to file a feature request in the issue tracker of our GitHub repository.