Data is written to BigQuery but not in the proper format

Question

I'm writing data to BigQuery and it is written successfully, but I'm concerned about the format in which it is written.

Below is the format in which the data is shown when I run any query in BigQuery:

Check the first row: the value of SalesComponent is CPS_H, but it shows 'BeamRecord [dataValues=[CPS_H', and in ModelIteration the value ends with a square bracket.

Below is the code used to push data from BeamSql to BigQuery:

TableSchema tableSchema = new TableSchema().setFields(ImmutableList.of(
    new TableFieldSchema().setName("SalesComponent").setType("STRING").setMode("REQUIRED"),
    new TableFieldSchema().setName("DuetoValue").setType("STRING").setMode("REQUIRED"),
    new TableFieldSchema().setName("ModelIteration").setType("STRING").setMode("REQUIRED")
));

TableReference tableSpec = BigQueryHelpers.parseTableSpec("beta-194409:data_id1.tables_test");
System.out.println("Start Bigquery");
final_out.apply(MapElements.into(TypeDescriptor.of(TableRow.class)).via(
        (MyOutputClass elem) -> new TableRow()
            .set("SalesComponent", elem.SalesComponent)
            .set("DuetoValue", elem.DuetoValue)
            .set("ModelIteration", elem.ModelIteration)))
    .apply(BigQueryIO.writeTableRows()
        .to(tableSpec)
        .withSchema(tableSchema)
        .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(WriteDisposition.WRITE_TRUNCATE));

p.run().waitUntilFinish();

Edit

I have converted the BeamRecord into MyOutputClass using the code below, and this doesn't work either:

PCollection<MyOutputClass> final_out = join_query.apply(ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
    private static final long serialVersionUID = 1L;

    @ProcessElement
    public void processElement(ProcessContext c) {
        BeamRecord record = c.element();
        String[] strArr = record.toString().split(",");
        MyOutputClass moc = new MyOutputClass();
        moc.setSalesComponent(strArr[0]);
        moc.setDuetoValue(strArr[1]);
        moc.setModelIteration(strArr[2]);
        c.output(moc);
    }
}));
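
The conversion above calls record.toString() and splits the result on commas, so any wrapper text that toString() adds stays attached to the first and last values. Here is a minimal sketch of that effect in plain Java, with no Beam dependency. The sample string is an assumption modeled on the prefix quoted in the question: only CPS_H comes from the question, while 0.42, 7, and the trailing brackets are made up. ToStringSplitDemo and naiveSplit are hypothetical names.

```java
import java.util.Arrays;

class ToStringSplitDemo {
    // Splits the text form of a record on commas, the same way the DoFn above does.
    static String[] naiveSplit(String recordText) {
        return recordText.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical sample; the exact text produced by toString() is an assumption.
        String recordText = "BeamRecord [dataValues=[CPS_H, 0.42, 7]]";
        String[] parts = naiveSplit(recordText);
        // The first token keeps the wrapper prefix and the last keeps the brackets.
        System.out.println(Arrays.toString(parts));
    }
}
```

Under this assumption the first token still carries the "BeamRecord [dataValues=[" prefix and the last one still carries the closing brackets, which matches the symptom described in the question.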

Recommended answer

I was able to resolve this issue using the approach below:

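PCollection<MyOutputClass> final_out = record40.apply(ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
    private static final long serialVersionUID = 1L;
    @ProcessElement
    public void processElement(ProcessContext c) throws ParseException {
        BeamRecord record = c.element();
        String strArr = record.toString();
        // Drop the 24-character "BeamRecord [dataValues=[" prefix, then the closing brackets.
        String strArr1 = strArr.substring(24);
        String xyz = strArr1.replace("]", "");
        String[] strArr2 = xyz.split(",");
        // The lines below are not shown in the source; assumed to mirror the DoFn in the question.
        MyOutputClass moc = new MyOutputClass();
        moc.setSalesComponent(strArr2[0]);
        moc.setDuetoValue(strArr2[1]);
        moc.setModelIteration(strArr2[2]);
        c.output(moc);
    }
}));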
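
The hard-coded substring(24) depends on the prefix being exactly 24 characters long, and the values after the first keep the space that follows each comma. Here is a slightly more defensive sketch of the same string-based idea in plain Java. It locates the "[dataValues=[" marker instead of counting characters, and it trims each value. RecordTextParser and parseValues are hypothetical names, and the sample string is an assumption modeled on the prefix quoted in the question.

```java
import java.util.ArrayList;
import java.util.List;

class RecordTextParser {
    private static final String MARKER = "[dataValues=[";

    // Returns the trimmed values found between "[dataValues=[" and the next "]".
    static List<String> parseValues(String recordText) {
        int start = recordText.indexOf(MARKER);
        if (start < 0) {
            throw new IllegalArgumentException("Marker not found in: " + recordText);
        }
        start += MARKER.length();
        int end = recordText.indexOf(']', start);
        if (end < 0) {
            throw new IllegalArgumentException("Closing bracket not found in: " + recordText);
        }
        List<String> values = new ArrayList<>();
        for (String token : recordText.substring(start, end).split(",")) {
            values.add(token.trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical sample; only CPS_H comes from the question.
        List<String> values = parseValues("BeamRecord [dataValues=[CPS_H, 0.42, 7]]");
        System.out.println(values);
    }
}
```

This still breaks if a value itself contains a comma or a square bracket. If the BeamRecord class in your Beam version exposes accessors for individual fields, reading the values through them would avoid string parsing altogether; check the Javadoc for your version.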
