Streaming buffer - Google BigQuery
Question
I'm developing a Python program to be used as a Google Dataflow template.
What I'm doing is writing data from Pub/Sub into BigQuery:
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

pipeline_options.view_as(StandardOptions).streaming = True
p = beam.Pipeline(options=pipeline_options)
(p
 # This is the source of the pipeline.
 | 'Read from PubSub' >> beam.io.ReadFromPubSub('projects/.../topics/...')
 # <Transformation code if needed>
 # Destination
 | 'String To BigQuery Row' >> beam.Map(lambda s: dict(Trama=s))
 | 'Write to BigQuery' >> beam.io.Write(
     beam.io.BigQuerySink(
         known_args.output,
         schema='Trama:STRING',
         create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND
     ))
)
p.run().wait_until_finish()
The code is running locally, not in Google Dataflow yet.
This "works", but not the way I want: the data is currently stored in the BigQuery streaming buffer and I cannot see it (even after waiting some time).
When will the data be available in BigQuery? Why is it stored in the streaming buffer instead of the "normal" table?
Answer
This was the problem:
beam.io.Write(beam.io.BigQuerySink
It should be:
beam.io.WriteToBigQuery
The first worked well while I was reading from a file; the second works while I read from Pub/Sub.
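As a sketch of the fix (reusing the same `known_args.output` table and `Trama:STRING` schema from the question; the pipeline still needs a GCP project and Pub/Sub topic to actually run), the write step would become:

```python
import apache_beam as beam

# WriteToBigQuery is a streaming-capable transform; the batch-oriented
# BigQuerySink wrapped in beam.io.Write does not fit an unbounded
# Pub/Sub source. Table, schema, and dispositions match the question.
(p
 | 'Read from PubSub' >> beam.io.ReadFromPubSub('projects/.../topics/...')
 | 'String To BigQuery Row' >> beam.Map(lambda s: dict(Trama=s))
 | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
     known_args.output,
     schema='Trama:STRING',
     create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
     write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))
```

With `WriteToBigQuery`, rows from an unbounded source are sent via streaming inserts, so they pass through the streaming buffer but become queryable in the destination table.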