Stream BigQuery table into Google Pub/Sub


Problem description

I have a Google BigQuery table and I want to stream the entire table into a Pub/Sub topic.

What would be the easiest/fastest way to do it?

Thanks in advance.

Recommended answer

That really depends on the size of the table.

If it's a small table (a few thousand records, a couple dozen columns), then you could set up a process to query the entire table, convert the response into a JSON array, and push it to Pub/Sub.
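For the small-table case, a minimal sketch using the `google-cloud-bigquery` and `google-cloud-pubsub` client libraries could look like this; the project, dataset, table, and topic names are placeholders you would substitute:

```python
import json


def row_to_message(row: dict) -> bytes:
    """Serialize one table row as a UTF-8 JSON payload for Pub/Sub."""
    return json.dumps(row, default=str).encode("utf-8")


def stream_table(project: str, dataset: str, table: str, topic: str) -> None:
    """Query a (small) table in full and publish each row as one message."""
    # Client libraries: pip install google-cloud-bigquery google-cloud-pubsub
    from google.cloud import bigquery, pubsub_v1

    bq = bigquery.Client(project=project)
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project, topic)

    rows = bq.query(f"SELECT * FROM `{project}.{dataset}.{table}`").result()
    futures = [publisher.publish(topic_path, data=row_to_message(dict(row)))
               for row in rows]
    for future in futures:
        future.result()  # block until every message has been accepted
```

Publishing one message per row keeps consumers simple; for very chatty tables you may want to batch rows into a single message instead.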

If it's a big table (millions/billions of records, hundreds of columns), you'd have to export it to a file first, and then prepare/ship it to Pub/Sub.
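The export step for the big-table case can be done with a BigQuery extract job to Cloud Storage. A sketch, assuming placeholder bucket and file-prefix names, writing sharded newline-delimited JSON:

```python
def shard_uri(bucket: str, prefix: str) -> str:
    """GCS wildcard URI so a large table can be split across many files."""
    return f"gs://{bucket}/{prefix}-*.json"


def export_table(project: str, dataset: str, table: str,
                 bucket: str, prefix: str) -> str:
    """Export a BigQuery table to GCS as newline-delimited JSON shards."""
    # Client library: pip install google-cloud-bigquery
    from google.cloud import bigquery

    bq = bigquery.Client(project=project)
    destination = shard_uri(bucket, prefix)
    job_config = bigquery.job.ExtractJobConfig(
        destination_format="NEWLINE_DELIMITED_JSON")
    bq.extract_table(f"{project}.{dataset}.{table}", destination,
                     job_config=job_config).result()  # wait for the job
    return destination
```

The wildcard in the destination URI is required once the export exceeds the single-file size limit, since BigQuery then writes multiple shards.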

It also depends on your partitioning policy - if your tables are set up to partition by date, you might be able to, again, query instead of export.

Last but not least, it also depends on the frequency - is this a one-time deal (then export) or a continuous process (then use table decorators to query only the latest data)?

Some more information is needed if you want a truly helpful answer.

Edit

Based on your comments about the size of the table, I think the best way would be to have a script that would:

  1. Export the table to GCS as newline-delimited JSON

  2. Process the file (read it line by line) and send each line to Pub/Sub

There are client libraries for most programming languages. I've done similar things with Python, and it's fairly straightforward.
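Step 2 above (read the exported file line by line and publish) could be sketched as follows; the `publish` callable is injected so the loop itself stays testable, and the file path, project, and topic names are placeholders:

```python
from typing import Callable, Iterable


def publish_lines(lines: Iterable[str],
                  publish: Callable[[bytes], object]) -> int:
    """Send each non-empty NDJSON line as one message; return the count."""
    count = 0
    for line in lines:
        line = line.strip()
        if line:
            publish(line.encode("utf-8"))
            count += 1
    return count


def process_export(path: str, project: str, topic: str) -> int:
    """Read a downloaded export file and publish every row to a topic."""
    # Client library: pip install google-cloud-pubsub
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project, topic)
    with open(path, encoding="utf-8") as handle:
        return publish_lines(
            handle,
            lambda data: publisher.publish(topic_path, data=data).result())
```

Because the export is already newline-delimited JSON, each line is a complete row and can be forwarded to Pub/Sub as-is, with no re-parsing needed.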

