How to use BigQuery streaming insertAll on App Engine & Python


Question

I would like to develop an App Engine application that streams data directly into a BigQuery table.

According to Google's documentation, there is a simple way to stream data into BigQuery:

https://developers.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples (note: on that page, select the Python tab rather than Java)

Here is the sample snippet showing how a streaming insert should be coded:

body = {"rows":[
{"json": {"column_name":7.7,}}
]}

response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

Although I've downloaded the client API, I couldn't find any reference to the "bigquery" module/object used in Google's example above.

Where is the bigquery object (from the snippet) supposed to come from?

Can anyone show a more complete way to use this snippet (with the right imports)?

I've been searching for this a lot and found the documentation confusing and incomplete.

Recommended answer

A minimal working example (as long as you fill in the right IDs for your project):

import httplib2
from apiclient import discovery
from oauth2client import appengine

_SCOPE = 'https://www.googleapis.com/auth/bigquery'

# Change the following 3 values:
PROJECT_ID = 'your_project'
DATASET_ID = 'your_dataset'
TABLE_ID = 'TestTable'


body = {"rows":[
    {"json": {"Col1":7,}}
]}

# Authenticate as the App Engine application's service account.
credentials = appengine.AppAssertionCredentials(scope=_SCOPE)
http = credentials.authorize(httplib2.Http())

# Build the BigQuery v2 API client; this is the "bigquery" object from the question.
bigquery = discovery.build('bigquery', 'v2', http=http)
response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

print(response)
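
One thing the example does not show: insertAll can succeed at the HTTP level while individual rows are rejected, and those per-row failures are reported in an insertErrors list inside the response. Here is a minimal sketch of checking for that, assuming the response is the plain dict returned by .execute(). The helper name collect_insert_errors and the sample response are hypothetical and only there for illustration.

```python
def collect_insert_errors(response):
    """Return (row_index, reason, message) tuples for every rejected row."""
    failures = []
    # "insertErrors" is absent when every row was accepted.
    for entry in response.get("insertErrors", []):
        row_index = entry.get("index")
        for err in entry.get("errors", []):
            failures.append((row_index, err.get("reason"), err.get("message")))
    return failures


# Hypothetical response shaped like an insertAll reply; not real output.
sample_response = {
    "kind": "bigquery#tableDataInsertAllResponse",
    "insertErrors": [
        {"index": 0, "errors": [{"reason": "invalid", "message": "no such field"}]}
    ],
}

failed = collect_insert_errors(sample_response)
if failed:
    print("rows rejected: %s" % (failed,))
```

In the example above you would call collect_insert_errors(response) right after .execute() and log or retry the affected rows.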

As Jordan says: "Note that this uses the appengine robot to authenticate with BigQuery, so you'll need to add the robot account to the ACL of the dataset. Note that if you also want to use the robot to run queries, not just stream, you need the robot to be a member of the project 'team' so that it is authorized to run jobs."
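
If you stream more than a single hard-coded row, it may help to build the request body from a list of records. The sketch below also sets the optional insertId field on each row, which BigQuery uses for best-effort de-duplication of retried inserts. The helper name build_insert_body and its make_id parameter are hypothetical; the column name Col1 is taken from the example table above.

```python
import uuid


def build_insert_body(records, make_id=None):
    """Wrap plain dicts in the {"rows": [{"insertId": ..., "json": ...}]} shape."""
    # Default to a random id per row; pass make_id to supply your own keys.
    make_id = make_id or (lambda: uuid.uuid4().hex)
    return {
        "rows": [
            {"insertId": make_id(), "json": dict(record)}
            for record in records
        ]
    }


# Build the body once; reuse the same body (and ids) if the request is retried.
body = build_insert_body([{"Col1": 7}, {"Col1": 8}])
```

The resulting body can be passed as the body argument of insertAll. De-duplication only helps if a retry sends the same insertId values, so generate the body once rather than on every attempt.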
