Google Cloud BigQuery Import not working in app engine project
Problem description
I have used the following code to build an App Engine project that moves data from a Google Cloud Storage bucket into a BigQuery table:
import argparse
import time
import uuid

from google.cloud import bigquery


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    # Start the load job from the Cloud Storage URI and wait for it to finish.
    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    job.begin()

    wait_for_job(job)

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))


def wait_for_job(job):
    # Poll the job until it completes, surfacing any error it reports.
    while True:
        job.reload()
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.error_result)
            return
        time.sleep(1)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('dataset_name')
    parser.add_argument('table_name')
    parser.add_argument(
        'source', help='The Google Cloud Storage object to load. Must be in '
        'the format gs://bucket_name/object_name')

    args = parser.parse_args()

    load_data_from_gcs(
        args.dataset_name,
        args.table_name,
        args.source)
I have also altered the default app.yaml file to go with the file above, deleting the webapp2 library entry; my app.yaml now looks like this:
application: gcstobq
version: 1
runtime: python27
api_version: 1
threadsafe: yes

handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico

- url: .*
  script: main.app
As I am new to Python and App Engine, I don't know whether I need to list the libraries imported in main.py in app.yaml, or whether I need to run this app using the command-line tool. Please let me know if I am missing something here.
Solution: Google Cloud uses the new Python namespace-package format (if you look at the source you'll notice that there's no __init__.py in the directory structure). This was changed in Python 3.3 with PEP-420.
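To see what a PEP-420 namespace package looks like in practice, here is a small self-contained sketch (the nspkg/sub names are made up for illustration; run it under Python 3.3+): a directory tree containing no __init__.py files at all is still importable.

```python
import os
import sys
import tempfile

# Build a package tree on disk that contains NO __init__.py files.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'nspkg', 'sub'))
with open(os.path.join(root, 'nspkg', 'sub', 'mod.py'), 'w') as f:
    f.write('VALUE = 42\n')

# Under PEP-420, Python 3.3+ treats the bare directories as namespace
# packages, so the import succeeds anyway.
sys.path.insert(0, root)
from nspkg.sub import mod

print(mod.VALUE)  # 42
```

Under Python 2.7 the same import would fail with an ImportError, which is exactly the gap the workaround below papers over.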
Fortunately in Python 2.7 you can fix this easily by avoiding implicit imports. Just add this to the very top of your file (before any other imports) to get the Python 3 behavior:
from __future__ import absolute_import
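One detail worth noting about that placement: the compiler enforces it. A __future__ import that appears after any other statement is rejected outright, as this small sketch (illustrative module strings, runnable on any modern Python) shows:

```python
# A __future__ import must precede every other statement in a module;
# only a docstring or comments may come before it.
good = "from __future__ import absolute_import\nimport json\n"
bad = "import json\nfrom __future__ import absolute_import\n"

compile(good, '<good>', 'exec')  # compiles cleanly

try:
    compile(bad, '<bad>', 'exec')
    bad_compiled = True
except SyntaxError:
    bad_compiled = False

print(bad_compiled)  # False: the misplaced __future__ import is a SyntaxError
```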