AssertionError: INTERNAL: No default project is specified
Problem description
New to Airflow. I'm trying to run a SQL query and store the result in a BigQuery table.
I'm getting the following error and am not sure where to set the default project ID (default_project_id).
Please help.
Error:
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 28, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 585, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/operators/bigquery_operator.py", line 82, in execute
    self.allow_large_results, self.udf_config, self.use_legacy_sql)
  File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/bigquery_hook.py", line 228, in run_query
    default_project_id=self.project_id)
  File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/bigquery_hook.py", line 917, in _split_tablename
    assert default_project_id is not None, "INTERNAL: No default project is specified"
AssertionError: INTERNAL: No default project is specified
Code:
sql_bigquery = BigQueryOperator(
    task_id='sql_bigquery',
    use_legacy_sql=False,
    write_disposition='WRITE_TRUNCATE',
    allow_large_results=True,
    bql='''
    #standardSQL
    SELECT ID, Name, Group, Mark, RATIO_TO_REPORT(Mark) OVER(PARTITION BY Group) AS percent FROM `tensile-site-168620.temp.marks`
    ''',
    destination_dataset_table='temp.percentage',
    dag=dag
)
Recommended answer
EDIT: I finally fixed this problem by simply adding the bigquery_conn_id='bigquery' parameter to the BigQueryOperator task, after running the code below in a separate Python script.
Apparently you need to specify your project ID under Admin -> Connections in the Airflow UI. You must do this as a JSON object such as "project" : "".
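If it helps, here is a minimal sketch of producing that JSON with the standard library instead of typing it by hand. It assumes the extras key is extra__google_cloud_platform__project, which is the key used in the connection script in this answer; the project ID "my-gcp-project" is a made-up placeholder, not a real project.

```python
import json

# Placeholder value (assumption): substitute your own GCP project ID.
project_id = "my-gcp-project"

# Key name taken from the connection script in this answer.
extra = json.dumps({"extra__google_cloud_platform__project": project_id})

# Print the string so it can be pasted into the connection's Extra field.
print(extra)
```

Generating the string with json.dumps avoids quoting mistakes that are easy to make when writing JSON inside a Python string literal.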
Personally, I can't get the webserver working on GCP, so that route isn't feasible for me. Here is a programmatic solution:
from airflow.models import Connection
from airflow.settings import Session

session = Session()
gcp_conn = Connection(
    conn_id='bigquery',
    conn_type='google_cloud_platform',
    extra='{"extra__google_cloud_platform__project":"<YOUR PROJECT HERE>"}')
if not session.query(Connection).filter(
        Connection.conn_id == gcp_conn.conn_id).first():
    session.add(gcp_conn)
    session.commit()
These suggestions come from a similar question.
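For context on why the connection's project matters: judging from the traceback, the hook is asked to parse 'temp.percentage', which names a dataset and table but no project, and it is handed the connection's project as the fallback. The sketch below is a simplified, hypothetical stand-in for that check, not the actual Airflow source. The function name resolve_table and the project ID "my-gcp-project" are invented for illustration.

```python
def resolve_table(table, default_project_id=None):
    """Hypothetical, simplified stand-in for the hook's table-name parsing."""
    parts = table.split(".")
    if len(parts) == 3:
        # Fully qualified name: project.dataset.table
        return tuple(parts)
    if len(parts) == 2:
        # Only dataset.table was given, so the project has to come
        # from the connection; without one, the assertion fires.
        assert default_project_id is not None, \
            "INTERNAL: No default project is specified"
        return (default_project_id, parts[0], parts[1])
    raise ValueError("expected [project.]dataset.table, got %r" % table)


# With a project on the connection, the short name resolves.
print(resolve_table("temp.percentage", default_project_id="my-gcp-project"))

# Without one, the same message as in the question is raised.
try:
    resolve_table("temp.percentage")
except AssertionError as err:
    print(err)
```

Under this reading, either giving the connection a project (as the script above does) or passing a fully qualified destination table would supply the missing project.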