Making a Google BigQuery from Python on Windows


Problem description


I am trying to do something that is very simple in other data services: run a relatively simple SQL query and get the result back as a dataframe in Python. I am on Windows 10 and using Python 2.7 (specifically Canopy 1.7.4).

Typically this would be done with pandas.read_sql_query, but BigQuery needs its own method, pandas.io.gbq.read_gbq.

This method works fine unless the query returns a large result. If you run a big query on BigQuery, you get this error:


GenericGBQException: Reason: responseTooLarge, Message: Response too large to return. Consider setting allowLargeResults to true in your job configuration. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors


This was asked and answered before in the question below, but neither of the solutions works in my case:

Python BigQuery allowLargeResults with pandas.io.gbq

One solution is for Python 3, so it is a nonstarter for me. The other gives an error because I have been unable to set my credentials as an environment variable on Windows.


ApplicationDefaultCredentialsError: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.


I was able to download the JSON credentials file, and I have set it as an environment variable in the few ways I know how, but I still get the above error. Do I need to load it in some way in Python? It seems to be looking for the file but unable to find it. Is there a special way to set it as an environment variable in this case?

Solution

You can do it in Python 2.7 by changing the dialect from the default, legacy, to standard in the pd.read_gbq function.

pd.read_gbq(query, 'my-super-project', dialect='standard')

Indeed, the BigQuery documentation says this about the AllowLargeResults parameter:

AllowLargeResults: For standard SQL queries, this flag is ignored and large results are always allowed.
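
For reference, here is a minimal sketch of how the two pieces of this thread might fit together. It is not from the original answer. The helper names set_credentials_path and read_standard_sql are hypothetical, and so is the optional reader argument, which only exists so the function can be exercised without a BigQuery connection. Setting os.environ affects the current Python process only, and whether pd.read_gbq itself consults GOOGLE_APPLICATION_CREDENTIALS depends on the pandas / pandas-gbq version installed, so treat the credentials part as an assumption to verify.

```python
import os


def set_credentials_path(json_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at a service-account JSON file.

    Hypothetical helper: it sets the variable for the current Python
    process only, so it has to run before the first BigQuery call.
    """
    full_path = os.path.abspath(os.path.expanduser(json_path))
    if not os.path.isfile(full_path):
        # Fail early and show the resolved path so a mistyped path is easy to spot.
        raise IOError("Credentials file not found: %s" % full_path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = full_path
    return full_path


def read_standard_sql(query, project_id, reader=None):
    """Run a query with dialect='standard', as the answer suggests.

    `reader` is an assumption added for illustration; when omitted,
    pandas.read_gbq is used.
    """
    if reader is None:
        import pandas as pd  # imported lazily so the sketch loads without pandas
        reader = pd.read_gbq
    return reader(query, project_id=project_id, dialect="standard")
```

A call might look like set_credentials_path(r"C:\path\to\key.json") followed by df = read_standard_sql(query, 'my-super-project'), where the file path is a placeholder. Using a raw string for the Windows path avoids backslashes being read as escape sequences.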
