BigQuery async query job - the fetch_results() method returns wrong number of values


Problem description

I am writing Python code with the BigQuery Client API and trying to use the async query code that appears in many code samples, but it fails at the fetch_data() method call. Python raises this error:

ValueError: too many values to unpack

So unpacking the result into 3 return values (rows, total_count, page_token) appears to be wrong. However, I cannot find any documentation of what this method is supposed to return, apart from the numerous code examples that show only these 3 return values.

Here is a snippet of code that shows what I'm doing (not including the initialization of the 'client' variable or the imported libraries, which happen earlier in my code).

#---> Set up and start the async query job
job_id = str(uuid.uuid4())
job = client.run_async_query(job_id, query)
job.destination = temp_tbl
job.write_disposition = 'WRITE_TRUNCATE'
job.begin()
print 'job started...'

#---> Monitor the job for completion
retry_count = 360
while retry_count > 0 and job.state != 'DONE':
    print 'waiting for job to complete...'
    retry_count -= 1
    time.sleep(1)
    job.reload()


if job.state == 'DONE':
    print 'job DONE.'
    page_token = None
    total_count = None
    rownum = 0
    job_results = job.results()

    while True:

        # ---- Next line of code errors out...
        rows, total_count, page_token = job_results.fetch_data(max_results=10, page_token=page_token)

        for row in rows:
            rownum += 1
            print "Row number %d" % rownum

        # Stop once there are no more pages (checked once per batch, not once per row)
        if page_token is None:
            print 'end of batch.'
            break

What are the specific return results I should expect from the job_results.fetch_data(...) method call on an async query job?

Solution

Looks like you are right! The method no longer returns these 3 values.

As you can see in this commit from the public repository, fetch_data now returns an instance of the HTTPIterator class. (I didn't realize this before because I have a Docker image with an older version of the BigQuery client installed, in which it does return the 3 values.)

The only way I found to retrieve the results was something like this:

iterator = job_results.fetch_data()
data = []
for page in iterator._page_iter(False):
    data.extend([page.next() for i in range(page.num_items)])

Notice that we no longer have to manage page tokens; that is now automated for the most part.
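The loop above goes through the private _page_iter method. Here is a rough sketch of the same page-by-page collection using public attributes only. It assumes the returned iterator exposes a `pages` property whose pages are themselves iterable; I believe the iterator classes of that release did, but check your installed version. `FakePagedIterator` and `collect_by_page` are hypothetical names, and the fake class exists only so the snippet runs without BigQuery:

```python
class FakePagedIterator(object):
    """Hypothetical stand-in for the object returned by fetch_data()."""

    def __init__(self, pages):
        self._pages = pages

    @property
    def pages(self):
        # Yield one page (here simply a list of rows) at a time.
        for page in self._pages:
            yield page

    def __iter__(self):
        # Iterating the object itself yields rows across all pages.
        for page in self._pages:
            for row in page:
                yield row


def collect_by_page(iterator):
    """Gather all rows, one page at a time, without touching private methods."""
    data = []
    for page in iterator.pages:
        data.extend(list(page))
    return data


fake = FakePagedIterator([[("a", 1), ("b", 2)], [("c", 3)]])
rows = collect_by_page(fake)
```

With a real client you would pass the value returned by job_results.fetch_data() in place of `fake`.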

[EDIT]:

I just realized you can get results by doing:

results = list(job_results.fetch_data())

Got to admit it's way easier now than it was before!
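If the same script has to run against both an older client (where fetch_data() returns the 3 values) and a newer one (where it returns an iterator), something like the following might work. This is only a sketch: `fetch_all_rows` is a hypothetical helper, it assumes the older return value is a plain tuple, and `OldStyleResults` / `NewStyleResults` are made-up stand-ins so the example runs without BigQuery.

```python
def fetch_all_rows(query_results):
    """Return every row as a list, whichever style fetch_data() uses."""
    result = query_results.fetch_data()
    if not isinstance(result, tuple):
        # Newer style: an iterator that pages through the results itself.
        return list(result)

    # Older style: (rows, total_count, page_token); follow the tokens by hand.
    rows, _total_count, page_token = result
    all_rows = list(rows)
    while page_token is not None:
        rows, _total_count, page_token = query_results.fetch_data(
            page_token=page_token)
        all_rows.extend(rows)
    return all_rows


class OldStyleResults(object):
    """Hypothetical stand-in for the older client: one page per call."""
    _pages = {None: ([1, 2], "t1"), "t1": ([3], None)}

    def fetch_data(self, max_results=None, page_token=None):
        rows, next_token = self._pages[page_token]
        return rows, 3, next_token


class NewStyleResults(object):
    """Hypothetical stand-in for the newer client: returns an iterator."""

    def fetch_data(self, max_results=None, page_token=None):
        return iter([1, 2, 3])


old_rows = fetch_all_rows(OldStyleResults())
new_rows = fetch_all_rows(NewStyleResults())
```

The helper deliberately calls fetch_data() without max_results, since that argument may cap the total number of rows rather than the page size in the iterator-based version.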
