Paginating BigQuery


Question

I'm trying to create something similar to Google's BigQuery dashboard, except with predefined queries/views. The problem I'm running into is paginating the data.

The tabledata endpoint supports pagination: you can specify a start index or use a page token, which lets me do something like this:

query_reply = table_data_job.list(projectId=settings.PROJECT_ID,
                                  datasetId=settings.DATASET_ID,
                                  tableId=table,
                                  startIndex=offset,
                                  maxResults=page_size).execute()

The problem with this is that I would like to run specific queries (or, at the very least, order the table data results).

query_data = {'query': 'SELECT * FROM my_dataset.foo_table LIMIT %s' % page_size}
query_reply = job_collection.query(projectId=settings.PROJECT_ID,
                                   body=query_data).execute()

To my knowledge, there's no way to specify an offset with the code above. Is this simply something BigQuery is not suited for? I guess the alternative would be to paginate in memory and work with smaller result sets?

Recommended answer

BigQuery query results are tables. So you can run a query, get the destination table from the result, and then page through the results using the tabledata.list() API. Alternatively, you can get the job ID from the reply and use jobs.getQueryResults(), which has pagination support.
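The second approach could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper name iter_query_pages is hypothetical, and jobs is assumed to be the same kind of object as job_collection in the question (the jobs() collection of a BigQuery v2 client built with google-api-python-client). It relies on the jobReference, jobComplete, rows, and pageToken fields of the replies.

```python
def iter_query_pages(jobs, project_id, sql, page_size):
    """Yield one list of rows per page of a query's results.

    `jobs` is assumed to be the BigQuery v2 jobs() collection
    (job_collection in the question). Hypothetical helper.
    """
    # Start the query; the reply carries a jobReference identifying the job.
    reply = jobs.query(projectId=project_id, body={'query': sql}).execute()
    job_id = reply['jobReference']['jobId']

    page_token = None
    while True:
        kwargs = {'projectId': project_id,
                  'jobId': job_id,
                  'maxResults': page_size}
        if page_token:
            kwargs['pageToken'] = page_token
        page = jobs.getQueryResults(**kwargs).execute()

        # If the job is still running, ask again; the call itself waits
        # for a while server-side before replying.
        if not page.get('jobComplete', True):
            continue

        yield page.get('rows', [])

        # A missing pageToken means this was the last page.
        page_token = page.get('pageToken')
        if not page_token:
            break


# Possible usage (names taken from the question, query is illustrative):
# for rows in iter_query_pages(job_collection, settings.PROJECT_ID,
#                              'SELECT * FROM my_dataset.foo_table', page_size):
#     handle(rows)
```

Because the query itself can contain an ORDER BY clause, the pages come back in the requested order. To jump straight to a given offset rather than walking page by page, getQueryResults also accepts a startIndex parameter, much like the tabledata.list() call in the question.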
