在BigQuery SQL中将表格分割成多个表格 [英] Split a table into multiple tables in BigQuery SQL

查看:47
本文介绍了在BigQuery SQL中将表格分割成多个表格的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在Google BigQuery中将具有1.2亿行的表拆分为多个大小相等的表.我找到了这个 answer 的问题,该问题询问有关获取表的行号的问题.示例解决方案是:

I would like to split a table with 120 million rows into multiple equally-sized tables in Google BigQuery. I found this answer to a question that asks about getting row numbers for a table. The example solution was:

SELECT
  ROW_NUMBER() OVER() row_number,
  contributor_username,
FROM (
  SELECT contributor_username,
  FROM [publicdata:samples.wikipedia]
  GROUP BY contributor_username
)

这将允许您获取row_number,然后可以通过手动选择来创建表

which would allow you to get a row_number, and then you could create a table by manually selecting

WHERE row_number BETWEEN x AND y

这适用于示例中的表,但是对于具有117MM行的表,使用 ROW_NUMBER()会产生资源超出错误.

This works for the table in the example, but using ROW_NUMBER() yields a resources exceeded error for a table with 117MM rows.

然后我尝试了以下想法:

I then tried the following idea:

  SELECT field1, field2
  FROM (
      SELECT field1, field2, ntile(100) over (order by row_id ) as ntile
      FROM (
        SELECT  hash(some_unique_key) as row_id, field1, field2
        FROM table_with_177_MM_rows
      )
  )
  WHERE ntile = 1

然后我需要将其运行为ntile = [1,...,100].这也失败了.有什么方法可以将GBQ中具有唯一键的大表拆分成较小的表?

I would then need to run this for ntile=[1,...,100]. This also failed. Is there any way to split a large table with a unique key in GBQ into smaller tables?

推荐答案

类似于上面的第二个示例,但不使用ntile窗口函数:

Similar to my second example above, but without using the ntile window function:

  SELECT field1, field2
  FROM (
        SELECT hash(some_unique_key) as row_id, field1, field2
        FROM table_with_177_MM_rows
  )
  WHERE abs(row_id) % {n_tables} = {table_id}

对于 n_tables = 10 ,您将运行 table_id = {0,1,2,... 9} 来获取每个表.也许有更好的方法?

For n_tables = 10, you would run for table_id={0,1,2,...9} to get each table. Maybe there's a better way?

这篇关于在BigQuery SQL中将表格分割成多个表格的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆