Hive - How to efficiently Create Table As Select?

Views: 155 | Published: 2018/6/12 14:20:39 | Tags: hive, hiveql


Question

I have a Hive table, htable, that is partitioned on foo and bar. I want to create a small subset of this table for experiments, so I thought the thing to do would be:

    create table new_table like htable;

    insert into new_table partition (foo, bar) select * from htable
    where rand() < 0.01 and foo in (a,b)

However, this takes forever and finally fails with a java.lang.OutOfMemoryError: Java heap space. Is there a better way?

Solution

Add distribute by foo, bar:

    insert into new_table partition (foo, bar) select * from htable
    where rand() < 0.01 and foo in (a,b)
    distribute by foo, bar

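This will reduce memory consumption. The likely reason is that, without distribute by, each task in a dynamic-partition insert may hold an open writer for every (foo, bar) partition it encounters; distributing by the partition columns routes all rows of a given partition to the same reducer, so each task keeps far fewer writers open at once.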
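
Put together, the whole flow might look like the sketch below. The two set statements are an assumption that is not part of the original answer: they are only needed if dynamic partitioning is not already enabled in the session. The values (a, b) are the placeholders from the question and must be replaced with real foo values.

```sql
-- Assumption: dynamic partitioning is not yet enabled in this session.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Empty copy of htable with the same schema and partition columns.
create table new_table like htable;

-- Keep roughly 1% of the rows from the selected foo partitions.
-- (a, b) are the placeholder values from the question.
-- distribute by sends all rows of one (foo, bar) partition to the same reducer.
insert into table new_table partition (foo, bar)
select * from htable
where rand() < 0.01 and foo in (a, b)
distribute by foo, bar;
```

Because select * on a partitioned table returns the partition columns last, the column order already matches what a dynamic-partition insert into new_table expects.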
