Skewed tables in Hive


Question

I am learning Hive and came across skewed tables. Please help me understand them.

What are skewed tables in Hive?

How do we create skewed tables?

How do they affect performance?

Answer

What are skewed tables in Hive?

A skewed table is a special type of table in which the values that appear very often (heavy skew) are split out into separate files, while the rest of the values go to other files.

How do we create skewed tables?

create table <T> (schema) skewed by (keys) on ('value1', 'value2') [stored as directories];

Example:

create table T (c1 string, c2 string) skewed by (c1) on ('x1');
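
A slightly fuller sketch of the same syntax may help. The table and column names here (page_views, page_views_multi, user_id, country, url) and the skewed values are hypothetical, chosen only for illustration. The statements assume a Hive version that supports skewed tables and list bucketing.

```sql
-- Hypothetical table: assume most rows have user_id 'guest' or 'bot'.
-- STORED AS DIRECTORIES asks Hive to use list bucketing, i.e. to keep
-- each listed skewed value in its own subdirectory.
CREATE TABLE page_views (user_id STRING, url STRING)
SKEWED BY (user_id) ON ('guest', 'bot')
STORED AS DIRECTORIES;

-- Skew can also be declared on a combination of columns;
-- each skewed value is then a tuple, one entry per column.
CREATE TABLE page_views_multi (user_id STRING, country STRING, url STRING)
SKEWED BY (user_id, country) ON (('guest', 'US'), ('bot', 'DE'));

-- The skew definition of an existing table (T from the example above)
-- can be changed or removed with ALTER TABLE.
ALTER TABLE T SKEWED BY (c1) ON ('x1', 'x2');
ALTER TABLE T NOT SKEWED;
```

Without STORED AS DIRECTORIES, the skewed values are only recorded in the table metadata; with it, they also determine how the data is laid out on disk.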

How do they affect performance?

When you specify the skewed values, Hive automatically splits the rows that have those values into separate files (separate directories when STORED AS DIRECTORIES is used) and takes this into account at query time. A query can then skip (or read only) whole files where possible, which improves performance.
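
As a rough illustration of where this can help, the queries below run against the example table T. The second table U and its columns are hypothetical. The two settings are Hive configuration properties as I understand them; both are off by default, and their behavior may differ between Hive versions, so treat this as a sketch rather than a tuning recipe.

```sql
-- Let the optimizer prune by skewed-value directories (list bucketing).
SET hive.optimize.listbucketing=true;

-- A filter on a declared skewed value: if T is stored as directories,
-- Hive may read only the files that hold c1 = 'x1'.
SELECT count(*) FROM T WHERE c1 = 'x1';

-- A filter on a non-skewed value: the files holding 'x1' may be skipped.
SELECT count(*) FROM T WHERE c1 = 'rare_value';

-- Use the skewed values stored in the metadata to plan joins, so that
-- the heavy key 'x1' can be handled in a separate part of the plan.
SET hive.optimize.skewjoin.compiletime=true;

-- U (c1 STRING, name STRING) is a hypothetical second table.
SELECT t.c1, u.name
FROM T t JOIN U u ON t.c1 = u.c1;
```

Running EXPLAIN on these queries is one way to check whether the optimizer actually used the skew information.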
