Reducers for Hive data


Problem Description



I'm a novice. I'm curious to know how the number of reducers is set for different Hive data sets. Is it based on the size of the data processed? Or is there a default number of reducers for all jobs?

For example, how many reducers does 5GB of data require? Will the same number of reducers be used for a smaller data set?

Thanks in advance!! Cheers!

Solution

In open-source Hive (and likely on EMR):

# reducers = (# bytes of input to mappers)
             / (hive.exec.reducers.bytes.per.reducer)

The default for hive.exec.reducers.bytes.per.reducer is 1G.
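As a rough sketch of that formula (assumed details: the real planner also rounds up, uses at least one reducer, and caps the result at hive.exec.reducers.max, for which 999 is an assumed default here):

```python
import math

def estimated_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Estimate Hive's reducer count: ceil(input / bytes.per.reducer),
    at least 1, capped at max_reducers (an assumed default)."""
    n = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(n, max_reducers))

# 5 GB of mapper input with the 1 GB default -> 5 reducers
print(estimated_reducers(5 * 10**9))   # 5
# A small 10 MB data set still gets at least 1 reducer
print(estimated_reducers(10 * 10**6))  # 1
```

So for the question's example, 5GB of input with the default setting would get about 5 reducers, while a much smaller data set would get fewer, down to a single reducer.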

The number of reducers therefore depends on the size of the input. You can change it by setting the property hive.exec.reducers.bytes.per.reducer:

either by changing hive-site.xml:

    <property>
      <name>hive.exec.reducers.bytes.per.reducer</name>
      <value>1000000</value>
    </property>

or using set:

    hive -e "set hive.exec.reducers.bytes.per.reducer=100000;"

