Reducers for Hive data
I'm a novice. I'm curious to know how the number of reducers is set for different Hive data sets. Is it based on the size of the data processed, or is there a default number of reducers for all jobs?
For example, how many reducers would 5 GB of data require? Would the same number of reducers be used for a smaller data set?
Thanks in advance!! Cheers!
In open-source Hive (and likely on EMR as well):

# reducers = (# bytes of input to mappers) / (hive.exec.reducers.bytes.per.reducer)

The default hive.exec.reducers.bytes.per.reducer is 1 GB.
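The formula above can be sketched as a quick back-of-the-envelope calculation. This is a minimal illustration, not Hive's actual planner code; `estimated_reducers` is a hypothetical helper, and it assumes Hive rounds the quotient up so that any remainder still gets a reducer (real deployments may also cap the result via other settings).

```python
import math

def estimated_reducers(input_bytes, bytes_per_reducer=1_000_000_000):
    # Round up: even a partial chunk of input needs its own reducer,
    # and any non-empty input gets at least one.
    return max(1, math.ceil(input_bytes / bytes_per_reducer))

# 5 GiB of mapper input against the 1 GB-per-reducer default.
five_gb = 5 * 1024 ** 3
print(estimated_reducers(five_gb))         # 6

# A smaller data set gets fewer reducers, bottoming out at 1.
print(estimated_reducers(200 * 1024 ** 2)) # 1
```

This shows why the reducer count scales with input size rather than being a fixed default: lowering bytes_per_reducer increases parallelism for the same input.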
The number of reducers therefore also depends on the size of the input. You can change it by setting the property hive.exec.reducers.bytes.per.reducer:

either in hive-site.xml:

<property>
  <name>hive.exec.reducers.bytes.per.reducer</name>
  <value>1000000</value>
</property>

or per session using set:

hive -e "set hive.exec.reducers.bytes.per.reducer=100000"