Fail to Increase Hive Mapper Tasks?

Problem Description

I have a managed Hive table which contains only one 150MB file. I then run "select count(*) from tbl" on it, and it uses 2 mappers. I want to raise that to a bigger number.

First I tried 'set mapred.max.split.size=8388608;', hoping it would use 19 mappers. But it only uses 3; somehow it still splits the input into 64MB chunks. I also tried 'set dfs.block.size=8388608;', which didn't work either.
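
For reference, the split arithmetic behind those numbers (using the figures above):

150 MB / 8388608 bytes (8 MB) ≈ 18.75  →  19 expected splits
150 MB / 64 MB                ≈  2.34  →   3 observed splits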

Then I tried a vanilla MapReduce job doing the same thing. It initially used 3 mappers, and when I set mapred.max.split.size, it used 19. So the problem lies in Hive, I suppose.

I read some of the Hive source code, like CombineHiveInputFormat, ExecDriver, etc., but couldn't find a clue.

What other settings can I use?

Solution

I combined @javadba's answer with the one I received from the Hive mailing list; here's the solution:

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set mapred.map.tasks=20;
select count(*) from dw_stage.st_dw_marketing_touch_pi_metrics_basic;
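
Note: on their own, the split-size settings had no effect; switching hive.input.format away from the combining input format appears to be the key (my inference from the behavior above, not something stated in the thread). The settings that did nothing by themselves were:

set mapred.max.split.size=8388608;
set dfs.block.size=8388608;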

From the mailing list:

It seems that HIVE is using the old Hadoop MapReduce API and so mapred.max.split.size won't work.
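
That would explain why mapred.map.tasks helps: in the old mapred API, FileInputFormat uses the requested task count as a hint when sizing splits. A sketch of that computation (paraphrased from Hadoop's old FileInputFormat, not part of the mailing-list reply):

goalSize  = totalSize / numSplits              -- numSplits comes from mapred.map.tasks
splitSize = max(minSize, min(goalSize, blockSize))

With mapred.map.tasks=20 here: 150 MB / 20 = 7.5 MB per split, i.e. about 20 mappers.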

I will dig into the source code later.
