What determines the number of map tasks and reduce tasks in Hive?
Problem description
I use Hive to run the query "select * from T1,T2 where T1.a=T2.b", where the schema is T1(a int, b int), T2(a int, b int). When it runs, 6 map tasks and 1 reduce task are generated. What determines the number of map tasks and reduce tasks? Is it the data volume?
hive> select * from emp;
No map or reduce tasks start for this query; Hive only dumps the data.
To see which map and reduce stages start when you run a query, take for example:
hive> select count(*) from emp group by name;
If you add the explain keyword before the query, Hive prints the execution plan, showing the map and reduce stages that the query will start.
hive> explain select count(*) from emp group by name;
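As for what drives the actual task counts: on the MapReduce engine, the number of map tasks generally follows the number of input splits (so it depends on data volume, file count, and split size), while the number of reduce tasks is estimated from the input size unless it is set explicitly. The settings below are a minimal sketch of how one might influence both. It assumes the MapReduce execution engine and Hadoop 2 style property names, and every numeric value is illustrative rather than a recommendation.

```sql
-- Reducers: Hive estimates the count from the input size unless told otherwise.
-- Average number of input bytes handled per reducer (illustrative value):
set hive.exec.reducers.bytes.per.reducer=268435456;
-- Upper bound on the estimated number of reducers (illustrative value):
set hive.exec.reducers.max=32;
-- Alternatively, force a fixed number of reducers; this bypasses the estimate:
set mapreduce.job.reduces=4;

-- Mappers: the count follows the number of input splits. A smaller maximum
-- split size generally produces more map tasks (illustrative value):
set mapreduce.input.fileinputformat.split.maxsize=134217728;

select count(*) from emp group by name;
```

When a job launches, Hive logs the number of mappers and reducers it chose, which is a convenient way to check the effect of these settings on a given table.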