Can you explain when and why MapReduce is invoked in Hive?
Problem description
select * from Table_name limit 5;
select col1_name,col2_name from table_name limit 5;
When I run the first query, no MapReduce job is invoked, while for the second query MapReduce is invoked. Could you please explain the reason?
Recommended answer
To understand the reason, we first need to know what the map and reduce phases do:
Map: Basically a per-row filter. A mapper handles each row independently, for example by picking out columns or dropping rows; any sorting happens afterwards, when map output is shuffled to the reducers. In the second query it extracts col1_name and col2_name from each row. In the first query you are reading every column, so no filtering is required and Hive can simply fetch the rows from the table's files. Hence no map phase.
Reduce: A summary operation across rows, e.g. the sum of a column. Neither query needs any summarized data, hence no reducer.
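So the first query runs no MapReduce job at all, while the second query runs mappers but no reducers. Note that this behaviour depends on the Hive version and on the hive.fetch.task.conversion setting: when it is set to more (the default in newer Hive releases), a simple column selection like the second query can also be served by a fetch task without MapReduce.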
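The distinction above can be illustrated with a small toy model in Python. This is only a sketch of the idea, not how Hive actually executes queries: the sample rows, the extra column col3_name, the function names, and the third "group by" style query are all assumptions added for illustration.

```python
from collections import defaultdict

# Hypothetical sample table; col3_name is invented for this sketch.
rows = [
    {"col1_name": "a", "col2_name": 1, "col3_name": "x"},
    {"col1_name": "b", "col2_name": 2, "col3_name": "y"},
    {"col1_name": "a", "col2_name": 3, "col3_name": "z"},
]


def fetch(rows, limit):
    """Like `select * ... limit n`: return the first n rows untouched."""
    return rows[:limit]


def map_project(rows, columns, limit):
    """Like `select col1, col2 ... limit n`: map-only work.

    Each row is transformed on its own; no row needs to see another row.
    """
    projected = [{c: row[c] for c in columns} for row in rows]
    return projected[:limit]


def map_reduce_sum(rows, key_col, value_col):
    """Like `select key, sum(value) ... group by key`: map plus reduce."""
    # Map: emit one (key, value) pair per row.
    pairs = [(row[key_col], row[value_col]) for row in rows]
    # Shuffle: bring all values for the same key together.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce: summarize each group across rows.
    return {key: sum(values) for key, values in sorted(groups.items())}


if __name__ == "__main__":
    print(fetch(rows, 2))
    print(map_project(rows, ["col1_name", "col2_name"], 2))
    print(map_reduce_sum(rows, "col1_name", "col2_name"))
```

Only the last function needs to look at more than one row at a time, which is why an aggregation requires a reduce phase while the two queries in the question do not.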