HIVE: Why does Hive generate a MapReduce job for select column from tablename, but not for select * from tablename?


Problem description


Why does Hive generate a MapReduce job for select column from tablename, but not for select * from tablename?

Solution

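When a simple statement such as select * from tablename is executed, Hive just fetches the data from the table's files stored in HDFS and prints it as rows and columns. Every row and every column is returned unchanged, so there is nothing for a map or reduce phase to do. Hive does not literally run these commands, but the effect is comparable to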

hadoop fs -cat hdfs://schemaname/tablename.txt
hadoop fs -cat hdfs://schemaname/tablename.rc
hadoop fs -cat hdfs://schemaname/tablename.orc

or the equivalent for whichever file format your table is stored in, with Hive decoding that format as it reads.

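If you select specific columns, add a where clause, or use an aggregate, Hive has to process the rows (project, filter, or group them), so MapReduce comes into the picture. The exact cut-off depends on the Hive version and configuration: the hive.fetch.task.conversion property controls which queries can be served by a plain fetch, and with its "more" setting (the default since Hive 0.14) simple column selections and filters also run without a MapReduce job.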
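One way to see this on your own cluster is to change hive.fetch.task.conversion and compare the plans that EXPLAIN prints. The following is a minimal sketch: tablename comes from the question, while col1 is a hypothetical column name used only for illustration, and the plan shapes noted in the comments are what is typically seen rather than guaranteed output.

```sql
-- Assumption: a table called tablename with a column col1 (col1 is hypothetical).

-- "none": fetch conversion is disabled, so even SELECT * is planned as a job.
SET hive.fetch.task.conversion=none;
EXPLAIN SELECT * FROM tablename;

-- "minimal": only SELECT *, filters on partition columns, and LIMIT
-- are served by a plain fetch.
SET hive.fetch.task.conversion=minimal;
EXPLAIN SELECT * FROM tablename;      -- typically a single Fetch stage
EXPLAIN SELECT col1 FROM tablename;   -- typically includes a map stage

-- "more": simple column selections and filters are also served by a fetch.
SET hive.fetch.task.conversion=more;
EXPLAIN SELECT col1 FROM tablename WHERE col1 IS NOT NULL;

-- Grouping and aggregation need a job regardless of this setting.
EXPLAIN SELECT col1, count(*) FROM tablename GROUP BY col1;
```

If the plan contains only a fetch stage, no MapReduce job will be launched when the query runs.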
