Does Hive run Hadoop when a query is executed?
Question
I am trying to understand how Hive and Hadoop interact. From the tutorials I have read, it appears that prior to running Hive queries you run a map/reduce job to get the input data. This seems counterproductive to me: if I have already run the map/reduce job and gotten the data into an easily parsable format, why would I not put the data into a traditional database?
Thanks for your help, Nathan
Recommended answer
Hive operates on files that are stored on HDFS. For anything other than the simplest queries, Hive generates and runs MapReduce jobs. For very simple queries (SELECT * FROM MyTable) it will just stream the files off disk.
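One rough way to see this difference is to ask Hive for the query plan with EXPLAIN. The sketch below assumes a hypothetical table MyTable with a hypothetical column named category; the exact plan output depends on the Hive version and its configuration.

```sql
-- MyTable and its category column are hypothetical names used for illustration.

-- A plain scan: Hive can typically serve this by reading the table's files directly.
EXPLAIN SELECT * FROM MyTable;

-- An aggregation: the plan typically includes a map/reduce stage.
EXPLAIN SELECT category, COUNT(*) FROM MyTable GROUP BY category;
```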
The input data doesn't need to come from MapReduce; it can be a simple text file uploaded to HDFS. See http://developer.yahoo.com/hadoop/tutorial/module2.html#commandref
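As a minimal sketch of that workflow, assuming a tab-separated file has already been uploaded to an HDFS directory, a table can be declared directly over the files with no prior MapReduce step. The path, table name, and columns here are all hypothetical.

```sql
-- Assumes a tab-separated file was uploaded beforehand, for example with:
--   hadoop fs -put access.tsv /user/nathan/logs/
-- The path, table name, and columns are hypothetical.
CREATE EXTERNAL TABLE access_log (
  ip     STRING,
  url    STRING,
  status INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/nathan/logs';

-- Hive reads the uploaded files in place; this aggregation would run as a MapReduce job.
SELECT status, COUNT(*) FROM access_log GROUP BY status;
```

An external table only records the schema and location, so dropping it leaves the files in HDFS untouched.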