Is Hive faster than Spark?
Problem description
After reading "What is hive, Is it a database?", a colleague mentioned yesterday that he was able to filter a 15B-record table, do a "group by", and join it with another table, which resulted in 6B records, in only 10 minutes! I wonder if this would be slower in Spark. Now that Spark has DataFrames the two may be comparable, but I am not sure, hence the question.
Is Hive faster than Spark? Or is this question meaningless? Sorry for my ignorance.
He uses the latest Hive, which seems to be using Tez.
Hive is just a framework that gives SQL functionality to MapReduce-type workloads.
These workloads can run on different execution engines (MapReduce, Tez, or Spark), typically scheduled by YARN.
So the real comparison is Hive on Tez vs. Hive on Spark. The article "When to go with ETL on Hive using Tez VS When to go with Spark ETL?" discusses this nicely (gist: use Hive on Spark if not sure).
[Figure omitted: lower is better]
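
To make the engine comparison concrete: Hive chooses its execution engine through the `hive.execution.engine` property (documented values are `mr`, `tez`, and `spark`, though which ones are available depends on the Hive version and the cluster setup). The Python sketch below only builds the script text for each engine, so that the same query could be timed under each one. The query, table names, and column names are hypothetical, loosely shaped like the filter / group by / join described in the question, and actually submitting the scripts (for example through a Hive client) is left out.

```python
# Sketch: build one HiveQL script per execution engine so the same query
# can be timed under each. This only generates text; nothing is executed.

# Values accepted by hive.execution.engine; availability depends on the
# Hive version and cluster setup.
ENGINES = ("mr", "tez", "spark")

# Hypothetical query shaped like the one in the question:
# filter, group by, then join. Table and column names are made up.
QUERY = """
SELECT d.name, agg.cnt
FROM (
    SELECT key, COUNT(*) AS cnt
    FROM big_table
    WHERE event_date >= '2016-01-01'
    GROUP BY key
) agg
JOIN dim_table d ON d.key = agg.key
""".strip()


def script_for(engine: str, query: str = QUERY) -> str:
    """Return a HiveQL script that selects the engine, then runs the query."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    return f"SET hive.execution.engine={engine};\n{query};\n"


if __name__ == "__main__":
    for engine in ENGINES:
        print(f"-- {engine}")
        print(script_for(engine))
```

Running each generated script against the same data and comparing wall-clock times is what a "Hive on Tez vs. Hive on Spark" comparison comes down to in practice. Results will depend heavily on data layout, cluster size, and configuration.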