Is Hive faster than Spark?


Question


I had just read "What is hive, Is it a database?" when a colleague mentioned yesterday that he was able to filter a 15B-row table, do a "group by", and join the result with another table, producing 6B records, in only 10 minutes! I wonder whether this would be slower in Spark. Now that Spark has DataFrames the two may be comparable, but I am not sure, hence the question.

Is Hive faster than Spark? Or is the question meaningless? Sorry for my ignorance.

He uses the latest Hive, which seems to be using Tez.

Answer

Hive is just a framework that gives SQL functionality to MapReduce-type workloads.

These workloads can run on different execution engines: MapReduce, Tez, or Spark, typically scheduled on YARN.

So the meaningful comparison is Hive on Tez vs. Hive on Spark. A nice article discussing this is "When to go with ETL on Hive using Tez VS When to go with Spark ETL?" (the gist: use Hive on Spark if you are not sure).
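To make the engine comparison concrete, here is a minimal sketch of how the same query could be prepared for each engine by changing Hive's `hive.execution.engine` setting. The table names, column names, and filter value are hypothetical stand-ins for the workload described in the question, and the helper `build_script` is only an illustration: it builds the script text and does not run Hive.

```python
# Values accepted by Hive's hive.execution.engine setting.
ENGINES = ("mr", "tez", "spark")

# Hypothetical query shaped like the one in the question: filter a large
# table, aggregate it with GROUP BY, then join the result with another table.
# big_table, other_table, key, label and event_date are made-up names.
QUERY = """
SELECT o.key, o.label, agg.cnt
FROM (
    SELECT key, COUNT(*) AS cnt
    FROM big_table
    WHERE event_date >= '2016-01-01'
    GROUP BY key
) agg
JOIN other_table o
  ON o.key = agg.key
""".strip()


def build_script(engine: str, query: str = QUERY) -> str:
    """Return a HiveQL script that selects the engine, then runs the query."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    return f"SET hive.execution.engine={engine};\n{query};\n"


if __name__ == "__main__":
    # Print one script per engine so each can be saved and timed separately.
    for engine in ENGINES:
        print(f"--- {engine} ---")
        print(build_script(engine))
```

Each generated script could then be saved to a file, run with `hive -f`, and timed. Any result depends heavily on the cluster, the data layout, and the Hive and Spark versions, so it answers the question only for that particular setup.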

[Figure not preserved in this copy; its caption read "Lower the better".]
