Which Hadoop version should I choose among 1.x, 2.2 and 0.23?


Question


Hello, I am new to Hadoop and pretty confused by the version names. Which one should I use among 1.x (great support and learning resources), 2.2, or 0.23?

I have read that Hadoop is moving to YARN completely from v0.23 (link1).
But at the same time, it's all over the web that Hadoop v2.0 is moving to YARN (link2), and I can see the YARN configuration files in Hadoop 2.2 itself.

  • Since 0.23 seems to be the latest version to me, does 2.2 also support YARN? (See link1, which says Hadoop will support YARN from v0.23.)
  • As a beginner, which version should I go for, 1.x or 2.x, from the perspective of learning Hadoop?
  • Are other technologies that work with Hadoop, like Pig and Hive, available for the latest version of Hadoop?

Thanks.

UPDATE
Thank you all for replying. I ended up using Hadoop 2.2. All the famous tutorials and resources are outdated, but I found one good book for getting started with v2.2:

"Hadoop: The Definitive Guide, Third Edition" by Tom White

It supports Hadoop v2.2.

The source code is available on GitHub: https://github.com/tomwhite/hadoop-book

As mentioned on GitHub:

This version of the code has been tested with:
 * Hadoop 1.2.1/0.22.0/0.23.x/2.2.0
 * Avro 1.5.4
 * Pig 0.9.1
 * Hive 0.8.0
 * HBase 0.90.4/0.94.15
 * ZooKeeper 3.4.2
 * Sqoop 1.4.0-incubating
 * MRUnit 0.8.0-incubating

Hope it helps!

Solution

There are a few active release series. The 1.x release series is a continuation of the 0.20 release series. A few weeks after 0.23 was released, the 0.20 branch formerly known as 0.20.205 was renumbered 1.0. There is next to no functional difference between 0.20.205 and 1.0; this is just a renumbering.

The 0.23 release includes several major new features, including a new MapReduce runtime called MapReduce 2. It is implemented on a new system called YARN (Yet Another Resource Negotiator), which is a general resource management system for running distributed applications. Similarly, the 2.x release series is a continuation of the 0.23 release series, so 2.2 also supports YARN.

According to the Hadoop 2.2 release notes:

  • 1.2.X - current stable version, 1.2 release

  • 2.2.X - current stable 2.x version

  • 0.23.X - similar to 2.X.X but missing NN HA (NameNode high availability).
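If it helps to see the lineage in one place, here is a minimal Python sketch of the mapping described above. The function name `classify_hadoop_version` and the dictionary keys are hypothetical, made up for illustration; it only encodes what this answer states (1.x continues 0.20, 2.x continues 0.23, YARN arrives with 0.23, and 0.23 lacks NameNode HA) and is not an official Hadoop API.

```python
import re
from typing import Optional


def classify_hadoop_version(version: str) -> dict:
    """Map an Apache Hadoop version string to the release line described in this answer.

    Hypothetical helper for illustration only; the rules come from the answer text.
    """
    match = re.match(r"^(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))

    # NameNode HA is only discussed for 0.23 and 2.x above, so 1.x is left as None.
    namenode_ha: Optional[bool]

    if major == 1 or (major == 0 and minor == 20):
        # 0.20.205 was renumbered to 1.0, so both belong to the same line.
        line, yarn, namenode_ha = "1.x (continuation of 0.20)", False, None
    elif major == 0 and minor == 23:
        line, yarn, namenode_ha = "0.23", True, False
    elif major == 2:
        line, yarn, namenode_ha = "2.x (continuation of 0.23)", True, True
    else:
        raise ValueError(f"version {version!r} is not covered by this answer")

    return {"line": line, "yarn": yarn, "namenode_ha": namenode_ha}


if __name__ == "__main__":
    for v in ("0.20.205", "1.2.1", "0.23.10", "2.2.0"):
        print(v, classify_hadoop_version(v))
```

Versions outside these series (for example 0.22) are deliberately rejected, since the answer does not say where they fit.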

I would suggest starting with the Cloudera distribution, since you are just starting to learn. CDH 4.5 includes the YARN feature you are looking for. You can also try the HortonWorks distribution. The advantage of going with these vendors is that you do not need to worry about which versions of components such as Hive and Pig work with your Hadoop installation.
