Using Hadoop and Spark on Docker containers


Question

I want to use big data analytics in my work. I have already set up the Docker side, creating containers within containers. I am new to big data, however, and I have come to understand that using Hadoop for HDFS, with Spark instead of MapReduce as the processing engine on top of Hadoop, is the best approach for websites and applications when speed matters (is it?). Will this work in my Docker containers? It would be very helpful if someone could point me to resources where I can learn more.

Answer

You can start by experimenting with the Cloudera QuickStart Docker image. Take a look at https://hub.docker.com/r/cloudera/quickstart/. This image provides a single-node deployment of Cloudera's Hadoop platform together with Cloudera Manager, and it also includes Spark.
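As a rough sketch, pulling and starting the QuickStart container could look like the following. The `--hostname`, `--privileged` flags and the `/usr/bin/docker-quickstart` entrypoint come from the image's Docker Hub page; the specific tag and port mappings below are assumptions you should check against that page.

```shell
# Pull the image (it is several GB; the "latest" tag is an assumption --
# check the tags listed on the Docker Hub page).
docker pull cloudera/quickstart:latest

# Start the single-node cluster. The fixed hostname and --privileged
# are required by the image; /usr/bin/docker-quickstart boots the
# Hadoop services. The port mappings are illustrative: 8888 is Hue,
# 7180 is Cloudera Manager.
docker run --hostname=quickstart.cloudera --privileged=true -t -i \
    -p 8888:8888 -p 7180:7180 \
    cloudera/quickstart:latest /usr/bin/docker-quickstart

# Once inside the container, you can try Spark interactively, e.g.:
#   spark-shell   # Scala REPL
#   pyspark       # Python REPL
```

From there you can test HDFS and Spark jobs without installing anything on the host, which makes it a convenient way to evaluate whether this stack fits your use case before building your own container setup.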

