Designing an Azure HDInsight cluster?

Question

I have a question about Azure HDInsight. How should I design an HDInsight cluster based on my on-premises infrastructure? What are the major parameters I need to consider before designing the cluster? For example, if I have 100 servers running on-premises, how many nodes should I select for my cloud cluster? In AWS we have the EMR sizing calculator and Cluster Planner/Advisor. Is there a similar planning mechanism in Azure apart from the Pricing Calculator? I have gone through the Microsoft docs but would still like to understand this with an example. It would be great if anyone could explain with one. Thanks.

Recommended answer

Designing an HDInsight environment is different from designing an on-premises Hadoop environment. Rather than one large, always-on cluster, the best configuration is usually several compute clusters, running at different times, each designed to handle a different workload.

The separation of compute and data is key to this architecture functioning as well as it does.  Keeping the data in a separate location from the compute cluster opens up a number of possibilities not available with on-premises hardware.

When choosing the cluster type, keep in mind the targeted workloads for each configuration, and remember to strongly consider Linux as your choice of OS.

Before deploying an HDInsight cluster, plan for the desired cluster capacity by determining the needed performance and scale. This planning helps optimize both usability and costs. Some cluster capacity decisions cannot be changed after deployment. If the performance parameters change, a cluster can be dismantled and re-created without losing stored data.

The key questions to ask for capacity planning are:
  • In which geographic region should you deploy your cluster?
  • How much storage do you need?
  • What cluster type should you deploy?
  • What size and type of virtual machine (VM) should your cluster nodes use?
  • How many worker nodes should your cluster have?
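
To make the last two questions concrete for the "100 on-premises servers" case, here is a rough back-of-the-envelope sketch. It is not an official Microsoft sizing tool or formula. The approach (scale the capacity actually used on-premises, add headroom, divide by the per-node capacity of the chosen VM) is only an assumption, and every name and number in it (`OnPremFleet`, `VmSize`, `estimate_worker_nodes`, the 33% utilization, the 20% headroom, the 8-core/56 GB VM) is a hypothetical placeholder to replace with your own measurements and the VM sizes available in your region.

```python
import math
from dataclasses import dataclass


@dataclass
class OnPremFleet:
    """Aggregate description of the existing on-premises servers (hypothetical)."""
    servers: int
    cores_per_server: int
    ram_gb_per_server: float
    peak_utilization: float  # fraction (0..1] of capacity actually used at peak


@dataclass
class VmSize:
    """Per-node capacity of a candidate worker VM (placeholder values)."""
    name: str
    cores: int
    ram_gb: float


def estimate_worker_nodes(fleet: OnPremFleet, vm: VmSize, headroom: float = 0.2) -> int:
    """Estimate worker nodes from the tighter of the CPU and RAM constraints."""
    if not 0 < fleet.peak_utilization <= 1:
        raise ValueError("peak_utilization must be in (0, 1]")
    if headroom < 0:
        raise ValueError("headroom must be non-negative")

    # Capacity that is really in use on-premises, not the installed capacity.
    used_cores = fleet.servers * fleet.cores_per_server * fleet.peak_utilization
    used_ram_gb = fleet.servers * fleet.ram_gb_per_server * fleet.peak_utilization

    # Nodes needed to cover each resource, including the safety margin.
    nodes_for_cpu = used_cores * (1 + headroom) / vm.cores
    nodes_for_ram = used_ram_gb * (1 + headroom) / vm.ram_gb

    return max(1, math.ceil(max(nodes_for_cpu, nodes_for_ram)))


# Assumed inputs: 100 servers, 16 cores and 64 GB each, 33% busy at peak.
fleet = OnPremFleet(servers=100, cores_per_server=16,
                    ram_gb_per_server=64, peak_utilization=0.33)
vm = VmSize(name="example-8core-56gb", cores=8, ram_gb=56)

print(estimate_worker_nodes(fleet, vm))
```

Treat the result as a starting point only. Because data lives outside the compute cluster, a workload that runs a few hours a day can use a cluster sized for that job alone, so the node count often ends up well below a one-to-one mapping of on-premises servers.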

For more details, refer to "Capacity planning for HDInsight clusters" in the Microsoft documentation.
