How scalable is distributed Erlang?

Question

Part A:

Erlang has a lot of success stories about running concurrent agents e.g. the millions of simultaneous Facebook chats. That's millions of agents, but of course it's not millions of CPUs across a network. I'm having trouble finding metrics on how well Erlang scales when scaling is "horizontal" across a LAN/WAN.

Let's assume that I have many (tens of thousands) physical nodes (running Erlang on Linux) that need to communicate and synchronize small infrequent amounts of data across the LAN/WAN. At what point will I have communications bottlenecks, not between agents, but between physical nodes? (Or will this just work, assuming a stable network?)

Part B:

I understand (as an Erlang newbie, meaning I could be totally wrong) that Erlang nodes all attempt to connect to and be aware of each other, resulting in an N^2-connection point-to-point network. Assuming that Part A won't just work with N in the tens of thousands, can Erlang be configured easily (using out-of-the-box config or trivial boilerplate, not writing a full implementation of grouping/routing algorithms myself) to cluster nodes into manageable groups and route system-wide messages through the cluster/group hierarchy?

Answer

To be clear, we are talking about horizontal scalability across physical machines; that is the only real problem. The CPUs on one machine are handled by a single VM, however many of them there are.

node = machine.

To begin with, 30-60 nodes is what you get out of the box (vanilla OTP installation) with any custom application written in Erlang on top of it. Proof: ejabberd.

~100-150 nodes is possible with an optimized custom application. I mean it has to be good code, written with knowledge of GC, the characteristics of the data types, message passing, etc.

Over 150 is all right, but when we talk about numbers like 300 or 500, it will require optimization and customization of the TCP layer. The app also has to be aware of the cost of, e.g., synchronous calls across the cluster.

The other thing is the DB layer. Mnesia (built in), due to its features, will not be effective over 20 nodes (my experience; I may be wrong). Solution: just use something else: Dynamo-style DBs, a separate MySQL cluster, HBase, etc.

The most common technique for balancing the cost of building a high-quality application against scalability is a federation of clusters of ~20-50 nodes each. Internally, each cluster is an efficient mesh of ~50 Erlang nodes, and it is connected via any suitable protocol to N other 50-node clusters. To sum up, such a system is a federation of N Erlang clusters.
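
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python (not Erlang, and not a measurement). It assumes one connection per pair of nodes inside a mesh and exactly one bridge link between each pair of clusters. The function names and the one-bridge-per-cluster-pair layout are illustrative assumptions, not something distributed Erlang provides.

```python
def mesh_links(n: int) -> int:
    """Connections in a full mesh of n nodes: one per unordered pair."""
    return n * (n - 1) // 2


def federation_links(total_nodes: int, cluster_size: int) -> int:
    """Connections when the nodes are split into full-mesh clusters.

    Assumption: every pair of clusters is joined by exactly one bridge link.
    """
    full, rest = divmod(total_nodes, cluster_size)
    sizes = [cluster_size] * full + ([rest] if rest else [])
    inside = sum(mesh_links(size) for size in sizes)
    between = mesh_links(len(sizes))  # one bridge per pair of clusters
    return inside + between


if __name__ == "__main__":
    n = 500
    print("one flat mesh:    ", mesh_links(n))
    print("10 clusters of 50:", federation_links(n, 50))
```

Under these assumptions, 500 nodes need 124,750 connections as a single mesh but 12,295 as ten clusters of 50, which is why the per-cluster mesh stays cheap while the system as a whole keeps growing.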

Distributed Erlang is designed to run in one data center. If you need more, geographically distant nodes, use federations.

There are lots of config options, e.g. ones that stop nodes from all connecting to each other (such as starting erl with -connect_all false). That may be helpful, but in a cluster of ~50 nodes the Erlang overhead is not significant. You can also build a graph of Erlang nodes using 'hidden' connections (the -hidden flag): a hidden node does not join the full mesh, but it also cannot benefit from being connected to all nodes.
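
A quick way to see why the full mesh is harmless at ~50 nodes but not at the tens of thousands asked about is to count connections. The sketch below is plain Python arithmetic, assuming one connection per pair of nodes; it says nothing about actual throughput, and the function names are only illustrative.

```python
def connections_per_node(n: int) -> int:
    """Each node in a full mesh connects to every other node."""
    return n - 1


def total_connections(n: int) -> int:
    """One connection per unordered pair of nodes."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (50, 150, 500, 10_000):
        print(f"{n:>6} nodes: {connections_per_node(n):>5} per node, "
              f"{total_connections(n):>10} in total")
```

At 50 nodes that is 49 connections per node and 1,225 overall; at 10,000 nodes it is 9,999 per node and 49,995,000 overall.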

The biggest problem I see in this kind of system is designing it as a master-less system. If you do not need that, everything should be OK.
