Configuration to speed up the topology in distributed mode


Problem description

I have a topology running with parallelism (1, 8, 1) (spout, logic bolt, write bolt) and the number of ackers set to 12 (my cluster has 12 available slots). Max spout pending is 200 and timeout.secs is 200. I have to process 14 lakh (1.4 million) inputs.

My cluster consists of 1 nimbus and 3 supervisors (dual core, 4 GB each). Currently it takes 44 hours to produce the output for the 14 lakh inputs.

My application runs separately on another server. I did this to separate the application from the Storm jar because I suspected that the application was reserving some memory. I ran 50 instances of the application so that it can process 50 tuples at a time, but even this didn't help.

With this configuration, the topology processes around 6000 to 6500 tuples in 10 minutes. Messages are not failing anywhere, but the overall latency is 16000 ms. If I try increasing the parallelism, the rate of the topology drops (to 3000 to 4000 tuples in 10 minutes).

The behaviour is not constant, so kindly help me do the math.
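As a rough sketch of the math, the figures above can be checked with Little's law (throughput is roughly tuples in flight divided by latency). This assumes the single spout task keeps at most "max spout pending" tuples in flight and that the 16000 ms figure is the end-to-end latency per tuple; neither is confirmed by the question. The function names are illustrative only.

```python
def max_throughput_per_sec(max_pending: int, latency_ms: float) -> float:
    """Upper bound on tuples/sec from Little's law: in-flight / latency."""
    return max_pending / (latency_ms / 1000.0)


def hours_to_process(total_tuples: int, tuples_per_10_min: float) -> float:
    """Hours needed to process total_tuples at the given 10-minute rate."""
    return total_tuples / tuples_per_10_min * 10 / 60


# Figures taken from the question (14 lakh read as 1,400,000).
bound = max_throughput_per_sec(max_pending=200, latency_ms=16000)
print(f"bound: {bound:.1f} tuples/s = {bound * 600:.0f} per 10 min")
for rate in (6000, 6500):
    print(f"{rate} per 10 min -> {hours_to_process(1_400_000, rate):.1f} h")
```

Under these assumptions the cap works out to about 7500 tuples per 10 minutes, which is close to the observed 6000 to 6500. That would suggest the rate is limited by the per-tuple latency with 200 tuples pending, which may be why adding parallelism alone does not help.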





What I have tried:

I have mentioned everything in my question.

Recommended answer

The problem can be anywhere. You have to find where the bottleneck is.

Your network may be running in a degraded mode because of bad wiring or a bad switch.
Computers can be slowed down by a lack of memory.
Your programs may be needlessly complicated or not optimized.

It can be anything, and you are the only one who can run tests to see where the problem is.
Solving this kind of problem is a specialized job because the cause can be quite complicated.
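One simple way to start such tests is to time each stage separately and compare the results. The sketch below is only an illustration: the stage names and the sleep-based stand-ins are hypothetical and would need to be replaced with real calls into the system being measured.

```python
import time
from typing import Callable, Dict


def time_stages(stages: Dict[str, Callable[[], None]], repeats: int = 3) -> Dict[str, float]:
    """Return the average wall-clock seconds per call for each named stage."""
    averages = {}
    for name, stage in stages.items():
        start = time.perf_counter()
        for _ in range(repeats):
            stage()
        averages[name] = (time.perf_counter() - start) / repeats
    return averages


def slowest_stage(averages: Dict[str, float]) -> str:
    """Name of the stage with the highest average time."""
    return max(averages, key=averages.get)


# Hypothetical stand-ins; replace each with a real call into the system.
stages = {
    "read_input": lambda: time.sleep(0.001),
    "call_application": lambda: time.sleep(0.02),
    "write_output": lambda: time.sleep(0.002),
}
averages = time_stages(stages)
print(slowest_stage(averages))
```

The stage with the largest average is the first candidate to investigate, though a single measurement like this cannot separate network, memory, and code effects on its own.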

