Clustering doesn't work

Question

I configured clustering for two Tomcat instances, with Apache in front and mod_jk as the connector. I tried a test application to check the configuration, and it works fine: sessions are replicated successfully and failover is detected. But when I tried this with my actual application, it does not work. I made the corresponding modifications in httpd.conf very carefully. There are no exceptions and no errors in the logs, so I am unable to track down the problem. Initially I was getting NotSerializableException for particular classes, and I made them serializable. Now there is no exception, but I still cannot load the application if the hosting Tomcat is shut down, even when the other Tomcat member of the cluster is alive. Can you please help me? I understand it is tough to produce a solution when you are not sure of the problem.

Recommended answer

So you have 2 services, configured the same way, except that one fails over correctly and the other doesn't?

There is a general rule of thumb for when you're seeing something that looks impossible: you're not seeing what you think you are seeing. Frequently this is because of what is jokingly referred to as PEBKAC (Problem Exists Between Keyboard And Chair). The really frustrating thing is that, no matter how obvious the mistake is, you can stare at it 100 times and still miss it, because you see what you "know" is there rather than what is actually there.

In my experience there are two good ways to solve this kind of problem.


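  1. Take it to someone else and ask them to find what you're doing differently. Because they see what is there, not what you "know" is there, they will often see what you can't. (In the fullness of time, you can return the favor some day.)

  2. Start with the working configuration and the non-working one, and bisect the path between them until you get a minimal difference that separates working from non-working. Whittle the difference down and you either know what to fix, or have a test case to give to someone else.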
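
The bisection in the second approach can be sketched in a few lines of Python. This is only an illustration under assumptions: `changes` is a hypothetical ordered list of the differences between the working setup and the failing one, and `works` is a hypothetical callback that deploys the baseline plus the given changes and reports whether failover still succeeds. It also assumes a single change is responsible, so that once the setup breaks it stays broken as more changes are applied.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_breaking_change(
    changes: Sequence[T],
    works: Callable[[Sequence[T]], bool],
) -> T:
    """Return the change at which the setup stops working.

    `changes` and `works` are hypothetical stand-ins: in practice
    `works` would apply the changes to a test system and try a failover.
    """
    if not works(changes[:0]):
        raise ValueError("the baseline configuration does not work")
    if works(changes):
        raise ValueError("all changes applied still works; nothing to bisect")

    # Invariant: works(changes[:lo]) is True, works(changes[:hi]) is False.
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(changes[:mid]):
            lo = mid
        else:
            hi = mid
    return changes[hi - 1]
```

With n differences this needs roughly log2(n) deployments instead of n. If several changes interact, the result is still a change that flips the outcome, but not necessarily the only one involved.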

Odds are that you'll need to follow the second approach. You probably don't want to - I never do - but it usually is less painful than you imagine. You start by replicating the full application on a test system, and demonstrating that you have the same failure. (If you don't, then you start looking, carefully, for differences between production and test. In particular look at things like operating system version, library versions, and the like.)

Assuming that you have a test system, save that configuration. Then start ripping out large chunks of your actual application that you imagine have nothing to do with your configuration problems, testing periodically that you are on the right path. (And saving every time that you are.) Once you have a minimal application, start trying to walk it over towards the working test application. Somewhere you'll find a change that makes a difference. It could be anywhere. Once you have found it, you'll usually know exactly how to fix your production system. Or if you don't, you'll know your problem fairly clearly.
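
The "rip out chunks, test, save" loop can be sketched as follows. This is a rough illustration, not a full delta-debugging implementation: `components` is a hypothetical list of removable parts of the application, and `still_fails` is a hypothetical callback that builds the application from the given parts and reports whether the failover problem still reproduces. It tries large chunks first, then smaller ones, and keeps every reduction that preserves the failure.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def minimize(
    components: Sequence[T],
    still_fails: Callable[[List[T]], bool],
) -> List[T]:
    """Greedily drop components while the failure still reproduces.

    `components` and `still_fails` are hypothetical stand-ins for
    parts of the real application and a manual or scripted failover test.
    """
    kept = list(components)
    if not still_fails(kept):
        raise ValueError("the full application does not show the failure")

    chunk = max(1, len(kept) // 2)
    while True:
        i = 0
        while i < len(kept):
            candidate = kept[:i] + kept[i + chunk:]
            if still_fails(candidate):
                kept = candidate  # "save" the smaller failing state
            else:
                i += chunk  # this chunk is needed; move past it
        if chunk == 1:
            break
        chunk //= 2
    return kept
```

A single pass per chunk size does not guarantee the smallest possible result, but it usually shrinks the application enough to make the comparison with the working test application manageable.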

Sometimes you'll have found a weird bug. If so, you should then start simplifying everything as much as possible until you have a nice bug report to send in.
