Grails Quartz2 job stops randomly


Problem description

I am experiencing a situation where one of the jobs in my Grails app stops running for no apparent reason. No exception is thrown. We are using Grails 2.2.3 with the Quartz2 plugin. Interestingly, all the other jobs keep running; only one particular job freezes time and again. This job calls a third-party REST API that sometimes responds very slowly and, in a few instances, does not respond at all. All the jobs are set to concurrent = false. Can someone point me in the right direction? I have been struggling with this issue for two days. A few of the things I have tried:

  1. Changed/simplified the implementation of the task that the job processes. The job still makes the REST API call. Response times are sometimes very long (up to 20 minutes), and less often we get a ConnectionTimeOut exception.
  2. Enabled Quartz logging. The job freezes and the log shows no error message.
  3. Installed the Grails Quartz monitor plugin. We made it inline and tweaked it to run with the Quartz2 plugin. It just shows the usual quartz/list.

I have not been able to resolve the issue yet and am running out of ideas. Has anyone come across such a situation and have some tips to share? Thanks.

NOTE: For now we have removed the call to the slow third-party REST API to see whether the job(s) run fine for extended periods. My guess is that the server sometimes kills processes that take too long or time out regularly.

Solution

We have been able to solve this riddle. The API calls to one of the third-party servers were getting no response for up to 40-50 minutes, after which the server would time out and close the connection. We had used multi-threading within each run of the job, and because of a buggy implementation it was not giving us true concurrent = false behavior; as a result we had thousands of open connections to this third-party server, most of them receiving no response at all (for 40-50 minutes). Our guess, and it is only a guess, is that after a while this caused that particular job/scheduler to freeze.
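The answer does not show the job's threading code, so the following is only a rough sketch, in plain Java, of one way to keep a single job run from piling up threads: a fixed-size pool plus a deadline for the whole batch. The class `BoundedCalls`, the method `countFinished`, and the sleeping task that stands in for a hung API call are all hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not taken from the original job.
class BoundedCalls {

    /**
     * Runs the tasks on at most maxThreads threads and waits at most
     * timeoutMillis for the whole batch. Tasks still unfinished at the
     * deadline are cancelled. Returns how many tasks finished in time
     * (normally or with an exception).
     */
    static int countFinished(List<Callable<String>> tasks, int maxThreads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // invokeAll returns once all tasks are done or the timeout expires;
            // tasks that did not finish are cancelled before it returns.
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            int finished = 0;
            for (Future<String> future : futures) {
                if (!future.isCancelled()) {
                    finished++;
                }
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for the batch", e);
        } finally {
            pool.shutdownNow(); // never leave worker threads behind after a run
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "fast");
        // Stands in for an API call that hangs far longer than we want to wait.
        tasks.add(() -> { Thread.sleep(10_000); return "slow"; });
        System.out.println("finished in time: " + countFinished(tasks, 2, 500));
    }
}
```

Cancellation here works by interrupting the worker thread, and a thread blocked in a classic blocking socket read generally does not react to interruption. A pool like this therefore caps the number of open connections but does not replace proper connection and read time-outs on the HTTP calls themselves.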

We found two solutions to the problem:

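  1. Set a shorter connection time-out and read time-out on our outgoing API requests. It is worth reading up on the difference between the two: the connection time-out bounds how long establishing the connection may take, while the read time-out bounds how long a read may block waiting for data. Here is the code we wrote: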

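    URL url = new URL(urlString)
    HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection()
    // Give up if the connection cannot be established within 5 minutes
    httpURLConnection.setConnectTimeout(5 * 1000 * 60)
    // Give up if a read blocks for more than 8 minutes
    httpURLConnection.setReadTimeout(8 * 1000 * 60)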

  2. The second solution we tested successfully was to trigger the API calls by requesting an action/URL of our app from the Linux crontab utility. We hit a particular URL in our app, which in turn makes the API call to the third party, so in this case we removed the whole Quartz scheduler/plugin dependency from our app. The only downside of this approach is that the REST API calls are triggered from outside the app's code-base: if we build a WAR of our app and deploy it on another machine, we have to configure the Linux crontab there as well.

We finally implemented the first solution (connection/read time-outs) because it keeps the solution within the code-base itself, which is not possible with the crontab utility.
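To see the read time-out at work without depending on the real third-party server, here is a small self-contained sketch in plain Java. It assumes a local `ServerSocket` that accepts connections but never replies as a stand-in for the unresponsive API; the class `TimeoutDemo` and its methods are hypothetical names chosen for this example, and the time-out values are far shorter than the ones we used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

// Hypothetical demo class, not part of the original application.
class TimeoutDemo {

    /** Requests the URL with explicit time-outs and returns the HTTP status code. */
    static int fetchStatus(String urlString, int connectMillis, int readMillis) throws IOException {
        URL url = new URL(urlString);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(connectMillis); // limit for establishing the connection
        connection.setReadTimeout(readMillis);       // limit for each blocking read
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    /**
     * Starts a local socket that is bound and listening but never answers,
     * then reports whether the request was cut off by the read time-out.
     */
    static boolean timesOutAgainstSilentServer(int readMillis) {
        try (ServerSocket silent = new ServerSocket(0)) { // port 0: any free port
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            try {
                fetchStatus(url, 1000, readMillis);
                return false; // a response arrived, which this server never sends
            } catch (SocketTimeoutException expected) {
                return true;  // the read time-out fired instead of hanging
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + timesOutAgainstSilentServer(500));
    }
}
```

Without the `setReadTimeout` call the default is 0, which means wait indefinitely, so the same request would block for as long as the server keeps the connection open. That matches the behavior we saw with the third-party server.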

Hope this helps someone or gives them pointers on where to look.
