Node.js requests randomly begin to hang and won't clear until server restart


Question

I've been running into a really odd and seemingly random issue on our web app that I can't seem to debug. It runs fine for anywhere from 10 minutes to 6 hours, and then all of a sudden no remote requests to or from the server can be made; they just hang (this includes regular HTTP and WebSocket requests). The odd thing is that going to the site normally still works until the OS file descriptor limit is reached, at which point HTTP crashes completely under all of the stalled connections.

No errors are reported apart from the following one, which is thrown when the issue begins (I assume this is a side effect of whatever is going on rather than the cause).

TypeError: Cannot read property '0' of null
    at null.<anonymous> (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/collection.js:504:22)
    at args.(anonymous function) (/app/node_modules/strong-agent/lib/proxy.js:85:18)
    at g (events.js:175:14)
    at EventEmitter.emit (events.js:98:17)
    at Base.__executeAllServerSpecificErrorCallbacks (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/base.js:315:29)
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:273:22
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:370:11
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:352:28
    at _callback (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/db.js:670:5)
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/auth/mongodb_cr.js:47:13

I've tried raising the file descriptor limits and the global agent maxSockets, with no effect on this behavior. There's no influx of traffic when this happens, and it happens equally often during peak and off-peak times. CPU usage consistently stays below 5% and shows no perceptible change leading up to or during the crash. The server also never drops below 1GB of free memory.

The stack: SmartOS cloud server (Joyent), Express, Socket.io, MongoDB and Redis.

I've been debugging this for several days and have completely run out of ideas about where to look. I'm hoping someone on SO has run into something similar or has different ideas of what can be tried or tested.

Recommended answer

After countless hours of debugging, I finally found the culprit. An error was being thrown inside several different mongojs callbacks, and it appears to have bubbled up and blocked the connections from closing. Over time this reached a tipping point, and connections started hanging until the file descriptor limit was reached.
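To illustrate the failure mode described above, here is a minimal sketch of how an exception thrown inside a callback can skip the code that closes a connection, and how a guard can still release it. The `guard` helper and the fake `res` object are hypothetical names made up for this example; they are not part of mongojs, Express, or Now.js, and this is not the patch that was applied.

```javascript
// Hypothetical helper: wraps a callback so that an exception thrown inside it
// is handed to onError instead of propagating up to the caller.
function guard(callback, onError) {
  return function guarded() {
    try {
      return callback.apply(this, arguments);
    } catch (err) {
      onError(err);
    }
  };
}

// Fake response object standing in for an HTTP response (an assumption made
// so the example runs without a server).
const res = {
  ended: false,
  status: null,
  end(status) {
    this.ended = true;
    this.status = status;
  }
};

// A database-style callback that throws before it reaches res.end().
const handler = guard(
  function (err, docs) {
    const first = docs[0]; // throws a TypeError when docs is null
    res.end(200);          // never reached in that case
  },
  function (err) {
    console.log(err.stack);
    res.end(500);          // still close the response so the socket is released
  }
);

// Simulate the driver calling back with no documents.
handler(null, null);
```

Without the guard, the `TypeError` would leave `res.end()` uncalled, and the underlying socket would stay open until something else closed it.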

The error turned out to be in the Now.js node module (which has been abandoned). If anyone out there is running into this issue while using Now.js, I forked it and patched the bug. You can find the commit here: https://github.com/goldfire/now/commit/b5bd54f8950602f752a710c606be6754b759cab2.

The way I found this bug was to attach an error listener to the DB object:

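var db = require('mongojs').connect('...', ['collection']);
// Log the full stack of any error emitted by the underlying client,
// so errors that would otherwise go unnoticed show up in the logs.
db.client.on('error', function(err){
  console.log(err.stack);
});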
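
The same pattern can be tried without a database. The sketch below uses Node's built-in `EventEmitter` as a stand-in for the client; `attachErrorLogger` and `fakeClient` are hypothetical names for this example only, and the emitted error simply reuses the message from the stack trace in the question.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attach an 'error' listener that records the stack
// of every error the emitter reports.
function attachErrorLogger(emitter, log) {
  emitter.on('error', function (err) {
    log(err && err.stack ? err.stack : String(err));
  });
  return emitter;
}

// Stand-in for a database client (an assumption; not the mongojs API).
const fakeClient = new EventEmitter();
const logged = [];

attachErrorLogger(fakeClient, function (line) {
  logged.push(line);
  console.log(line);
});

// Simulate the client reporting an error. With the listener attached,
// the error is logged; with no 'error' listener, emit() would throw it.
fakeClient.emit('error', new TypeError("Cannot read property '0' of null"));
```

Logging `err.stack` rather than `err.message` matters here, because the stack is what points at the module that threw.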
