NodeJS - What does "socket hang up" actually mean?

Problem description

I'm building a web scraper with Node and Cheerio, and for a certain website I'm getting the following error (it only happens on this one website, not on any of the others I try to scrape).

It happens at a different location every time, so sometimes it's url x that throws the error, other times url x is fine and it's a different url entirely:

    Error!: Error: socket hang up using [insert random URL, it's different every time]

Error: socket hang up
    at createHangUpError (http.js:1445:15)
    at Socket.socketOnEnd [as onend] (http.js:1541:23)
    at Socket.g (events.js:175:14)
    at Socket.EventEmitter.emit (events.js:117:20)
    at _stream_readable.js:910:16
    at process._tickCallback (node.js:415:13)

This is very tricky to debug, I don't really know where to start. To begin, what IS a socket hang up error? Is it a 404 error or similar? Or does it just mean that the server refused a connection?

I can't find an explanation of this anywhere!

Here is a sample of the code that (sometimes) returns errors:

function scrapeNexts(url, oncomplete) {
    request(url, function(err, resp, body) {

        if (err) {
            console.log("Uh-oh, ScrapeNexts Error!: " + err + " using " + url);
            errors.nexts.push(url);
        }
        $ = cheerio.load(body);
        // do stuff with the '$' cheerio content here
    });
}

There is no direct call to close the connection, but I'm using Node Request which (as far as I can tell) uses http.get so this is not required, correct me if I'm wrong!

EDIT 2: Here's an actual, in-use bit of code that is causing errors. prodURL and other variables are mostly jquery selectors that are defined earlier. This uses the async library for Node.

function scrapeNexts(url, oncomplete) {
    request(url, function (err, resp, body) {

        if (err) {
            console.log("Uh-oh, ScrapeNexts Error!: " + err + " using " + url);
            errors.nexts.push(url);
        }
        async.series([
                function (callback) {
                    $ = cheerio.load(body);
                    callback();
                },
                function (callback) {
                    $(prodURL).each(function () {
                        var theHref = $(this).attr('href');
                        urls.push(baseURL + theHref);
                    });
                    var next = $(next_select).first().attr('href');
                    oncomplete(next);
                }
            ]);
    });
}

Recommended answer

There are two cases in which socket hang up is thrown:

When you, as a client, send a request to a remote server and receive no timely response. Your socket is ended, which throws this error. You should catch this error and decide how to handle it: whether to retry the request, queue it for later, etc.

When you, as a server, perhaps a proxy server, receive a request from a client, then start acting upon it (or relay the request to the upstream server), and before you have prepared the response, the client decides to cancel/abort the request.
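For the first (client) case, catching the error and retrying might look roughly like the sketch below. It is only an illustration: `isSocketHangUp`, `fetchBody` and `withRetry` are hypothetical helper names, the retry count and delay are arbitrary, and it uses Node's built-in `http.get` (plain `http:` URLs only) rather than the Request library from the question.

```javascript
const http = require('http');

// Hypothetical helper: true for the error Node raises when the socket closes
// before a response arrives. Note that ECONNRESET also covers other resets.
function isSocketHangUp(err) {
  return Boolean(err) &&
    (err.code === 'ECONNRESET' || err.message === 'socket hang up');
}

// Hypothetical helper: fetch a URL's body with http.get, wrapped in a Promise.
function fetchBody(url) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
      res.on('error', reject);
    });
    // "socket hang up" is emitted on the request object, so listen here.
    req.on('error', reject);
  });
}

// Run `attempt` (a function returning a Promise) and retry it up to `retries`
// more times, but only when the failure looks like a socket hang up.
async function withRetry(attempt, retries = 3, delayMs = 500) {
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (!isSocketHangUp(err) || i >= retries) throw err;
      // Wait a little longer before each new attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (i + 1)));
    }
  }
}
```

A call could then look like `withRetry(() => fetchBody(url)).then(handleBody, handleFailure)`, where `handleBody` and `handleFailure` are your own callbacks. URLs that still fail after the retries can be pushed onto a list and queued for later.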

This stack trace shows what happens when a client cancels the request.

Trace: { [Error: socket hang up] code: 'ECONNRESET' }
    at ClientRequest.proxyError (your_server_code_error_handler.js:137:15)
    at ClientRequest.emit (events.js:117:20)
    at Socket.socketCloseListener (http.js:1526:9)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:465:12)

Line http.js:1526:9 points to the same socketCloseListener mentioned above by @Blender, particularly:

// This socket error fired before we started to
// receive a response. The error needs to
// fire on the request.
req.emit('error', createHangUpError());

...

function createHangUpError() {
  var error = new Error('socket hang up');
  error.code = 'ECONNRESET';
  return error;
}

This is a typical case if the client is a user in the browser. The request to load some resource/page takes long, and users simply refresh the page. Such action causes the previous request to get aborted which on your server side throws this error.

Since this error is caused by the client's own action, the client does not expect to receive any error message, so there is no need to treat this error as critical. Just ignore it. This is reinforced by the fact that, on such an error, the res socket that your client listened to is destroyed, even though it is still writable.

console.log(res.socket.destroyed); //true

So there is no point in sending anything, except explicitly closing the response object:

res.end();
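Put together, the check and the explicit close could be wrapped in a small helper. This is a minimal sketch, not part of Node itself: `safeEnd` and `createSlowServer` are hypothetical names, and the delayed response only stands in for whatever slow work your handler does.

```javascript
const http = require('http');

// Hypothetical helper: send `body` only if the client is still connected.
// Returns true when the body was sent, false when the client had gone away.
function safeEnd(res, body) {
  const clientGone = !res.socket || res.socket.destroyed;
  if (clientGone) {
    // Nothing useful can be delivered; just close the response object.
    res.end();
    return false;
  }
  res.end(body);
  return true;
}

// Illustrative server whose handler answers only after `delayMs`, which gives
// a client time to abort (for example, by refreshing the page).
function createSlowServer(delayMs) {
  return http.createServer((req, res) => {
    setTimeout(() => safeEnd(res, 'done\n'), delayMs);
  });
}
```

To try it, something like `createSlowServer(5000).listen(3000)` (the port and delay are arbitrary) and a request cancelled before the five seconds are up should take the `clientGone` branch.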

However, if you are a proxy server that has already relayed the request to the upstream, what you should definitely do is abort your internal request to the upstream. This indicates your lack of interest in the response, which in turn tells the upstream server that it can, perhaps, stop an expensive operation.
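One possible way to wire this up in a small proxy is sketched below. It assumes a reasonably recent Node version (for `res.writableFinished` and `req.destroy()`); `abortUpstreamOnClientClose` and `createProxy` are hypothetical names, and header forwarding is left out to keep the example short.

```javascript
const http = require('http');

// Hypothetical helper: tie the upstream request's lifetime to the client's
// response. 'close' also fires after a normal finish, so the upstream request
// is destroyed only when the response never completed.
function abortUpstreamOnClientClose(clientRes, upstreamReq) {
  clientRes.on('close', () => {
    if (!clientRes.writableFinished && !upstreamReq.destroyed) {
      upstreamReq.destroy();
    }
  });
}

// Illustrative proxy: relays each incoming request to the host and port given
// in `upstreamOptions` (an assumption of this sketch).
function createProxy(upstreamOptions) {
  return http.createServer((clientReq, clientRes) => {
    const upstreamReq = http.request(
      { ...upstreamOptions, method: clientReq.method, path: clientReq.url },
      (upstreamRes) => {
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      }
    );

    upstreamReq.on('error', () => {
      // After a client abort, destroying the upstream request surfaces here
      // as an error too; the client is gone, so there is nothing to send.
      if (!clientRes.socket || clientRes.socket.destroyed) return;
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });

    abortUpstreamOnClientClose(clientRes, upstreamReq);
    clientReq.pipe(upstreamReq);
  });
}
```

For example, `createProxy({ host: '127.0.0.1', port: 8080 }).listen(3000)` would relay to a local upstream; both addresses are placeholders.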
