NodeJS - What does "socket hang up" actually mean?

Question

I'm building a web scraper with Node and Cheerio, and for a certain website I'm getting the following error (it only happens on this one website, not on any of the others I try to scrape).

It happens at a different location every time, so sometimes it's url x that throws the error, other times url x is fine and it's a different url entirely:

    Error!: Error: socket hang up using [insert random URL, it's different every time]

Error: socket hang up
    at createHangUpError (http.js:1445:15)
    at Socket.socketOnEnd [as onend] (http.js:1541:23)
    at Socket.g (events.js:175:14)
    at Socket.EventEmitter.emit (events.js:117:20)
    at _stream_readable.js:910:16
    at process._tickCallback (node.js:415:13)

This is very tricky to debug, and I don't really know where to start. To begin, what IS a socket hang up error? Is it a 404 error or similar? Or does it just mean that the server refused a connection?

I can't find an explanation of this anywhere!

Here's a sample of the code that (sometimes) returns errors:

function scrapeNexts(url, oncomplete) {
    request(url, function(err, resp, body) {

        if (err) {
            console.log("Uh-oh, ScrapeNexts Error!: " + err + " using " + url);
            errors.nexts.push(url);
        }
        $ = cheerio.load(body);
        // do stuff with the '$' cheerio content here
    });
}

There is no direct call to close the connection, but I'm using Node Request, which (as far as I can tell) uses http.get, so this should not be required. Correct me if I'm wrong!

EDIT 2: Here's an actual, in-use bit of code that is causing errors. prodURL and the other variables are mostly jQuery selectors that are defined earlier. This uses the async library for Node.

function scrapeNexts(url, oncomplete) {
    request(url, function (err, resp, body) {

        if (err) {
            console.log("Uh-oh, ScrapeNexts Error!: " + err + " using " + url);
            errors.nexts.push(url);
        }
        async.series([
                function (callback) {
                    $ = cheerio.load(body);
                    callback();
                },
                function (callback) {
                    $(prodURL).each(function () {
                        var theHref = $(this).attr('href');
                        urls.push(baseURL + theHref);
                    });
                    var next = $(next_select).first().attr('href');
                    oncomplete(next);
                }
            ]);
    });
}

Recommended answer

There are two cases in which socket hang up gets thrown:

When you, as a client, send a request to a remote server and receive no timely response. Your socket is ended, which throws this error. You should catch this error and decide how to handle it: whether to retry the request, queue it for later, etc.
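The catch-and-retry advice for the client case could look roughly like the sketch below. It uses Node's built-in http module rather than the request library from the question, and it only handles plain http URLs. The helper names isSocketHangUp and getWithRetry, the retry count, and the 500 ms delay are all hypothetical choices made for illustration.

```javascript
const http = require('http');

// Hypothetical helper: true for the error Node raises when the socket
// closes before a response has been received.
function isSocketHangUp(err) {
  return Boolean(err) &&
    (err.code === 'ECONNRESET' || err.message === 'socket hang up');
}

// Hypothetical helper: GET a plain-http URL and retry a limited number
// of times when the request fails with a socket hang up.
function getWithRetry(url, retriesLeft, callback) {
  let settled = false; // make sure we report or retry only once
  const done = (err, body) => {
    if (settled) return;
    settled = true;
    callback(err, body);
  };

  const req = http.get(url, (res) => {
    const chunks = [];
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('end', () => done(null, Buffer.concat(chunks).toString()));
    res.on('error', done);
  });

  req.on('error', (err) => {
    if (settled) return;
    if (isSocketHangUp(err) && retriesLeft > 0) {
      settled = true;
      // Assumed policy: wait 500 ms, then try again with one fewer retry.
      setTimeout(() => getWithRetry(url, retriesLeft - 1, callback), 500);
      return;
    }
    done(err);
  });
}

// Example usage (not executed here):
// getWithRetry('http://example.com/', 3, (err, body) => {
//   if (err) return console.error('giving up:', err.message);
//   console.log(body.length);
// });
```

Whether to retry immediately, back off, or queue the URL for later depends on how the remote site behaves; the fixed delay here is only a placeholder.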

When you, as a server (perhaps a proxy server), receive a request from a client, start acting upon it (or relay the request to the upstream server), and the client decides to cancel/abort the request before you have prepared the response.

This stack trace shows what happens when a client cancels the request.

Trace: { [Error: socket hang up] code: 'ECONNRESET' }
    at ClientRequest.proxyError (your_server_code_error_handler.js:137:15)
    at ClientRequest.emit (events.js:117:20)
    at Socket.socketCloseListener (http.js:1526:9)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:465:12)

Line http.js:1526:9 points to the same socketCloseListener mentioned by @Blender, in particular:

// This socket error fired before we started to
// receive a response. The error needs to
// fire on the request.
req.emit('error', createHangUpError());

...

function createHangUpError() {
  var error = new Error('socket hang up');
  error.code = 'ECONNRESET';
  return error;
}
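To see the error in isolation, here is a minimal sketch that triggers it on purpose: a local server destroys the socket as soon as a request arrives, so the client never receives a response. The function name reproduceHangUp is hypothetical, and the sketch assumes a reasonably recent Node version.

```javascript
const http = require('http');

// Hypothetical demo: start a throwaway server that drops every
// connection, send it one request, and hand the resulting error
// to the callback.
function reproduceHangUp(callback) {
  const server = http.createServer((req) => {
    // Close the connection without sending any response.
    req.socket.destroy();
  });

  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    const req = http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      // Not expected in this demo: the server never responds.
      res.resume();
      server.close(() => callback(null));
    });

    // The socket closed before any response arrived, so the error
    // is emitted on the request object.
    req.on('error', (err) => {
      server.close(() => callback(err));
    });
  });
}

// Example usage (not executed here):
// reproduceHangUp((err) => console.log(err && err.message, err && err.code));
```

The point of the demo is that no HTTP status is involved at all: the connection ends before any response exists, which is why the error is not comparable to a 404.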

This is a typical case when the client is a user in a browser. The request to load some resource or page takes a long time, and the user simply refreshes the page. That action aborts the previous request, which throws this error on your server side.

Since this error is caused by the client's own action, the client doesn't expect to receive any error message, so there is no need to treat this error as critical. Just ignore it. This is supported by the fact that, when this error occurs, the res socket that your client was listening on is already destroyed, even though it is still writable.

console.log(res.socket.destroyed); //true

So there is no point in sending anything, other than explicitly closing the response object:

res.end();

However, if you are a proxy server that has already relayed the request upstream, what you should definitely do is abort your internal request to the upstream server. That signals your lack of interest in the response, which in turn may let the upstream server stop an expensive operation.
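A minimal sketch of that proxy behaviour, assuming a recent Node version and the built-in http module only. The createProxy function, the upstream address in the usage comment, and the 502 fallback are hypothetical, not something the answer prescribes.

```javascript
const http = require('http');

// Hypothetical proxy factory: `upstream` is an assumed { host, port } pair.
function createProxy(upstream) {
  return http.createServer((clientReq, clientRes) => {
    let clientGone = false;

    const upstreamReq = http.request({
      host: upstream.host,
      port: upstream.port,
      method: clientReq.method,
      path: clientReq.url,
      headers: clientReq.headers,
    }, (upstreamRes) => {
      // Relay the upstream response back to the client.
      clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(clientRes);
    });

    upstreamReq.on('error', () => {
      // If the client already went away, this is the expected hang up
      // caused by our own destroy() below; there is nobody to answer.
      if (clientGone) return;
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });

    // 'close' also fires after a normal finish, so only treat it as an
    // abort when the response was never ended.
    clientRes.on('close', () => {
      if (!clientRes.writableEnded) {
        clientGone = true;
        upstreamReq.destroy(); // tell the upstream we lost interest
      }
    });

    // Forward the client's request body to the upstream.
    clientReq.pipe(upstreamReq);
  });
}

// Example usage (not executed here), assuming an upstream on port 9000:
// createProxy({ host: '127.0.0.1', port: 9000 }).listen(8080);
```

Destroying the upstream request is what makes the upstream server see its own aborted client, so the cancellation propagates down the chain instead of stopping at the proxy.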
