Is making sequential HTTP requests a blocking operation in node?



Note that irrelevant information to my question will be 'quoted'

like so (feel free to skip these).

Problem

I am using node to make in-order HTTP requests on behalf of multiple clients. This way, what originally took the client(s) several different page loads to get the desired result now only takes a single request through my server. I am currently using the ‘async’ module for flow control and the ‘request’ module for making the HTTP requests. There are approximately 5 callbacks which, measured with console.time, take about 2 seconds from start to finish (sketch code included below).

Now I am rather inexperienced with node, but I am aware of its single-threaded nature. While I have read many times that node isn’t built for CPU-bound tasks, I didn’t really understand what that meant until now. If I have a correct understanding of what’s going on, this means that what I currently have (in development) is in no way going to scale to even more than 10 clients.

Question

Since I am not an expert at node, I ask this question (in the title) to get a confirmation that making several sequential HTTP requests is indeed blocking.

Epilogue

If that is the case, I expect I will ask a different SO question (after doing the appropriate research) discussing various possible solutions, should I choose to continue approaching this problem in node (which itself may not be suitable for what I'm trying to do).

Other closing thoughts

I am truly sorry if this question was not detailed enough, too noobish, or had particularly flowery language (I try to be concise).

Thanks and all the upvotes to anyone who can help me with my problem!

The code I mentioned earlier:

var async = require('async');
var request = require('request');

...

async.waterfall([
    function(cb) {
        console.time('1');

        request(someUrl1, function(err, res, body) {
            // load and parse the given web page.

            // make a callback with data parsed from the web page
        });
    },
    function(someParameters, cb) {
        console.timeEnd('1');
        console.time('2');

        request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
            // more computation

            // make a callback with a session cookie given by the visited url
        });
    },
    function(jar, cb) {
        console.timeEnd('2');
        console.time('3');

        request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
            // do more parsing + computation

            // make another callback with the results
        });
    },
    function(moreParameters, cb) {
        console.timeEnd('3');
        console.time('4');

        request({url: someUrl4, method: 'POST', jar: jar, form : {/*data*/}}, function(err, res, body) {
            // make final callback after some more computation.
            //This part takes about ~1s to complete
        });
    }
], function (err, result) {
    console.timeEnd('4');
    res.status(200).send();
});

Solution

Normally, I/O in node.js is non-blocking. You can test this out by making several requests to your server simultaneously. For example, if each request takes 1 second to process, a blocking server would take 2 seconds to process 2 simultaneous requests, but a non-blocking server would take just a bit more than 1 second to process both.
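The two-simultaneous-requests claim can be sketched without a server at all. In this minimal illustration, `setTimeout` stands in for a non-blocking 1-second I/O operation (it is not real HTTP, just a stand-in):

```javascript
// Two simulated 1-second "requests" handled concurrently.
// Because the waits are non-blocking, both callbacks fire after
// roughly 1 second total, not 2.
var start = Date.now();
var done = 0;

function handleRequest(name) {
    setTimeout(function () {
        done++;
        console.log(name + ' finished after ' + (Date.now() - start) + 'ms');
        if (done === 2) {
            console.log('total: ' + (Date.now() - start) + 'ms');
        }
    }, 1000);
}

handleRequest('request A');
handleRequest('request B');
```

With a blocking wait, the second "request" could not even start until the first finished, and the total would be about 2 seconds.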

However, you can deliberately make requests blocking by using the sync-request module instead of request. Obviously, that's not recommended for servers.

Here's a bit of code to demonstrate the difference between blocking and non-blocking I/O:

var req = require('request');
var sync = require('sync-request');

// Load example.com N times (yes, it's a real website):
var N = 10;

console.log('BLOCKING test ==========');
var start = new Date().valueOf();
for (var i=0;i<N;i++) {
    var res = sync('GET', 'http://www.example.com');
    console.log('Downloaded ' + res.getBody().length + ' bytes');
}
var end = new Date().valueOf();
console.log('Total time: ' + (end-start) + 'ms');

console.log('NON-BLOCKING test ======');
var loaded = 0;
var start = new Date().valueOf();
for (var i=0;i<N;i++) {
    req('http://www.example.com',function( err, response, body ) {
        loaded++;
        console.log('Downloaded ' + body.length + ' bytes');
        if (loaded == N) {
            var end = new Date().valueOf();
            console.log('Total time: ' + (end-start) + 'ms');
        }
    })
}

Running the code above, you'll see that the non-blocking test takes roughly the same amount of time to process all the requests as it takes to process a single one (for example, with N = 10 the non-blocking code runs about 10 times faster than the blocking code). This clearly illustrates that the requests are non-blocking.


Additional answer:

You also mentioned that you're worried about your process being CPU intensive. But in your code, you're not benchmarking CPU usage: you're mixing network request time (I/O, which we know is non-blocking) with CPU processing time. To measure how much time each request actually spends blocking, change your code to this:

async.waterfall([
    function(cb) {
        request(someUrl1, function(err, res, body) {
            console.time('1');
            // load and parse the given web page.
            console.timeEnd('1');
            // make a callback with data parsed from the web page
        });
    },
    function(someParameters, cb) {
        request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
            console.time('2');
            // more computation
            console.timeEnd('2');

            // make a callback with a session cookie given by the visited url
        });
    },
    function(jar, cb) {
        request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
            console.time('3');
            // do more parsing + computation
            console.timeEnd('3');
            // make another callback with the results
        });
    },
    function(moreParameters, cb) {
        request({url: someUrl4, method: 'POST', jar: jar, form : {/*data*/}}, function(err, res, body) {
            console.time('4');
            // some more computation.
            console.timeEnd('4');

            // make final callback
        });
    }
], function (err, result) {
    res.status(200).send();
});

Your code only blocks in the "more computation" parts. So you can completely ignore any time spent waiting for the other parts to execute. In fact, that's exactly how node can serve multiple requests concurrently. While waiting for the other parts to call the respective callbacks (you mention that it may take up to 1 second) node can execute other javascript code and handle other requests.
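You can see this distinction directly with a small sketch (illustrative only): a synchronous busy-loop standing in for "more computation" delays a timer that was due in 10 ms, while simply *waiting* on the timer costs the event loop nothing:

```javascript
// Only synchronous computation blocks the event loop.
var scheduled = Date.now();
var delay; // how late the timer actually fired

setTimeout(function () {
    delay = Date.now() - scheduled;
    console.log('timer fired ' + delay + 'ms after scheduling (wanted 10ms)');
}, 10);

// ~200ms of synchronous "more computation" -- nothing else can run meanwhile
var stop = Date.now() + 200;
while (Date.now() < stop) { /* spin */ }
// only now does control return to the event loop, so the timer fires late
```

The timer fires roughly 200 ms late, even though it was scheduled for 10 ms: the busy-loop is the only part that blocks, exactly like the "more computation" sections in your waterfall.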
