Node.js: Async requests with a list of URLs

Question

I am working on a crawler and have a list of URLs that need to be requested. If I don't throttle them, several hundred requests are sent at the same time. I am afraid this will saturate my bandwidth or put too much load on the target website. What should I do?

Here is what I am doing:

urlList.forEach((url, index) => {

    console.log('Fetching ' + url);
    request(url, function(error, response, body) {
        //do sth for body

    });
});

I want each request to start only after the previous one has completed.

Recommended answer

You can use a Promise library such as Bluebird, whose Promise.map accepts a concurrency option. For example:

const Promise = require("bluebird");
const axios = require("axios");

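// Axios wrapper that never rejects: it resolves to { data, error } instead
const axios_wrapper = (options) => {
    // Pass the config object directly; spreading a plain object into call arguments throws a TypeError
    return axios(options)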
        .then((r) => {
            return Promise.resolve({
                data: r.data,
                error: null,
            });
        })
        .catch((e) => {
            return Promise.resolve({
                data: null,
                error: e.response ? e.response.data : e,
            });
        });
};

Promise.map(
    urlList, // the array of URLs from the question
    (k) => {
        return axios_wrapper({
            method: "GET",
            url: k,
        });
    },
    { concurrency: 1 } // Maximum number of requests in flight at once; 1 means one after another
)
    .then((r) => {
        console.log(r);
        // r is an array of objects such as { data: [{}], error: null }.
        // If a request succeeded, data holds the response body;
        // otherwise data is null and error is non-null.
    })
    .catch((e) => {
        console.error(e);
    });
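
If you would rather not add a dependency, the same idea can be sketched with plain async/await. This is only an illustration of limiting concurrency, not how Bluebird implements Promise.map; the helper name mapWithLimit and its parameters are hypothetical, and the commented usage assumes Node.js 18+ where fetch is available as a global.

```javascript
// Hypothetical helper: at most `limit` calls to `worker` are in flight at once.
// Results are returned in the same order as `items`.
async function mapWithLimit(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0; // index of the next item to hand out

    // Each runner repeatedly claims the next unclaimed item until none are left.
    async function runner() {
        while (next < items.length) {
            const i = next++; // claimed synchronously, so no two runners share an index
            results[i] = await worker(items[i], i);
        }
    }

    // Start up to `limit` runners and wait for all of them to finish.
    const runners = [];
    for (let n = 0; n < Math.min(limit, items.length); n++) {
        runners.push(runner());
    }
    await Promise.all(runners);
    return results;
}

// Example usage (assumes Node.js 18+ with a global fetch):
// mapWithLimit(urlList, 1, (url) => fetch(url).then((res) => res.text()))
//     .then((bodies) => console.log(bodies.length));
```

With a limit of 1 this behaves like the sequential loop asked for in the question. If the worker rejects, the whole call rejects, so wrapping the worker the way axios_wrapper does above keeps one failed URL from aborting the rest.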
