How to stream read directory in node.js?

Problem description

Suppose I have a directory that contains 100K+ or even 500K+ files. I want to read the directory with fs.readdir, but it's async, not a stream. Someone told me that the async call holds the entire file list in memory before the read is done.

So what is the solution? I want to readdir with a stream approach. Can I?

Answer

On a modern computer, traversing a directory with 500K files is nothing. When you call fs.readdir asynchronously in Node.js, all it does is read the list of file names in the specified directory; it doesn't read the files' contents. I've just tested with 700K files in a directory: loading this list of file names takes only 21MB of memory.
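For reference, a minimal sketch of how that kind of measurement can be reproduced (the directory path /tmp/many-files is a placeholder, and heap deltas are rough since the garbage collector can skew them):

var fs = require('fs');

// Heap usage before loading the name list.
var before = process.memoryUsage().heapUsed;

fs.readdir('/tmp/many-files', function (err, files) {
    if (err) throw err;

    // Heap usage once the whole list of names is in memory.
    var after = process.memoryUsage().heapUsed;
    console.log(files.length + ' file names, ~' +
        Math.round((after - before) / 1024 / 1024) + ' MB of heap');
});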

Once you've loaded this list of file names, you can traverse them one by one, or in parallel with some concurrency limit, and easily consume them all. Example:

var async = require('async'),
    fs = require('fs'),
    path = require('path'),
    parentDir = '/home/user';

async.waterfall([
    function (cb) {
        fs.readdir(parentDir, cb);
    },
    function (files, cb) {
        // `files` is just an array of file names, not full paths.

        // Consume 10 files in parallel.
        async.eachLimit(files, 10, function (filename, done) {
            var filePath = path.join(parentDir, filename);

            // Do with this file whatever you want.
            // Then don't forget to call `done()`.
            done();
        }, cb);
    }
], function (err) {
    err && console.trace(err);

    console.log('Done');
});
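As a side note, if you really do want a streaming interface, newer Node.js versions (12.12+) provide fs.opendir, whose Dir handle is an async iterator that yields directory entries one at a time instead of buffering the whole name list. A minimal sketch, reusing the /home/user directory from above:

const fsPromises = require('fs').promises;
const path = require('path');

async function walk(parentDir) {
    // opendir returns a Dir handle that can be iterated entry by entry.
    const dir = await fsPromises.opendir(parentDir);
    for await (const dirent of dir) {
        const filePath = path.join(parentDir, dirent.name);
        // Do with this file whatever you want.
        console.log(filePath);
    }
}

walk('/home/user').catch(console.error);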
