Reading a file in real-time using Node.js


Question

I need to work out the best way to read data that is being written to a file, in real time, using Node.js. The trouble is that Node is a fast-moving ship, which makes it difficult to find the best method for addressing a problem.

What I Want To Do

I have a Java process that does some work and writes its results to a text file. It typically takes anything from 5 minutes to 5 hours to run, with data being written the whole time, and it can reach fairly hefty throughput rates (circa 1000 lines/sec).

I would like to read this file in real time and then, using Node, aggregate the data and write it to a socket where it can be graphed on the client.

The client, graphs, sockets and aggregation logic are all done, but I am confused about the best approach for reading the file.

What I Have Tried (or at least played with)

FIFO - I can tell my Java process to write to a FIFO and read this using Node. This is in fact how we currently have it implemented using Perl, but because everything else runs in Node it makes sense to port the code over.

Unix Sockets - as above.

fs.watchFile - will this work for what we need?

fs.createReadStream - is this better than watchFile?

fs & tail -f - seems like a hack.

What, actually, is my Question

I am tending towards using Unix sockets, as this seems the fastest option. But does Node have better built-in features for reading files from the filesystem in real time?

Recommended Answer

If you want to keep the file as a persistent store of your data, so that the stream is not lost if the system crashes or one of the members in your network of running processes dies, you can still continue writing to a file and reading from it.

If you do not need this file as persistent storage for the results your Java process produces, then going with a Unix socket is much better for both ease and performance.
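As a rough illustration of the Unix socket option, the sketch below listens on a socket path and hands each received line to a callback. The helper name listenOnUnixSocket and its parameters are made up for this example, and it assumes the Java process can connect to the socket and write newline-terminated lines.

```javascript
const net = require('net');
const readline = require('readline');

// Hypothetical helper (not from the original answer): listens on a Unix
// domain socket and calls onLine for every line a connected writer sends.
function listenOnUnixSocket(socketPath, onLine) {
    const server = net.createServer(function (connection) {
        // readline splits the incoming byte stream into complete lines
        const rl = readline.createInterface({ input: connection });
        rl.on('line', onLine);
    });
    server.listen(socketPath);
    return server; // call server.close() to stop listening
}
```

A call such as listenOnUnixSocket('/tmp/results.sock', aggregate) would feed every line to a hypothetical aggregate function. A stale socket file left over from an earlier run has to be removed before listening again, otherwise listen fails with EADDRINUSE.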

fs.watchFile() is not what you need: it works on file stats as the filesystem reports them, whereas you want to read the file's contents while it is still being written.

SHORT UPDATE: I am very sorry to realize that although I criticized fs.watchFile() for relying on file stats in the previous paragraph, I did the very same thing myself in the example code below! I had already warned readers to "take care!" because I wrote it in just a few minutes without testing it well; still, it can be done better by using fs.watch() instead of watchFile or fstatSync, if the underlying system supports it.
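One possible shape for the fs.watch() variant mentioned in the update is sketched here. It is only an illustration: followFile and its parameters are hypothetical names, and it assumes the file already exists and is only ever appended to. Each change event triggers a read from the last known offset using fs.createReadStream with a start position.

```javascript
const fs = require('fs');

// Hypothetical helper (not from the original answer): calls onChunk with each
// newly appended Buffer and returns a function that stops watching.
function followFile(filePath, onChunk) {
    let position = 0;     // byte offset of the next unread data
    let reading = false;  // true while a read stream is open
    let pending = false;  // a change arrived while we were reading

    function readNew() {
        if (reading) { pending = true; return; }
        reading = true;
        pending = false;
        const stream = fs.createReadStream(filePath, { start: position });
        stream.on('data', function (chunk) {
            position += chunk.length;
            onChunk(chunk);
        });
        stream.on('error', function (err) { console.error(err); });
        stream.on('close', function () {
            reading = false;
            if (pending) readNew(); // catch up on changes we missed
        });
    }

    const watcher = fs.watch(filePath, readNew);
    readNew(); // pick up anything already in the file
    return function stop() { watcher.close(); };
}
```

fs.watch() depends on the operating system's change notifications, so how often and how promptly events arrive varies by platform and filesystem. A periodic fallback call to the read step may be worth adding if events turn out to be unreliable.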

For reading from and writing to a file, I have just written the following for fun during my break:

test-fs-writer.js: [You will not need this, since your Java process writes the file]

var fs = require('fs'),
    lineno=0;

var stream = fs.createWriteStream('test-read-write.txt', {flags:'a'});

stream.on('open', function() {
    console.log('Stream opened, will start writing in 2 secs');
    setInterval(function() { stream.write((++lineno)+' oi!\n'); }, 2000);
});

test-fs-reader.js: [Take care, this is just a demonstration; check the err objects!]

var fs = require('fs'),
    bite_size = 256,
    readbytes = 0,
    file;

fs.open('test-read-write.txt', 'r', function(err, fd) {
    if (err) throw err; // e.g. the file does not exist yet
    file = fd;
    readsome();
});

function readsome() {
    var stats = fs.fstatSync(file); // yes sometimes async does not make sense!
    if(stats.size<readbytes+1) {
        console.log('Hehe I am much faster than your writer..! I will sleep for a while, I deserve it!');
        setTimeout(readsome, 3000);
    }
    else {
        // Buffer.alloc replaces the deprecated new Buffer(size) constructor
        fs.read(file, Buffer.alloc(bite_size), 0, bite_size, readbytes, processsome);
    }
}

function processsome(err, bytecount, buff) {
    if (err) throw err;
    console.log('Read', bytecount, 'and will process it now.');

    // Here we will process our incoming data:
        // Do whatever you need. Just be careful about not using beyond the bytecount in buff.
        console.log(buff.toString('utf-8', 0, bytecount));

    // So we continue reading from where we left:
    readbytes+=bytecount;
    process.nextTick(readsome);
}

You can safely avoid using nextTick and call readsome() directly instead. Since we are still working sync here, it is not necessary in any sense. I just like it. :p

EDIT by Oliver Lloyd

Taking the example above but extending it to read CSV data gives:

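var lastLineFeed,
    lineArray,
    valueArray = []; // parsed rows, for use elsewhere

function processsome(err, bytecount, buff) {
    if (err) throw err;

    // Find the last line feed in the bytes actually read. Searching the raw
    // bytes keeps the offset correct even when lines contain multi-byte UTF-8.
    lastLineFeed = buff.subarray(0, bytecount).lastIndexOf(0x0a);

    if (lastLineFeed > -1) {

        // Split the complete lines in the buffer
        lineArray = buff.toString('utf-8', 0, lastLineFeed).split('\n');

        // Then split each line by comma
        for (var i = 0; i < lineArray.length; i++) {
            // Add read rows to an array for use elsewhere
            valueArray.push(lineArray[i].split(','));
        }

        // Set a new position to read from, just past the last line feed
        readbytes += lastLineFeed + 1;
    } else if (bytecount === bite_size) {
        // A full buffer without a line feed: the line is longer than bite_size, so skip this chunk
        readbytes += bytecount;
    } else {
        // Only a partial line so far: wait for the writer to finish it
        setTimeout(readsome, 3000);
        return;
    }
    process.nextTick(readsome);
}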
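If the data arrives as a stream of chunks rather than through positioned reads, the same CSV splitting can be done by carrying the unfinished tail of one chunk over to the next. The following is a minimal sketch of that idea; createLineSplitter and parseCsvLines are hypothetical names, and the CSV handling is as naive as in the example above (no quoted fields).

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: returns a function that accepts raw chunks and returns
// the complete lines seen so far, holding back any trailing partial line.
function createLineSplitter() {
    const decoder = new StringDecoder('utf8'); // buffers split multi-byte characters
    let remainder = '';
    return function push(chunk) {
        const lines = (remainder + decoder.write(chunk)).split('\n');
        remainder = lines.pop(); // incomplete last line, or '' if the chunk ended on a line feed
        return lines;
    };
}

// Hypothetical helper: turns complete CSV lines into arrays of fields.
function parseCsvLines(lines) {
    return lines.map(function (line) { return line.split(','); });
}
```

Each chunk from a stream's 'data' event would be passed to the function returned by createLineSplitter(), and the resulting lines to parseCsvLines(), so a row that straddles two chunks is only emitted once it is complete.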
