Read FileReader object line-by-line without loading the whole file into RAM


Question

Many browsers now support reading local files with HTML5's FileReader. This opens the door to websites that go beyond 'database front-ends': scripts that can do something useful with local data without first sending it to a server.

Pre-processing images and video before upload aside, one big application of FileReader is loading data from some kind of on-disk table (CSV, TSV, whatever) into the browser for manipulation, for example plotting or analysis in D3.js, or creating landscapes in WebGL.

The problem is that most examples on StackOverflow and other sites use FileReader's .readAsText() method, which reads the whole file into RAM before returning a result.

javascript: how to parse a FileReader object line by line

To read a file without loading all of its data into RAM, one would need to read it in chunks with .readAsArrayBuffer(), and this SO post is the closest I can get to a good answer:

filereader api on big files

However, it's a bit too specific to that particular problem, and in all honesty, I could try for days to make the solution more general and still come out empty-handed, because I don't understand the significance of the chunk sizes or why Uint8Array is used. A solution to the more general problem would be ideal: reading a file line-by-line with a user-definable line separator (ideally using .split(), since that also accepts a regex), and then doing something per line (like printing it with console.log).
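
For context, the two points of confusion above (chunk sizes and Uint8Array) can be illustrated with a small sketch. A file is a sequence of bytes, and a Uint8Array is simply a view of those bytes. When a file is read in fixed-size chunks, a multi-byte UTF-8 character can be cut in half at a chunk boundary, so each chunk cannot safely be decoded on its own. The helper name `decodeChunks` is hypothetical and only demonstrates the decoding step; it assumes UTF-8 input and a runtime that provides TextDecoder and TextEncoder.

```javascript
// Sketch only: decodeChunks is a hypothetical helper, not part of any library.
// Each chunk is a Uint8Array (raw bytes); the decoder turns bytes into text.
function decodeChunks(chunks, encoding = "utf-8") {
  const decoder = new TextDecoder(encoding);
  let text = "";
  for (const chunk of chunks) {
    // stream: true makes the decoder hold back an incomplete trailing
    // byte sequence until the next chunk supplies the rest.
    text += decoder.decode(chunk, { stream: true });
  }
  text += decoder.decode(); // flush anything still buffered
  return text;
}

// "é" is two bytes in UTF-8 (0xC3 0xA9); split it across two chunks.
const bytes = new TextEncoder().encode("café\n");
const chunks = [bytes.subarray(0, 4), bytes.subarray(4)];
const decoded = decodeChunks(chunks);
```

Under these assumptions the chunk size does not affect the decoded text. It mainly trades memory per read against the number of reads.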

Recommended answer

I've made a LineReader class at the following Gist URL. As I mentioned in a comment, it's unusual to use line separators other than LF, CR/LF and maybe CR. Thus, my code only treats LF and CR/LF as line separators.

https://gist.github.com/peteroupc/b79a42fffe07c2a87c28

Example:

// file is assumed to be a File object, e.g. one chosen via an <input type="file"> element
new LineReader(file).readLines(function (line) {
  console.log(line);
});
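
To show the general idea behind a chunked line reader, here is a minimal sketch. It is not the LineReader class from the Gist. The function name `readLines`, its options, and the default chunk size are all assumptions made for illustration. It also swaps in the promise-based `Blob.prototype.arrayBuffer()` in place of the callback-based `FileReader.readAsArrayBuffer()`, so it assumes a runtime that supports that method along with TextDecoder, and it assumes UTF-8 input.

```javascript
// Sketch only: readLines is a hypothetical helper, not the Gist's LineReader.
async function readLines(blob, onLine, { separator = /\r?\n/, chunkSize = 64 * 1024 } = {}) {
  const decoder = new TextDecoder("utf-8");
  let remainder = ""; // partial line carried over from the previous chunk

  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // slice() only describes a byte range; the bytes are read when
    // arrayBuffer() is called, so one chunk is held in memory at a time.
    const buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    const text = remainder + decoder.decode(new Uint8Array(buffer), { stream: true });

    const parts = text.split(separator);
    remainder = parts.pop(); // the last piece may be an incomplete line
    for (const line of parts) onLine(line);
  }

  remainder += decoder.decode(); // flush bytes still buffered in the decoder
  if (remainder.length > 0) onLine(remainder); // final line without a trailing separator
}
```

In a browser, a File is a Blob, so a call might look like `readLines(file, function (line) { console.log(line); })`. The default separator handles LF and CR/LF. A separator that can match a prefix of a longer separator (for example one that treats a lone CR as a line break as well as CR/LF) may split incorrectly when the match straddles a chunk boundary, and a regex with capture groups will add the captured text to the output of .split().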
