FileReader API on big files

Question

My FileReader API code had been working well until one day I got a 280 MB txt file from one of my clients. The page crashes outright in Chrome, and in Firefox nothing happens.

// create new reader object 
var fileReader = new FileReader(); 

// read the file as text 
fileReader.readAsText( $files[i] );  
fileReader.onload = function(e) 
{   // read all the information about the file 
    // do sanity checks here etc... 
    $timeout( function() 
    {    
        // var fileContent = e.target.result;
        // get the first line 
        var firstLine = e.target.result.slice(0, e.target.result.indexOf("\n") );
    });
};

What I am trying to do above is find the first line break so that I can get the column length of the file. Should I not read it as text? How can I get the column length of the file without breaking the page on big files?

Recommended answer

Your application is failing for big files because you're reading the full file into memory before processing it. This can be solved by streaming the file (reading it in small chunks), so you only need to hold a part of the file in memory at any time.

A File object is also an instance of Blob, which offers the .slice method to create a smaller view of the file.
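As a minimal sketch of what .slice does on its own: the helper name `firstKilobyte` below is hypothetical (it is not part of the original answer), and `file` is assumed to be a File or Blob, for example one taken from an `<input type="file">` element.

```javascript
// Hypothetical helper: returns a Blob view of at most the first 1024 bytes.
// Creating the slice does not read any file data by itself.
function firstKilobyte(file) {
    // slice(start, end) takes byte offsets; `end` is exclusive and is
    // clamped to the blob size, so files shorter than 1 KB work too.
    return file.slice(0, 1024);
}
```

The returned view can then be handed to a FileReader in place of the full file, so only that part is loaded into memory.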

Here is an example that assumes that the input is ASCII (demo: http://jsfiddle.net/mw99v8d4/).

function findColumnLength(file, callback) {
    // 1 KB at a time, because we expect that the first line will probably be small.
    var CHUNK_SIZE = 1024;
    var offset = 0;
    var fr = new FileReader();
    fr.onload = function() {
        var view = new Uint8Array(fr.result);
        for (var i = 0; i < view.length; ++i) {
            if (view[i] === 10 || view[i] === 13) {
                // \n = 10 and \r = 13
                // column length = offset + position of \r or \n
                callback(offset + i);
                return;
            }
        }
        // \r or \n not found, continue seeking.
        offset += CHUNK_SIZE;
        seek();
    };
    fr.onerror = function() {
        // Cannot read file... Do something, e.g. assume column size = 0.
        callback(0);
    };
    seek();

    function seek() {
        if (offset >= file.size) {
            // No \r or \n found. The column size is equal to the full
            // file size
            callback(file.size);
            return;
        }
        var slice = file.slice(offset, offset + CHUNK_SIZE);
        fr.readAsArrayBuffer(slice);
    }
}

The previous snippet counts the number of bytes before a line break. Counting the number of characters in a text consisting of multibyte characters is slightly more difficult, because you have to account for the possibility that the last byte in a chunk is part of a multibyte character that continues in the next chunk.
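One possible way to handle the multibyte case, sketched under a few assumptions that are not in the original answer: the input is UTF-8, and the environment provides `TextDecoder` and `Blob.prototype.arrayBuffer()` (recent browsers do). The function name `findFirstLineLength` and its `chunkSize` parameter are hypothetical. The sketch relies on the decoder's `stream` option, which holds back an incomplete byte sequence at the end of one chunk and completes it with the next.

```javascript
// Hypothetical helper: resolves with the number of characters (UTF-16 code
// units, as String.prototype.length counts them) before the first line break.
async function findFirstLineLength(file, chunkSize = 1024) {
    const decoder = new TextDecoder('utf-8');
    let length = 0;
    for (let offset = 0; offset < file.size; offset += chunkSize) {
        const buffer = await file.slice(offset, offset + chunkSize).arrayBuffer();
        // { stream: true } keeps a trailing partial character buffered
        // inside the decoder instead of emitting a replacement character.
        const text = decoder.decode(buffer, { stream: true });
        const lineBreak = text.search(/[\r\n]/);
        if (lineBreak !== -1) {
            return length + lineBreak;
        }
        length += text.length;
    }
    // No line break found: flush whatever the decoder still holds.
    length += decoder.decode().length;
    return length;
}
```

It could be called as `findFirstLineLength(file).then(function (n) { /* use n */ });`. Like the callback version, it keeps only one chunk in memory at a time.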
