FileReader: reading many files with JavaScript without memory leaks
Question
In a web page, I have to read a small part of each of many (1,500 to 12,000) small files, each approximately 1 MB in size. Once I have collected the information I need, I push it back to the server.
My problem: I use the FileReader API, garbage collection does not work, and memory consumption explodes.
The code looks like this:
function extract_information_from_files(input_files) {
    // some dummy implementation
    for (var i = 0; i < input_files.length; ++i) {
        (function dummy_function(file) {
            var reader = new FileReader();
            reader.onload = function () {
                // convert to Uint8Array because the library used expects this
                var array_buffer = new Uint8Array(reader.result);
                // do some fancy stuff with the library (very small subset of data is kept)
                // finish
                // function call ends, expect garbage collection to start cleaning up.
                // even explicit dereferencing does not work
            };
            reader.readAsArrayBuffer(file);
        })(input_files[i]);
    }
}
Some remarks:
- No, at first sight the library does not seem to keep any references to the loaded objects. Even if you run the code exactly as shown above, with array_buffer not used at all, everything is kept in memory.
- The behavior varies by browser:
  - Chrome (43) does not clear anything at all
  - Firefox (38) seems to retain residual memory of about 1/3 of the total size of all files
- I found very few topics discussing the same issue on the Internet. The ones I tried were:
  - Is it possible to clean memory after FileReader? -> Old; File.prototype.mozSlice has changed to .slice, but even then the problem remains
  - http://www.joelandritsch.com/posts/lessons-learned-in-javascript-11 -> proposed solution does not work.
  - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management is not very clear to me. -> At first it seems that you do not need to dereference (see "object is not needed" vs. "object is not reachable"), but then they also state 'Limitation: objects need to be made explicitly unreachable'
One last strange detail (posted for completeness): when using FileReader combined with https://gildas-lormeau.github.io/zip.js/, where I read a File just before pushing it to a zip archive, garbage collection just works.
All these remarks seem to point to me not using FileReader as it should be used, so please tell me how.
Recommended answer
The problem may be related to the order of execution. In your for loop you start reading all files with reader.readAsArrayBuffer(file). This code runs before any onload handler runs for a reader. Depending on the browser's implementation of FileReader, this can mean the browser loads the entire file (or simply preallocates the buffer for the entire file) for every reader before any onload is called.
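The ordering can be seen in a small simulation. This is only a sketch: setTimeout stands in for the asynchronous read, and start_all_reads and log are hypothetical names, not part of the FileReader API.

```javascript
// Simulation only: setTimeout stands in for an asynchronous FileReader read.
function start_all_reads(count, log) {
    for (var i = 0; i < count; ++i) {
        (function (n) {
            log.push('start ' + n);
            // The callback plays the role of onload; it cannot run
            // until the current synchronous code has finished.
            setTimeout(function () { log.push('load ' + n); }, 0);
        })(i);
    }
    log.push('loop finished');
}

var log = [];
start_all_reads(3, log);
// At this point log holds only the 'start' entries and 'loop finished';
// the 'load' entries are appended later, once the loop has returned.
```

With real files, every 'start' corresponds to a read request that may already hold or reserve a buffer before the first 'load' callback gets a chance to release anything.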
Try to process the files as a queue and see if it makes a difference. Something like:
function extract_information_from_files(input_files) {
    // a single reader, reused for every file
    var reader = new FileReader();
    function process_one() {
        // note: pop() requires a real Array (a FileList has no pop) and empties it
        var single_file = input_files.pop();
        if (single_file === undefined) {
            return;
        }
        (function dummy_function(file) {
            //var reader = new FileReader();
            reader.onload = function () {
                // do your stuff
                // process next at the end
                process_one();
            };
            reader.readAsArrayBuffer(file);
        })(single_file);
    }
    process_one();
}
extract_information_from_files(file_array_1);
// uncomment next line to process another file array in parallel
// extract_information_from_files(file_array_2);
EDIT: It seems that browsers expect you to reuse FileReaders. I've edited the code to reuse a single reader and tested (in Chrome) that memory usage stays limited to the largest file you read.
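The same queue idea can also be written so that it leaves the input untouched and keeps going when a read fails. The following is a minimal sketch, not the answer's tested code: process_files_sequentially, handle_buffer and done are hypothetical names, and the reader is passed in as a parameter so that one instance is reused for every file.

```javascript
// Sketch: read files one at a time with a single, reused reader.
// `files` may be an Array or a FileList; it is indexed, never mutated.
function process_files_sequentially(files, reader, handle_buffer, done) {
    var index = 0;
    var results = [];

    function next() {
        if (index >= files.length) {
            done(results);
            return;
        }
        reader.readAsArrayBuffer(files[index]);
    }

    reader.onload = function () {
        // Keep only the small extracted value, not the buffer itself.
        results.push(handle_buffer(new Uint8Array(reader.result), files[index]));
        index += 1;
        next();
    };

    reader.onerror = function () {
        // Record the failure and move on so one bad file does not stall the queue.
        results.push(null);
        index += 1;
        next();
    };

    next();
}
```

In a browser this might be called as process_files_sequentially(file_input.files, new FileReader(), extract, send_to_server), where file_input, extract and send_to_server are placeholders for your own input element and functions. Only one read is in flight at any time, which is the property the answer relies on to keep memory bounded.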