What is the most efficient way to read a large file in chunks without loading the entire file in memory at once?
Problem description
What is the most efficient general-purpose way of reading "large" files (which may be text or binary), without going into unsafe territory? I was surprised how few relevant results there were when I did a web search for "rust read large file in chunks".
For example, one of my use cases is to calculate an MD5 checksum for a file using rust-crypto (the Md5 module allows you to add &[u8] chunks iteratively).
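To make that iterative shape concrete without pulling in rust-crypto, here is a standard-library-only sketch that feeds each fill_buf chunk into a trivial byte-sum where a real digest's input(&[u8]) call would go; the file name and chunk size are placeholders:

```rust
use std::{
    fs::File,
    io::{self, BufRead, BufReader, Write},
};

// Stand-in for an iterative digest: anything that accepts &[u8] chunks
// one at a time (as rust-crypto's Md5 does) can be driven this way.
fn checksum_chunks(path: &str) -> io::Result<u64> {
    let mut sum: u64 = 0; // placeholder state; a real hasher would live here
    let mut reader = BufReader::with_capacity(1024 * 128, File::open(path)?);
    loop {
        let chunk = reader.fill_buf()?; // borrow the next buffered chunk
        if chunk.is_empty() {
            break; // empty chunk means EOF
        }
        // e.g. hasher.input(chunk) in the rust-crypto case
        sum += chunk.iter().map(|&b| b as u64).sum::<u64>();
        let n = chunk.len();
        reader.consume(n); // mark the chunk as processed
    }
    Ok(sum)
}

fn main() -> io::Result<()> {
    // Write a small demo file so the sketch runs as-is.
    File::create("demo.bin")?.write_all(&[1u8, 2, 3, 4])?;
    println!("{}", checksum_chunks("demo.bin")?);
    Ok(())
}
```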
Here is what I have, which seems to perform slightly better than some other methods like read_to_end:
use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let file = File::open("my.file")?;
    let mut reader = BufReader::with_capacity(CAP, file);

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // do stuff with buffer here
            buffer.len()
        };
        if length == 0 {
            break;
        }
        reader.consume(length);
    }

    Ok(())
}
Recommended answer
I don't think you can write code more efficient than that. fill_buf on a BufReader over a File is basically just a straight call to read(2).
That said, BufReader isn't really a useful abstraction when you use it like that; it would probably be less awkward to just call file.read(&mut buf) directly.
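A minimal sketch of that direct approach, with the BufReader removed and the chunk buffer owned by the caller; the file name is a placeholder and the demo file is created inline so the example runs as-is:

```rust
use std::{
    fs::File,
    io::{self, Read, Write},
};

// Read a file in fixed-size chunks with plain File::read, no BufReader.
fn read_chunked(path: &str) -> io::Result<usize> {
    let mut file = File::open(path)?;
    let mut buf = [0u8; 1024 * 128]; // chunk buffer we own directly
    let mut total = 0;
    loop {
        let n = file.read(&mut buf)?; // maps to a single read(2) call
        if n == 0 {
            break; // 0 bytes read means EOF
        }
        // do stuff with &buf[..n] here
        total += n;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Small demo file standing in for the real input.
    File::create("demo.bin")?.write_all(b"hello, chunks")?;
    println!("{} bytes", read_chunked("demo.bin")?);
    Ok(())
}
```

Note that read may return fewer bytes than the buffer holds even before EOF, so any per-chunk processing should use &buf[..n] rather than the whole buffer.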