What is the most efficient way to read a large file in chunks without loading the entire file in memory at once?

Question

What is the most efficient general-purpose way of reading "large" files (which may be text or binary), without going into unsafe territory? I was surprised how few relevant results there were when I did a web search for "rust read large file in chunks".

For example, one of my use cases is to calculate an MD5 checksum for a file using rust-crypto (the Md5 module allows you to add &[u8] chunks iteratively).

Here is what I have, which seems to perform slightly better than some other methods like read_to_end:

use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let file = File::open("my.file")?;
    let mut reader = BufReader::with_capacity(CAP, file);

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // do stuff with buffer here
            buffer.len()
        };
        if length == 0 {
            break;
        }
        reader.consume(length);
    }

    Ok(())
}
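
For the MD5 use case mentioned above, here is a minimal sketch of how that loop could feed rust-crypto's Md5 (assuming the crate's Digest trait with its input and result_str methods; md5_of_file is a hypothetical helper name):

use crypto::digest::Digest;
use crypto::md5::Md5;
use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn md5_of_file(path: &str) -> io::Result<String> {
    const CAP: usize = 1024 * 128;
    let file = File::open(path)?;
    let mut reader = BufReader::with_capacity(CAP, file);
    let mut hasher = Md5::new();

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // feed the current chunk to the hasher
            hasher.input(buffer);
            buffer.len()
        };
        if length == 0 {
            break; // fill_buf returned an empty slice: end of file
        }
        // mark the chunk as consumed so the next fill_buf reads fresh data
        reader.consume(length);
    }

    Ok(hasher.result_str())
}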

Answer

I don't think you can write code more efficient than that. fill_buf on a BufReader over a File is basically just a straight call to read(2).

That said, BufReader isn't really a useful abstraction when you use it like that; it would probably be less awkward to just call file.read(&mut buf) directly.
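
For illustration, a minimal sketch of that direct approach (same 128 KiB chunk size as above; note that read may return fewer bytes than the buffer holds, and only a return value of 0 signals end of file):

use std::{
    fs::File,
    io::{self, Read},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let mut file = File::open("my.file")?;
    let mut buf = vec![0u8; CAP];

    loop {
        // read fills as much of buf as it can and reports the byte count
        let length = file.read(&mut buf)?;
        if length == 0 {
            break; // end of file
        }
        // do stuff with &buf[..length] here
    }

    Ok(())
}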
