Read file character-by-character in Rust

Question

Is there an idiomatic way to process a file one character at a time in Rust?

This seems to be roughly what I'm after:

let mut f = io::BufReader::new(try!(fs::File::open("input.txt")));

for c in f.chars() {
    println!("Character: {}", c.unwrap());
}

But Read::chars is still unstable as of Rust v1.6.0.

I considered using Read::read_to_string, but the file may be large and I don't want to read it all into memory.

Recommended answer

Let's compare 4 approaches.

1. Read::chars

You could copy the Read::chars implementation, but it is marked unstable with the note:

the semantics of a partial read/write of where errors happen is currently unclear and may change

so some care must be taken. Anyway, this seems to be the best approach.
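
Because Read::chars is not available on stable Rust, one option is to decode UTF-8 by hand on top of Read::bytes. The following is a minimal sketch under stated assumptions, not the standard library's implementation: the type name Utf8Chars and the helper invalid are hypothetical, and reporting malformed input as an InvalidData error after the offending bytes have already been consumed is only one possible choice, which is the kind of semantics the instability note refers to.

```rust
use std::io::{self, Read};

/// Hypothetical helper (not a std API): yields chars decoded from a UTF-8 byte stream.
pub struct Utf8Chars<R: Read> {
    bytes: io::Bytes<R>,
}

impl<R: Read> Utf8Chars<R> {
    pub fn new(reader: R) -> Self {
        Utf8Chars { bytes: reader.bytes() }
    }
}

/// Hypothetical helper: builds the error returned for malformed input.
fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

impl<R: Read> Iterator for Utf8Chars<R> {
    type Item = io::Result<char>;

    fn next(&mut self) -> Option<Self::Item> {
        // Read the leading byte; end of input here ends the iteration cleanly.
        let first = match self.bytes.next()? {
            Ok(b) => b,
            Err(e) => return Some(Err(e)),
        };
        // The leading byte determines the length of the encoded sequence.
        let width = match first {
            0x00..=0x7F => 1,
            0xC0..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF7 => 4,
            _ => return Some(Err(invalid("invalid UTF-8 leading byte"))),
        };
        let mut buf = [first, 0, 0, 0];
        for slot in buf[1..width].iter_mut() {
            match self.bytes.next() {
                Some(Ok(b)) => *slot = b,
                Some(Err(e)) => return Some(Err(e)),
                None => return Some(Err(invalid("truncated UTF-8 sequence"))),
            }
        }
        // from_utf8 rejects overlong forms, surrogates, and bad continuation bytes.
        match std::str::from_utf8(&buf[..width]) {
            Ok(s) => Some(Ok(s.chars().next().unwrap())),
            Err(_) => Some(Err(invalid("invalid UTF-8 sequence"))),
        }
    }
}
```

Wrapping the file in a BufReader first, for example Utf8Chars::new(BufReader::new(File::open("input.txt")?)), avoids issuing one read call per byte against the file while keeping memory use independent of the file size.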

2. flat_map

The flat_map alternative does not compile:

use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let f = BufReader::new(File::open("input.txt").expect("open failed"));

    for c in f.lines().flat_map(|l| l.expect("lines failed").chars()) {
        println!("Character: {}", c);
    }
}

The problem is that chars borrows from the string, but the value of l.expect("lines failed") lives only inside the closure, so the compiler reports the error "borrowed value does not live long enough".
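
One way to make the flat_map shape compile is to return an owned collection from the closure, so that nothing borrowed from the line escapes it. This is only a sketch of that idea: the function name chars_via_flat_map is hypothetical, and the extra Vec<char> per line is an added cost that the original code does not have.

```rust
use std::io::BufRead;

/// Hypothetical helper: lazily yields every character of `reader`, line by line.
/// Note that `lines()` strips the line terminators, so they are not yielded.
fn chars_via_flat_map<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader
        .lines()
        // Collecting into an owned Vec<char> ends the borrow of the line
        // before the closure returns, at the cost of one more allocation.
        .flat_map(|l| l.expect("lines failed").chars().collect::<Vec<char>>())
}
```

It can be used as for c in chars_via_flat_map(BufReader::new(file)) { ... }. Each line is still read into memory as a whole, so the caveats of the line-based approaches apply here too.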

3. Nested for loops

This code

use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let f = BufReader::new(File::open("input.txt").expect("open failed"));

    for line in f.lines() {
        for c in line.expect("lines failed").chars() {
            println!("Character: {}", c);
        }
    }
}

works, but it allocates a new string for each line. Besides, if the input file contains no line breaks, the whole file is loaded into memory.

4. BufRead::read_until

A memory-efficient alternative to approach 3 is to use BufRead::read_until and reuse a single buffer for every line:

use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let mut f = BufReader::new(File::open("input.txt").expect("open failed"));

    let mut buf = Vec::<u8>::new();
    while f.read_until(b'\n', &mut buf).expect("read_until failed") != 0 {
        // this moves the ownership of the read data to s
        // there is no allocation
        let s = String::from_utf8(buf).expect("from_utf8 failed");
        for c in s.chars() {
            println!("Character: {}", c);
        }
        // this returns the ownership of the read data to buf
        // there is no allocation
        buf = s.into_bytes();
        buf.clear();
    }
}
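
A closely related variant, sketched here as an assumption rather than as part of the original answer, is BufRead::read_line with a reused String. It avoids the round trip between Vec<u8> and String because read_line performs the UTF-8 check itself and returns an error on invalid data. The function name for_each_char and its callback parameter are hypothetical.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: calls `f` for every char in `reader`, reusing one String.
fn for_each_char<R: BufRead, F: FnMut(char)>(mut reader: R, mut f: F) -> io::Result<()> {
    let mut line = String::new();
    // read_line appends to `line` (newline included) and returns Ok(0) at end of input.
    while reader.read_line(&mut line)? != 0 {
        for c in line.chars() {
            f(c);
        }
        // clear() keeps the allocated capacity, so the same buffer is reused.
        line.clear();
    }
    Ok(())
}
```

Unlike lines(), read_line keeps the trailing newline, so the callback sees every character of the input. As with the read_until version, a file without line breaks is still read into memory in one piece.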
