Why do I need to collect into a vector when using `flat_map`?

Question

I'm working on Project Euler 96 to teach myself Rust. I've written this code to read in the file and convert it into a vector of integers (Playground).

let file = File::open(&args[1]).expect("Sudoku file not found");
let reader = BufReader::new(file);

let x = reader
    .lines()
    .map(|x| x.unwrap())
    .filter(|x| !x.starts_with("Grid"))
    .flat_map(|s| s.chars().collect::<Vec<_>>())  // <-- collect here!
    .map(|x| x.to_digit(10).unwrap())
    .collect::<Vec<_>>();

This all works fine but I'm puzzled why I have to collect into a vector in my flat_map (I'm assuming creating unneeded vectors which will be immediately destroyed is inefficient). If I don't collect, it doesn't compile:

error[E0515]: cannot return value referencing function parameter `s`
  --> src/main.rs:13:23
   |
13 |         .flat_map(|s| s.chars())
   |                       -^^^^^^^^
   |                       |
   |                       returns a value referencing data owned by the current function
   |                       `s` is borrowed here

The example from the docs shows almost the same code, but a collect is not required:

let words = ["alpha", "beta", "gamma"];

// chars() returns an iterator
let merged: String = words.iter()
                          .flat_map(|s| s.chars())
                          .collect();
assert_eq!(merged, "alphabetagamma");

So why is it different in my code?

Recommended answer

The iterator reader.lines().map(|x| x.unwrap()) iterates over String items, i.e. by value. Consequently, in .flat_map(|s| ...), the variable s has the type String (i.e. owned, not borrowed). In other words: the string is a local variable now and lives in the function. It's a simple rule that you cannot return references to local variables (see this Q&A). But that's exactly what s.chars() does, even if it's a bit hidden.

Here is the signature of str::chars:

pub fn chars(&self) -> Chars<'_>

One can see that the string is borrowed. The returned Chars object contains a reference to the original string. That's why we cannot return s.chars() from the closure.

So why is it different in my code?

In the documentation's example, the iterator words.iter() actually iterates over items of the type &&'static str. Calling s.chars() will also return a Chars object that borrows some string, but that string's lifetime is 'static (lives forever), so there is no problem with returning Chars from that closure.
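The same idea can be applied to the question's code. As a rough sketch: if the lines are first collected into a Vec<String> that outlives the iterator chain, the closure receives a &String, and s.chars() borrows from that vector instead of from a value local to the closure. The helper name digits_from_lines and the in-memory input are made up for illustration; they are not part of the original code.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical helper: turns every non-"Grid" line into digits.
fn digits_from_lines<R: BufRead>(reader: R) -> Vec<u32> {
    // Own all lines up front so they outlive the iterator chain below.
    let lines: Vec<String> = reader.lines().map(|l| l.unwrap()).collect();

    lines
        .iter() // yields &String, so `s.chars()` borrows from `lines`
        .filter(|s| !s.starts_with("Grid"))
        .flat_map(|s| s.chars()) // no inner collect needed here
        .map(|c| c.to_digit(10).unwrap())
        .collect()
}

fn main() {
    // Cursor stands in for the BufReader over the Sudoku file.
    let input = Cursor::new("Grid 01\n003\n900\n");
    let digits = digits_from_lines(input);
    println!("{:?}", digits);
}
```

This trades the per-line Vec<char> for one Vec<String> holding the whole file, so it is not free either; it only shows that the borrow is accepted once the strings live long enough.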

It would be great if the standard library had an OwnedChars iterator that consumes a String, works like Chars and drops the string once the iterator is dropped. In that case it's fine to call s.owned_chars() because the returned object does not reference the local s, but owns it. But: such an owned iterator does not exist in the standard library!
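Such an iterator is not hard to write by hand. The following is a minimal sketch of what it might look like; OwnedChars and its new constructor are hypothetical names taken from the paragraph above, not a standard library API.

```rust
/// Hypothetical owning iterator; not part of the standard library.
struct OwnedChars {
    s: String,
    pos: usize, // byte offset of the next char in `s`
}

impl OwnedChars {
    fn new(s: String) -> Self {
        OwnedChars { s, pos: 0 }
    }
}

impl Iterator for OwnedChars {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // `pos` only ever advances by whole chars, so it always sits on a
        // char boundary and the slice below cannot panic.
        let c = self.s[self.pos..].chars().next()?;
        self.pos += c.len_utf8();
        Some(c)
    }
}

fn main() {
    let lines = vec![String::from("12"), String::from("34")];
    let digits: Vec<u32> = lines
        .into_iter() // yields owned `String`s, as in the question
        .flat_map(OwnedChars::new) // the iterator owns its string
        .map(|c| c.to_digit(10).unwrap())
        .collect();
    println!("{:?}", digits);
}
```

The String is dropped together with the iterator, so nothing returned from the closure refers to a local variable.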

I'm assuming creating unneeded vectors which will be immediately destroyed is inefficient

Yes, that is true in a way. But you might have missed that the reader.lines() iterator also creates temporary objects of type String. Those are more or less immediately destroyed as well! So even without the collect in the flat_map you have a bunch of unnecessary allocations. Note that sometimes that's OK. In this case, I guess that input parsing is very fast in comparison to the actual algorithm you have to implement. So ... just collect? It's probably fine in this case.

If you want high-performance input parsing, I think you will not be able to avoid a standard loop, in particular in order to avoid unnecessary String allocations. (Playground)

let mut line = String::new();
let mut input = Vec::new();
loop {
    line.clear(); // clear contents, but keep memory buffer

    // TODO: handle IO error properly
    let bytes_read = reader.read_line(&mut line).expect("IO error"); 
    if bytes_read == 0 {
        break;
    }

    if line.starts_with("Grid") {
        continue;
    }

    // TODO: handle invalid input error
    input.extend(line.trim().chars().map(|c| c.to_digit(10).unwrap()));
}
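
One possible way to address the two TODOs in the loop above is to wrap it in a function that returns io::Result. This is only a sketch: the function name read_digits and the choice to report a non-digit character as an InvalidData error are assumptions, not something the original code prescribes.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical wrapper around the loop that reports errors instead of panicking.
fn read_digits<R: BufRead>(mut reader: R) -> io::Result<Vec<u32>> {
    let mut line = String::new();
    let mut input = Vec::new();
    loop {
        line.clear(); // clear contents, but keep memory buffer

        // Propagate I/O errors to the caller instead of calling expect().
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        if line.starts_with("Grid") {
            continue;
        }

        for c in line.trim().chars() {
            // Assumption: a non-digit character is reported as InvalidData.
            let d = c.to_digit(10).ok_or_else(|| {
                io::Error::new(io::ErrorKind::InvalidData, format!("not a digit: {:?}", c))
            })?;
            input.push(d);
        }
    }
    Ok(input)
}

fn main() {
    // Cursor stands in for the BufReader over the Sudoku file.
    let digits = read_digits(Cursor::new("Grid 01\n003\n900\n")).unwrap();
    println!("{:?}", digits);

    // Invalid input yields an error rather than a panic.
    assert!(read_digits(Cursor::new("12x\n")).is_err());
}
```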
