Using the same iterator multiple times in Rust

Question

Editor's note: This code example is from a version of Rust prior to 1.0, when many iterators implemented Copy. Updated versions of this code produce different errors, but the answers still contain valuable information.

I'm trying to write a function to split a string into clumps of letters and numbers; for example, "test123test" would turn into [ "test", "123", "test" ]. Here's my attempt so far:

pub fn split(input: &str) -> Vec<String> {
    let mut bits: Vec<String> = vec![];
    let mut iter = input.chars().peekable();
    loop {
        match iter.peek() {
            None => return bits,
            Some(c) => if c.is_digit() {
                bits.push(iter.take_while(|c| c.is_digit()).collect());
            } else {
                bits.push(iter.take_while(|c| !c.is_digit()).collect());
            }
        }
    }
    return bits;
}

However, this doesn't work: it loops forever. It seems that each call to take_while uses a clone of iter, starting from the same position over and over again. I would like it to use the same iter each time, so that every call advances the one iterator. Is this possible?

Recommended answer

As you identified, each take_while call is duplicating iter, since take_while takes self and the Peekable chars iterator is Copy. (Only true before Rust 1.0 — editor)

You want to be modifying the iterator each time, that is, for take_while to be operating on an &mut to your iterator. Which is exactly what the .by_ref adaptor is for:

pub fn split(input: &str) -> Vec<String> {
    let mut bits: Vec<String> = vec![];
    let mut iter = input.chars().peekable();
    loop {
        match iter.peek().map(|c| *c) {
            None => return bits,
            Some(c) => if c.is_digit(10) {
                bits.push(iter.by_ref().take_while(|c| c.is_digit(10)).collect());
            } else {
                bits.push(iter.by_ref().take_while(|c| !c.is_digit(10)).collect());
            },
        }
    }
}

fn main() {
    println!("{:?}", split("123abc456def"))
}

This prints:

["123", "bc", "56", "ef"]

However, I imagine this is not correct.
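The characters go missing because take_while has to pull an element out of the iterator before it can test it, so the first character that fails the predicate ('a', then '4', then 'd') is consumed and dropped. As a rough sketch of one way to keep the Peekable approach, and assuming a toolchain of Rust 1.51 or later where Peekable::next_if is available, the inner loops can advance only while the predicate holds. This is an illustration, not part of the original answer:

```rust
/// Splits `input` into alternating runs of digits and non-digits.
pub fn split(input: &str) -> Vec<String> {
    let mut bits = Vec::new();
    let mut iter = input.chars().peekable();

    // Peek to learn which kind of run starts here, without consuming anything.
    while let Some(&first) = iter.peek() {
        let want_digit = first.is_digit(10);
        let mut run = String::new();
        // `next_if` advances only when the predicate holds, so the first
        // character of the following run stays in the iterator.
        while let Some(c) = iter.next_if(|c| c.is_digit(10) == want_digit) {
            run.push(c);
        }
        bits.push(run);
    }
    bits
}
```

With this version, split("123abc456def") yields ["123", "abc", "456", "def"].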

I would actually recommend writing this as a normal for loop, using the char_indices iterator:

pub fn split(input: &str) -> Vec<String> {
    let mut bits: Vec<String> = vec![];
    if input.is_empty() {
        return bits;
    }

    let mut is_digit = input.chars().next().unwrap().is_digit(10);
    let mut start = 0;

    for (i, c) in input.char_indices() {
        let this_is_digit = c.is_digit(10);
        if is_digit != this_is_digit {
            bits.push(input[start..i].to_string());
            is_digit = this_is_digit;
            start = i;
        }
    }

    bits.push(input[start..].to_string());
    bits
}

This form also allows for doing this with far fewer allocations (that is, the Strings are not required), because each returned value is just a slice into the input, and we can use lifetimes to state this:

pub fn split<'a>(input: &'a str) -> Vec<&'a str> {
    let mut bits = vec![];
    if input.is_empty() {
        return bits;
    }

    let mut is_digit = input.chars().next().unwrap().is_digit(10);
    let mut start = 0;

    for (i, c) in input.char_indices() {
        let this_is_digit = c.is_digit(10);
        if is_digit != this_is_digit {
            bits.push(&input[start..i]);
            is_digit = this_is_digit;
            start = i;
        }
    }

    bits.push(&input[start..]);
    bits
}

All that changed was the type signature, removing the Vec<String> type hint and the .to_string calls.

One could even write an iterator like this, to avoid having to allocate the Vec. Something like fn split<'a>(input: &'a str) -> Splits<'a> { /* construct a Splits */ } where Splits is a struct that implements Iterator with Item = &'a str (written Iterator<&'a str> in pre-1.0 Rust).
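A minimal sketch of what such an iterator might look like in current Rust, where the item type is an associated type (Item = &'a str). The answer only names Splits, so the field name `rest` and the body of next are assumptions made for illustration:

```rust
/// Lazily yields alternating digit / non-digit runs of a string slice.
/// The `rest` field is a hypothetical detail; the answer only names `Splits`.
pub struct Splits<'a> {
    rest: &'a str,
}

pub fn split<'a>(input: &'a str) -> Splits<'a> {
    Splits { rest: input }
}

impl<'a> Iterator for Splits<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // The kind of the first character decides the kind of this run;
        // an empty remainder ends the iteration.
        let first_is_digit = self.rest.chars().next()?.is_digit(10);
        // Byte index where the run ends: the first character of the
        // other kind, or the end of the input if there is none.
        let end = self
            .rest
            .char_indices()
            .find(|&(_, c)| c.is_digit(10) != first_is_digit)
            .map_or(self.rest.len(), |(i, _)| i);
        let (run, rest) = self.rest.split_at(end);
        self.rest = rest;
        Some(run)
    }
}
```

Callers that still want a Vec can write split("test123test").collect::<Vec<_>>(), while callers that only loop over the pieces avoid that allocation entirely.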
