Using generic iterators instead of specific list types

Question

I'm very new to Rust, coming from C# / Java / similar.

In C# we have IEnumerable<T> that can be used to iterate almost any kind of array or list. C# also has a yield keyword that you can use to return a lazy list. Here's an example...

// Lazily returns the even numbers out of an enumerable
IEnumerable<int> Evens(IEnumerable<int> input)
{
    foreach (var x in input)
    {
        if (x % 2 == 0)
        {
            yield return x;
        }
    }
}

This is a silly example of course. I know I could do this with Rust's map function, but I would like to know how to create my own methods that accept and return generic iterators.

From what I can gather, Rust has generic iterators that can be used similarly, but they are beyond my understanding. I see Iter, IntoIterator, and Iterator types, and probably more, in the documentation, but no good way to understand them.

Can anyone provide clear examples of how to create something like above? Thank you!

P.S. The lazy aspect is optional. I am more concerned with abstraction away from specific list and array types.

Recommended answer

First, forget about IntoIterator and other traits or types. The core iteration trait in Rust is Iterator. Its trimmed down definition is as follows:

trait Iterator {
    type Item;  // type of elements returned by the iterator
    fn next(&mut self) -> Option<Self::Item>;
}

As you probably know, you can think of an iterator as a cursor inside some structure. The next() method advances this cursor, returning the element it previously pointed at. Naturally, if the collection is exhausted, there is nothing to return, which is why next() returns Option<Self::Item> rather than just Self::Item.
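To make the cursor idea concrete, here is a minimal sketch of a hand-written iterator. The Countdown type and its remaining field are hypothetical names invented for this illustration; only the Iterator trait itself comes from the standard library.

```rust
// Hypothetical example type: yields `remaining`, `remaining - 1`, ..., 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None // exhausted: nothing left to return
        } else {
            let current = self.remaining;
            self.remaining -= 1; // advance the "cursor"
            Some(current)
        }
    }
}

fn main() {
    let mut c = Countdown { remaining: 3 };
    assert_eq!(c.next(), Some(3));
    assert_eq!(c.next(), Some(2));
    assert_eq!(c.next(), Some(1));
    assert_eq!(c.next(), None);
}
```

Once next() is implemented, the type can be used anywhere an iterator is expected, including for loops and collect().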

Iterator is a trait, and so it can be implemented by specific types. Note that Iterator itself is not a proper type which you can use as a return value or a function argument - you have to use concrete types which implement this trait.

The above statement may sound too restrictive - how to use arbitrary iterator types then? - but because of generics this is not so. If you want a function to accept arbitrary iterators, just make it generic in the corresponding argument, adding an Iterator bound over the corresponding type parameter:

fn iterate_bytes<I>(iter: I) where I: Iterator<Item=u8> { ... }
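As an illustration of such a bound with a filled-in body, the sketch below sums the bytes of any iterator over u8. The sum_bytes function is a hypothetical helper, not something from the standard library.

```rust
// Hypothetical helper: sums the bytes produced by any iterator over `u8`.
fn sum_bytes<I>(iter: I) -> u32
where
    I: Iterator<Item = u8>,
{
    let mut total = 0u32;
    for b in iter {
        total += u32::from(b);
    }
    total
}

fn main() {
    // A Vec's consuming iterator yields `u8` directly.
    let v = vec![1u8, 2, 3];
    assert_eq!(sum_bytes(v.into_iter()), 6);

    // A slice iterator yields `&u8`, so copy the values out first.
    let s: &[u8] = b"ab";
    assert_eq!(sum_bytes(s.iter().copied()), 195); // 97 + 98

    // A range is itself an iterator.
    assert_eq!(sum_bytes(0u8..4), 6);
}
```

The same function works with three unrelated concrete iterator types because the compiler generates a separate instance for each one.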

Returning iterators from functions may be difficult, but see below.

For example, there is a method on &[T], called iter(), which returns an iterator that yields references into the slice. This iterator is an instance of the std::slice::Iter structure. Its documentation page shows how Iterator is implemented for Iter:

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;
    fn next(&mut self) -> Option<&'a T> { ... }
    ...
}

This structure holds a reference to the original slice and some iteration state inside it. Its next() method updates this state and returns the next value, if there is any.
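A simplified sketch of the same idea is shown below. SliceCursor is a hypothetical stand-in written for illustration; the real standard library iterator is implemented differently internally, but the shape (a borrowed slice plus some state) is the same.

```rust
// Hypothetical, simplified stand-in for the standard slice iterator.
struct SliceCursor<'a, T> {
    slice: &'a [T], // reference to the original slice
    pos: usize,     // iteration state: index of the next element
}

impl<'a, T> Iterator for SliceCursor<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let slice: &'a [T] = self.slice; // shared references are Copy
        let item = slice.get(self.pos)?; // None once past the end
        self.pos += 1;                   // update the iteration state
        Some(item)
    }
}

fn main() {
    let data = [10, 20, 30];
    let cursor = SliceCursor { slice: &data, pos: 0 };
    for x in cursor {
        println!("{}", x); // prints 10, 20 and 30 on separate lines
    }
}
```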

Any value whose type implements Iterator can be used in a for loop (for loop in fact works with IntoIterator, but see below):

let s: &[u8] = b"hello";
for b in s.iter() {
    println!("{}", b);   // prints numerical value of each byte
}

Now, Iterator trait is actually more complex than the above one. It also defines a lot of transformation methods which consume the iterator they are called on and return a new iterator which somehow transforms or filters values from the original iterator. For example, enumerate() method returns an iterator which yields values from the original iterator together with the positional number of the element:

let s: &[u8] = b"hello";
for (i, b) in s.iter().enumerate() {
    println!("{} at {}", b, i);   // prints "104 at 0", "101 at 1", etc.
}

enumerate() is defined as follows:

trait Iterator {
    type Item;
    ...
    fn enumerate(self) -> Enumerate<Self> where Self: Sized {
        Enumerate {
            iter: self,
            count: 0
        }
    }
    ...
}

Enumerate is just a struct which contains an iterator and a counter inside it and which implements Iterator<Item=(usize, I::Item)>:

struct Enumerate<I> {
    iter: I,
    count: usize
}

impl<I> Iterator for Enumerate<I> where I: Iterator {
    type Item = (usize, I::Item);

    #[inline]
    fn next(&mut self) -> Option<(usize, I::Item)> {
        self.iter.next().map(|a| {
            let ret = (self.count, a);
            self.count += 1;
            ret
        })
    }
}

And this is how most iterator transformations are implemented: each transformation is a wrapping struct which wraps the original iterator and implements Iterator trait by delegating to the original iterator and transforming the resulting value somehow. For example, s.iter().enumerate() from the example above returns a value of type Enumerate<Iter<'static, u8>>.

Note that while enumerate() is defined in Iterator trait directly, it can be a standalone function as well:

fn enumerate<I>(iter: I) -> Enumerate<I> where I: Iterator {
    Enumerate {
        iter: iter,
        count: 0
    }
}

The method works very similarly - it just uses implicit Self type parameter instead of an explicitly named one.

You may wonder what IntoIterator trait is. Well, it is just a convenience conversion trait which can be implemented by any type which can be converted to an iterator:

pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;

    fn into_iter(self) -> Self::IntoIter;
}

For example, &'a [T] can be converted into Iter<'a, T>, and so it has the following implementation:

impl<'a, T> IntoIterator for &'a [T] {
    type Item = &'a T;
    type IntoIter = Iter<'a, T>;

    fn into_iter(self) -> Iter<'a, T> {
        self.iter()  // just delegate to the existing method
    }
}

This trait is implemented for most container types and for references to these types. It is in fact what for loops use: a value of any type which implements IntoIterator can be used in the in clause:

let s: &[u8] = b"hello";
for b in s { ... }

This is very nice from a learning and reading perspective because it has less noise (in the form of iter()-like methods). It even allows things like these:

let mut v: Vec<u8> = ...;  // must be mut for the `&mut v` loop below

for i in &v { /* i is &u8 here, v is borrowed immutably */ }
for i in &mut v { /* i is &mut u8 here, v is borrowed mutably */ }
for i in v { /* i is just u8 here, v is consumed */ }

This is possible because IntoIterator is implemented differently for &Vec<T>, &mut Vec<T> and just Vec<T>.
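Here is a runnable sketch of the three forms side by side. The demo function and its variable names are made up for this example; the loops themselves rely only on the standard IntoIterator implementations for Vec.

```rust
// Hypothetical demo function showing the three ways to loop over a Vec.
fn demo() -> (u32, Vec<u8>, u32) {
    let mut v: Vec<u8> = vec![1, 2, 3];

    // `&v`: items are `&u8`, the vector is only borrowed.
    let mut sum_before = 0u32;
    for i in &v {
        sum_before += u32::from(*i);
    }

    // `&mut v`: items are `&mut u8`, so elements can be changed in place.
    for i in &mut v {
        *i *= 2;
    }
    let doubled = v.clone();

    // `v` by value: items are plain `u8`, and the vector is consumed.
    let mut sum_after = 0u32;
    for i in v {
        sum_after += u32::from(i);
    }

    (sum_before, doubled, sum_after)
}

fn main() {
    let (sum_before, doubled, sum_after) = demo();
    println!("{} {:?} {}", sum_before, doubled, sum_after); // prints "6 [2, 4, 6] 12"
}
```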

Every Iterator implements IntoIterator which performs an identity conversion (into_iter() just returns the iterator it is called on), so you can use Iterator instances in for loops as well.

Consequently, it makes sense to use IntoIterator in generic functions because it will make the API more convenient for the user. For example, enumerate() function from above could be rewritten as such:

fn enumerate<I>(source: I) -> Enumerate<I::IntoIter> where I: IntoIterator {
    Enumerate {
        iter: source.into_iter(),
        count: 0
    }
}
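To show what this convenience buys the caller, here is a small sketch of a function bounded on IntoIterator. The count_even function is a hypothetical helper for illustration; it accepts containers, ranges, and plain iterators without the caller writing into_iter().

```rust
// Hypothetical helper: counts the even numbers in anything that can be
// converted into an iterator over `i32`.
fn count_even<I>(source: I) -> usize
where
    I: IntoIterator<Item = i32>,
{
    source.into_iter().filter(|x| x % 2 == 0).count()
}

fn main() {
    let v = vec![1, 2, 3, 4];

    // An existing iterator works, thanks to the identity conversion.
    assert_eq!(count_even(v.iter().copied()), 2);
    // A range works too.
    assert_eq!(count_even(1..=10), 5);
    // And so does the container itself, passed by value.
    assert_eq!(count_even(v), 2);
}
```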

Now you can see how generics can be used to implement transformations with static typing easily. Rust does not have anything like C#/Python yield (but it is one of the most desired features, so one day it may appear in the language!), thus you need to wrap source iterators explicitly. For example, you can write something analogous to the above Enumerate structure which does the task you want.
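For the Evens example from the question, a wrapper along those lines might look like the sketch below. The Evens struct and the evens function are hypothetical names chosen to mirror the C# code; this is one possible shape, not the only one.

```rust
// Hypothetical adapter, analogous to `Enumerate`: wraps an iterator over
// `i32` and yields only its even items.
struct Evens<I> {
    iter: I,
}

impl<I> Iterator for Evens<I>
where
    I: Iterator<Item = i32>,
{
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // Pull from the wrapped iterator until an even value turns up
        // or the source is exhausted.
        loop {
            match self.iter.next() {
                Some(x) if x % 2 == 0 => return Some(x),
                Some(_) => continue,
                None => return None,
            }
        }
    }
}

// Accepts anything convertible into an iterator over `i32`.
fn evens<I>(source: I) -> Evens<I::IntoIter>
where
    I: IntoIterator<Item = i32>,
{
    Evens { iter: source.into_iter() }
}

fn main() {
    let result: Vec<i32> = evens(vec![1, 2, 3, 4, 5, 6]).collect();
    assert_eq!(result, vec![2, 4, 6]);
}
```

Like the C# version, this is lazy: no element is examined until the caller asks for the next one.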

However, the most idiomatic way would be to use existing combinators to do the work for you. For example, your code may be written as follows:

let iter = ...;  // iter implements Iterator<Item=i32>
let r = iter.filter(|&x| x % 2 == 0);  // r implements Iterator<Item=i32>
for i in r {
    println!("{}", i);  // prints only even items from the iterator
}

However, using combinators may turn ugly when you want to write custom combinator functions, because many existing combinator functions accept closures (e.g. filter() above), and closures in Rust are values of anonymous types, so the concrete type of the returned iterator cannot be spelled out in the function signature:

fn filter_even<I>(source: I) -> ??? where I: IntoIterator<Item=i32> {
    source.into_iter().filter(|&x| x % 2 == 0)
}

There are several ways around this; one of them is to use trait objects:

fn filter_even<'a, I>(source: I) -> Box<dyn Iterator<Item=i32> + 'a>
    where I: IntoIterator<Item=i32>, I::IntoIter: 'a
{
    Box::new(source.into_iter().filter(|&x| x % 2 == 0))
}

Here we hide the actual iterator type returned by filter() behind a trait object. Note that in order to make the function fully generic I had to add a lifetime parameter and a corresponding bound to Box trait object and I::IntoIter associated type. This is necessary because I::IntoIter may contain arbitrary lifetimes inside it (just like Iter<'a, T> type above), and we have to specify them in the trait object type (otherwise the lifetime information would be lost).
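The lifetime bound matters as soon as the source borrows from something. The sketch below repeats the boxed function (written with the dyn keyword that current Rust requires for trait objects) and feeds it an iterator that borrows a local vector; the variable names in main are made up for the example.

```rust
fn filter_even<'a, I>(source: I) -> Box<dyn Iterator<Item = i32> + 'a>
where
    I: IntoIterator<Item = i32>,
    I::IntoIter: 'a,
{
    Box::new(source.into_iter().filter(|&x| x % 2 == 0))
}

fn main() {
    let data = vec![1, 2, 3, 4, 5, 6];

    // `data.iter().copied()` borrows `data`, so the boxed iterator is only
    // valid for as long as that borrow; `'a` carries this information.
    let evens: Vec<i32> = filter_even(data.iter().copied()).collect();
    assert_eq!(evens, vec![2, 4, 6]);

    // `data` was only borrowed, so it is still usable here.
    assert_eq!(data.len(), 6);
}
```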

Trait objects created from Iterator trait implement Iterator themselves, so you can continue using these iterators as usual:

let source = vec![1_i32, 2, 3, 4];
for i in filter_even(source) {
    println!("{}", i);  // prints 2 and 4
}
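Another way around the unnameable return type, available since Rust 1.26, is the impl Trait syntax in return position. The sketch below assumes a reasonably recent compiler; it avoids the heap allocation and the dynamic dispatch of the boxed version, at the cost of the caller not being able to name the returned type either.

```rust
// Returns "some type that implements Iterator<Item = i32>" without
// naming it; the compiler fills in the concrete `Filter<...>` type.
fn filter_even<I>(source: I) -> impl Iterator<Item = i32>
where
    I: IntoIterator<Item = i32>,
{
    source.into_iter().filter(|&x| x % 2 == 0)
}

fn main() {
    let source = vec![1_i32, 2, 3, 4];
    for i in filter_even(source) {
        println!("{}", i); // prints 2 and 4
    }
}
```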
