如何从缓存任意结果的结构中删除过多的 `clone` 调用? [英] How do I remove excessive `clone` calls from a struct that caches arbitrary results?

查看:42
本文介绍了如何从缓存任意结果的结构中删除过多的 `clone` 调用?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读 Rust 书第二版中关于闭包的部分.在本节的最后,有一个练习来扩展之前给出的 Cacher 实现.我试了一下:

I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the Cacher implementation given before. I gave it a try:

use std::clone::Clone;
use std::cmp::Eq;
use std::collections::HashMap;
use std::hash::Hash;

struct Cacher<T, K, V>
where
    T: Fn(K) -> V,
    K: Eq + Hash + Clone,
    V: Clone,
{
    calculation: T,
    values: HashMap<K, V>,
}

impl<T, K, V> Cacher<T, K, V>
where
    T: Fn(K) -> V,
    K: Eq + Hash + Clone,
    V: Clone,
{
    fn new(calculation: T) -> Cacher<T, K, V> {
        Cacher {
            calculation,
            values: HashMap::new(),
        }
    }

    fn value(&mut self, arg: K) -> V {
        match self.values.clone().get(&arg) {
            Some(v) => v.clone(),
            None => {
                self.values
                    .insert(arg.clone(), (self.calculation)(arg.clone()));
                self.values.get(&arg).unwrap().clone()
            }
        }
    }
}

在创建了一个最终有效的版本后,我对它真的很不满意.真正让我烦恼的是 cacher.value(...) 有 5(!) 个对 clone() 的调用.有没有办法避免这种情况?

After creating a version that finally works, I am really unhappy with it. What really bugs me is that cacher.value(...) has 5(!) calls to clone() in it. Is there a way to avoid this?

推荐答案

你的怀疑是正确的,代码包含太多对 clone() 的调用,打败了优化Cacher旨在实现.

Your suspicion is correct, the code contains too many calls to clone(), defeating the very optimizations Cacher is designed to achieve.

首先调用 self.values.clone() - 它在每次访问时创建整个缓存的副本.

The one to start with is the call to self.values.clone() - it creates a copy of the entire cache on every single access.

删除此克隆.

正如您可能自己发现的那样,简单地删除 .clone() 并不能编译.这是因为借用检查器会考虑在 match 的整个持续时间内引用的地图.HashMap::get 返回的共享引用指向映射内部的项,这意味着当它存在时,禁止创建另一个对同一映射的可变引用,这是 HashMap::insert.对于要编译的代码,您需要拆分匹配项,以便在调用 insert 之前强制共享引用超出范围:

As you likely discovered yourself, simply removing .clone() doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of match. The shared reference returned by HashMap::get points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by HashMap::insert. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before insert is invoked:

// avoids unnecessary clone of the whole map
fn value(&mut self, arg: K) -> V {
    if let Some(v) = self.values.get(&arg).map(V::clone) {
        return v;
    } else {
        let v = (self.calculation)(arg.clone());
        self.values.insert(arg, v.clone());
        v
    }
}

<小时>

对于大多数实际用途来说,这要好得多,并且可能足够好".值已经被缓存的热路径现在只包含一个克隆,而这个克隆实际上是必要的,因为原始值必须保留在哈希映射中.(另外,请注意克隆不需要昂贵或意味着深度复制 - 存储的值可以是 Rc,它免费购买对象共享.在这种情况下,clone() 只会增加对象的引用计数.)


This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an Rc<RealValue>, which buys object sharing for free. In that case, clone() will simply increment the reference count on the object.)

在缓存未命中的情况下,必须克隆密钥,因为calculation 被声明为使用它.不过,一次克隆就足够了,因此我们可以将原始 arg 传递给 insert 而无需再次克隆它.尽管如此,密钥克隆仍然没有必要——计算函数不应该要求拥有它正在转换的密钥.删除此克隆归结为修改计算函数的签名以通过引用获取密钥.将 T 的特征边界更改为 T:Fn(&K) ->V 允许 value() 的以下公式:

In case of cache miss, the key must be cloned, because calculation is declared to consume it. A single cloning will be sufficient, though, so we can pass the original arg to insert without cloning it again. The key clone still feels unnecessary, though - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bounds of T to T: Fn(&K) -> V allows the following formulation of value():

// avoids unnecessary clone of the key
fn value(&mut self, arg: K) -> V {
    if let Some(v) = self.values.get(&arg).map(V::clone) {
        return v;
    } else {
        let v = (self.calculation)(&arg);
        self.values.insert(arg, v.clone());
        v
    }
}

避免双重查找

现在只剩下两个对 clone() 的调用,每个代码路径中一个.这是最佳的,就值克隆而言,但细心的读者仍然会被一个细节所困扰:在缓存未命中的情况下,对于同一个键,哈希表查找将两次有效地发生:一次在调用 HashMap::get 中,然后在 HashMap::insert 中再次调用.如果我们可以重用第一次完成的工作并只执行一次哈希映射查找,那就太好了.这可以通过用 entry() 替换 get()insert() 来实现:

Avoiding double lookups

Now are left with exactly two calls to clone(), one in each code path. This is optimal, as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of cache miss, the hash table lookup will effectively happen twice for the same key: once in the call to HashMap::get, and then once more in HashMap::insert. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing get() and insert() with entry():

// avoids the second lookup on cache miss
fn value(&mut self, arg: K) -> V {
    match self.values.entry(arg) {
        Entry::Occupied(entry) => entry.into_mut(),
        Entry::Vacant(entry) => {
            let v = (self.calculation)(entry.key());
            entry.insert(v)
        }
    }.clone()
}

我们还借此机会在比赛结束后移动了 .clone() 调用.

We've also taken the opportunity to move the .clone() call after the match.

可运行示例 在操场上.

这篇关于如何从缓存任意结果的结构中删除过多的 `clone` 调用?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆