Is it possible to map a function over a Vec without allocating a new Vec?
Problem description
I have the following:
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    // very complicated transformation, reusing parts of x in order to produce result:
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust, but it was removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
    let original = ::std::mem::replace(x, SomeType::Dummy);
    *x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Recommended answer
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
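To make the size argument concrete, here is a minimal sketch using std::mem::size_of. The helper names bytes_needed and widen are hypothetical and exist only for illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: number of bytes needed to store `len` values of type `T`.
fn bytes_needed<T>(len: usize) -> usize {
    len * size_of::<T>()
}

/// Mapping u8 -> u32 produces elements four times as large, so the results
/// cannot simply be written back over the slots that held the inputs.
fn widen(input: Vec<u8>) -> Vec<u32> {
    input.into_iter().map(u32::from).collect()
}
```

Three u8 values occupy 3 bytes, while the three u32 results need 12, so the original buffer is too small to hold the mapped elements.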
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; it is only 3 pointer-sized values and no heap allocation:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;

        let old = std::mem::replace(self, VariantA(String::new()));

        // Note this line for the detailed explanation

        *self = match old {
            VariantA(s) => VariantB(s, 0),
            VariantB(s, i) => VariantB(s, 2 * i),
        };
    }
}

for x in &mut data {
    x.transform();
}
An alternate implementation that replaces only the String:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;

        *self = match self {
            VariantA(s) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 0)
            }
            VariantB(s, i) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 2 * *i)
            }
        };
    }
}
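Both versions above rely on an empty String as a cheap placeholder. The same idea can be packaged generically with std::mem::take, which swaps in Default::default(). The sketch below assumes that an empty VariantA is an acceptable default for SomeType; the helper name map_in_place is hypothetical and is not the removed Vec::map_in_place API.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

// Assumption: an empty VariantA is a harmless placeholder value.
impl Default for SomeType {
    fn default() -> Self {
        SomeType::VariantA(String::new())
    }
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

/// Hypothetical helper: applies `f` to every element, writing each result
/// back into the same slot. No new Vec is allocated.
fn map_in_place<T: Default>(items: &mut [T], mut f: impl FnMut(T) -> T) {
    for slot in items.iter_mut() {
        // `mem::take` moves the element out and leaves `T::default()` behind,
        // so the slot always holds a valid value, even if `f` panics.
        let old = mem::take(slot);
        *slot = f(old);
    }
}
```

Calling map_in_place(&mut data, transform) on a Vec<SomeType> updates the elements where they are, and the Vec keeps its existing buffer.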
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
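A minimal sketch of the Option approach, assuming the data is stored as a Vec<Option<SomeType>>. The function name transform_all is hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

/// Hypothetical helper: transforms every `Some` element in place.
fn transform_all(data: &mut [Option<SomeType>]) {
    for slot in data.iter_mut() {
        // `take` moves the value out and leaves `None` in the slot,
        // so `None` plays the role of the dummy value.
        if let Some(old) = slot.take() {
            *slot = Some(transform(old));
        }
    }
}
```

The trade-off is that every other use of the data now has to handle None, which is the same burden as the Dummy variant, only expressed through a standard type.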
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a time period where your value would have an undefined value, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value: to always have a known-good value stored.
You even hinted at this (emphasis mine):
"I have to deal with SomeType::Dummy everywhere else in the code, although it *should* never be visible outside of this loop"
That "should" is the problem. During a panic, that dummy value is visible.
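The following sketch illustrates that visibility. It assumes the default unwinding panic strategy (not panic = "abort") and uses a hypothetical fallible_transform that panics on the input "boom"; std::panic::catch_unwind then lets the caller inspect the data after the panic.

```rust
use std::mem;
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

// Hypothetical transformation that panics on one particular input.
fn fallible_transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) if s == "boom" => panic!("transformation failed"),
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

fn transform_all(data: &mut [SomeType]) {
    for x in data.iter_mut() {
        // The empty VariantA is the known-good placeholder.
        let old = mem::replace(x, SomeType::VariantA(String::new()));
        // If this call panics, `*x` still holds the placeholder.
        *x = fallible_transform(old);
    }
}

/// Returns the data as it looks after a panic interrupted the loop.
fn observe_after_panic() -> Vec<SomeType> {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("boom".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
    let result = panic::catch_unwind(AssertUnwindSafe(|| transform_all(&mut data)));
    assert!(result.is_err());
    data
}
```

After the panic, the first element has been transformed, the second holds the empty placeholder, and the third is untouched. Every slot contains a valid value that can be dropped safely, which is exactly what the placeholder guarantees.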
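The now-removed implementation of Vec::map_in_place spanned almost 175 lines of code, most of which had to deal with unsafe code and with reasoning about why it was actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.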