Using str and String interchangeably
Question
Suppose I'm trying to write a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
    let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
    for t in v.iter_mut() {
        if (t.contains("$world")) {
            *t = &t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str, but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Recommended answer
Rust has exactly what you want in the form of the Cow (clone-on-write) type.
use std::borrow::Cow;

fn main() {
    let mut v: Vec<_> = "Hello there $world!".split_whitespace()
        .map(|s| Cow::Borrowed(s))
        .collect();
    for t in v.iter_mut() {
        if t.contains("$world") {
            *t.to_mut() = t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
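To make the borrowed/owned distinction visible, here is a minimal sketch that does the substitution while building the Vec instead of afterwards. The helper name substitute is hypothetical and not part of the original answer; it only illustrates that a Cow stays borrowed until a token actually has to change.

```rust
use std::borrow::Cow;

// Hypothetical helper: borrow the token unchanged unless it contains the variable.
fn substitute(token: &str) -> Cow<'_, str> {
    if token.contains("$world") {
        // Only this branch allocates a new String.
        Cow::Owned(token.replace("$world", "Earth"))
    } else {
        // No copy: the Cow just wraps the original slice.
        Cow::Borrowed(token)
    }
}

fn main() {
    let v: Vec<Cow<str>> = "Hello there $world!"
        .split_whitespace()
        .map(substitute)
        .collect();

    // Report which tokens were copied and which still point into the input.
    for t in &v {
        let kind = match t {
            Cow::Borrowed(_) => "borrowed",
            Cow::Owned(_) => "owned",
        };
        println!("{:?} is {}", t, kind);
    }
}
```

Only the token containing $world ends up as Cow::Owned; the other tokens remain zero-copy slices of the input.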
As @sellibitze correctly notes, calling to_mut() on a Cow::Borrowed value first clones the borrowed text into a new String, a heap allocation that is immediately overwritten by the assignment. If you are sure you only have borrowed strings, you can avoid the wasted copy by assigning a new owned value directly:
*t = Cow::Owned(t.replace("$world", "Earth"));
If the Vec already contains Cow::Owned elements, this still throws away their existing allocation. You can avoid that with the following very fragile and unsafe code, placed inside your for loop. It manipulates the UTF-8 bytes of the string directly and relies on the fact that the replacement "Earth" has exactly the same number of bytes as "world", the variable name without the $ sign.
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
    let p = pos + last_pos; // find returns an offset relative to last_pos
    last_pos = p + 5; // continue after the 5 bytes of "Earth" written below
    unsafe {
        // SAFETY: only ASCII bytes are removed and overwritten, so the buffer stays valid UTF-8
        let s = t.to_mut().as_mut_vec(); // operating on the Vec<u8> is easier
        s.remove(p); // remove the $ sign
        for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
            *sc = c;
        }
    }
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
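For comparison, here is a sketch of a safe variant that works for any mapping, assuming String::replace_range is acceptable for your use case. The helper name substitute_in_place and its parameters are hypothetical, not part of the original answer. An owned String is edited in its existing buffer (it may still reallocate if the replacement is longer than the variable), while a borrowed value is cloned once on the first match.

```rust
use std::borrow::Cow;

// Hypothetical helper: replace every `var` in `t` with `value`, editing in place.
fn substitute_in_place(t: &mut Cow<'_, str>, var: &str, value: &str) {
    if var.is_empty() {
        return; // an empty pattern would match forever
    }
    let mut last_pos = 0;
    while let Some(pos) = t[last_pos..].find(var) {
        let p = last_pos + pos; // find returns an offset relative to last_pos
        // to_mut() clones a borrowed value once; an owned String is reused.
        t.to_mut().replace_range(p..p + var.len(), value);
        last_pos = p + value.len(); // resume the search after the inserted text
    }
}

fn main() {
    let mut v: Vec<Cow<str>> = "Hello there $world!"
        .split_whitespace()
        .map(Cow::Borrowed)
        .collect();
    for t in v.iter_mut() {
        substitute_in_place(t, "$world", "Earth");
    }
    println!("{:?}", v); // prints ["Hello", "there", "Earth!"]
}
```

Tokens without a match are never converted, so they stay Cow::Borrowed.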