Using str and String interchangeably


Question

Suppose I'm trying to write a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:

fn main() {
    let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();

    for t in v.iter_mut() {
        if (t.contains("$world")) {
            *t = &t.replace("$world", "Earth");
        }
    }

    println!("{:?}", &v);
}

But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type that means "ideally a &str, but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?

Recommended answer

Rust has exactly what you want in the form of the Cow (clone-on-write) type.

use std::borrow::Cow;

fn main() {
    let mut v: Vec<_> = "Hello there $world!".split_whitespace()
                                             .map(|s| Cow::Borrowed(s))
                                             .collect();

    for t in v.iter_mut() {
        if t.contains("$world") {
            *t.to_mut() = t.replace("$world", "Earth");
        }
    }

    println!("{:?}", &v);
}
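The same idea can also be wrapped in a small helper so that the borrow-or-own decision lives in one place. This is only a sketch: the function name substitute is hypothetical and not part of the original answer. It returns Cow::Borrowed for tokens that need no change, so only substituted tokens allocate.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not from the original answer): returns the token
/// borrowed and unchanged when it contains no "$world", otherwise an owned
/// copy with the substitution applied.
fn substitute(token: &str) -> Cow<'_, str> {
    if token.contains("$world") {
        Cow::Owned(token.replace("$world", "Earth"))
    } else {
        Cow::Borrowed(token)
    }
}

fn main() {
    let v: Vec<Cow<str>> = "Hello there $world!"
        .split_whitespace()
        .map(substitute)
        .collect();

    // Tokens without the variable stay borrowed; only the last one allocates.
    assert!(matches!(v[0], Cow::Borrowed(_)));
    assert!(matches!(v[2], Cow::Owned(_)));

    println!("{:?}", v);
}
```

Because the decision is made while the vector is built, no element is ever cloned just to be overwritten.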

As @sellibitze correctly notes, calling to_mut() on a borrowed value first creates a new String, which costs a heap allocation just to copy borrowed text that the assignment then overwrites. If you are sure you only have borrowed strings, you can instead use:

*t = Cow::Owned(t.replace("$world", "Earth"));


If the Vec contains Cow::Owned elements, this still throws away their existing allocation. You can avoid that by putting the following very fragile and unsafe code inside your for loop. It manipulates the bytes of a UTF-8 string directly and relies on the fact that "Earth" has exactly as many bytes as "world", the pattern without its $ sign:

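let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
    let p = pos + last_pos; // find returns an offset relative to last_pos
    last_pos = p + 5; // resume right after the 5 bytes of "Earth"
    unsafe {
        // Sound only because every byte written below is ASCII,
        // so the buffer stays valid UTF-8.
        let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
        s.remove(p); // remove $ sign
        // overwrite "world" with "Earth", byte by byte
        for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
            *sc = c;
        }
    }
}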

Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mapping requires careful consideration inside the unsafe code.
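For mappings of arbitrary length, a safe alternative is to edit the owned String with String::replace_range. The sketch below is not from the original answer, and the helper name replace_in_place is hypothetical. It reuses the existing buffer when the new text fits within its capacity, and it clones a borrowed value only once a match is actually found.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces every occurrence of `from` in `t` with `to`,
/// editing the existing String in place using safe APIs only.
fn replace_in_place(t: &mut Cow<'_, str>, from: &str, to: &str) {
    if from.is_empty() {
        return; // an empty pattern would match forever
    }
    let mut start = 0;
    while let Some(pos) = t[start..].find(from) {
        let p = start + pos; // `find` returns an offset relative to `start`
        // `to_mut` clones only if the value is still borrowed.
        t.to_mut().replace_range(p..p + from.len(), to);
        start = p + to.len(); // continue after the inserted text
    }
}

fn main() {
    let mut owned: Cow<str> = Cow::Owned(String::from("$world and $world"));
    replace_in_place(&mut owned, "$world", "Earth");
    assert_eq!(owned, "Earth and Earth");

    // A token without the variable is never cloned.
    let mut untouched: Cow<str> = Cow::Borrowed("Hello");
    replace_in_place(&mut untouched, "$world", "Earth");
    assert!(matches!(untouched, Cow::Borrowed(_)));
}
```

Each replace_range call shifts the tail of the string, so this trades some copying for safety. The String may still reallocate if a replacement grows it beyond its capacity.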
