Can I avoid cloning Strings by holding a reference instead?

Question

I have a data structure in which I am providing a wrapper around a read buffer to automatically handle repeating statements in the readout.

This is done by storing an internal state of how many repeats are left and the line to be repeated.

use std::fs::File;
use std::path::Path;
use std::io::BufReader;
use std::io::prelude::*;
use std::io::Error;
use std::num::NonZeroU32;
use std::mem;

pub struct Reader {
    handle: BufReader<File>,
    repeat_state: Option<(NonZeroU32, String)>,
}

impl Reader {
    pub fn new<P: AsRef<Path>>(path: P) -> Result<Reader, Error> {
        let file = File::open(path)?;
        let handle = BufReader::new(file);

        Ok(Reader {
            handle,
            repeat_state: None,
        })
    }

    /// get next line, respecting repeat instructions
    pub fn next_line(&mut self) -> Option<String> {
        if self.repeat_state.is_some() {
            let (repeats_left, last_line) = mem::replace(&mut self.repeat_state, None).unwrap();

            self.repeat_state = NonZeroU32::new(repeats_left.get() - 1)
                .map(|repeats_left| (repeats_left, last_line.clone()));

            Some(last_line)
        } else {
            let mut line = String::new();
            if self.handle.read_line(&mut line).is_err() || line.is_empty() {
                return None
            }

            if line.starts_with("repeat ") {
                let repeats: Option<u32> = line.chars().skip(7)
                    .take_while(|c| c.is_numeric())
                    .collect::<String>().parse().ok();

                self.repeat_state = repeats
                    // saturating_sub keeps a `repeat 0` line from underflowing
                    .and_then(|repeats| NonZeroU32::new(repeats.saturating_sub(1)))
                    .map(|repeats_left| (repeats_left, line.clone()))
            }

            Some(line)
        }
    }
}

#[test]
fn test_next_line() {
    let source = "
line one
repeat 2    line two and line three
line four
repeat 11   lines 5-15
line 16
line 17
last line (18)
    ".trim();
    let mut input = File::create("file.txt").unwrap();
    write!(input, "{}", source).unwrap();


    let mut read = Reader::new("file.txt").unwrap();
    assert_eq!(
        read.next_line(),
        Some("line one\n".to_string())
    );
    assert_eq!(
        read.next_line(),
        Some("repeat 2    line two and line three\n".to_string())
    );
    assert_eq!(
        read.next_line(),
        Some("repeat 2    line two and line three\n".to_string())
    );
    assert_eq!(
        read.next_line(),
        Some("line four\n".to_string())
    );

    for _ in 5..=15 {
        assert_eq!(
            read.next_line(),
            Some("repeat 11   lines 5-15\n".to_string())
        );
    }

    assert_eq!(
        read.next_line(),
        Some("line 16\n".to_string())
    );
    assert_eq!(
        read.next_line(),
        Some("line 17\n".to_string())
    );
    assert_eq!(
        read.next_line(),
        Some("last line (18)".to_string())
    );
}

The problem is that I have to clone the held repeated value every time in order to both hold onto it and return it. I want to avoid these costly clones by returning (and maybe storing) a &str. I've tried several things, but was unable to get it to work:

  • Storing String, returning &str: "does not live long enough" lifetime errors
  • Storing &str, returning &str: same lifetime errors
  • Cow<&str>
  • Box<&str>

These clones are the bottleneck of my program at the moment, according to the CodeXL time-based sampling profiler after building in release mode with debug info. Now, my program is plenty fast as it is, but I'm wondering if there is a way of avoiding them.

Recommended answer

You can avoid cloning the strings by wrapping them in an Rc and cloning that instead. Cloning an Rc is cheap since it consists of incrementing a counter:

// Needed in addition to the imports from the question.
use std::rc::Rc;

pub struct Reader {
    handle: BufReader<File>,
    repeat_state: Option<(NonZeroU32, Rc<String>)>,
}

impl Reader {
    pub fn new<P: AsRef<Path>>(path: P) -> Result<Reader, Error> {
        let file = File::open(path)?;
        let handle = BufReader::new(file);

        Ok(Reader {
            handle,
            repeat_state: None,
        })
    }

    /// get next line, respecting repeat instructions
    pub fn next_line(&mut self) -> Option<Rc<String>> {
        if self.repeat_state.is_some() {
            let (repeats_left, last_line) = mem::replace(&mut self.repeat_state, None).unwrap();

            self.repeat_state = NonZeroU32::new(repeats_left.get() - 1)
                .map(|repeats_left| (repeats_left, last_line.clone()));

            Some(last_line)
        } else {
            let mut line = Rc::new(String::new());
            if self.handle.read_line(Rc::make_mut(&mut line)).is_err() || line.is_empty() {
                return None
            }

            if line.starts_with("repeat ") {
                let repeats: Option<u32> = line.chars().skip(7)
                    .take_while(|c| c.is_numeric())
                    .collect::<String>().parse().ok();

                self.repeat_state = repeats
                    // saturating_sub keeps a `repeat 0` line from underflowing
                    .and_then(|repeats| NonZeroU32::new(repeats.saturating_sub(1)))
                    .map(|repeats_left| (repeats_left, line.clone()))
            }

            Some(line)
        }
    }
}
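
To see why cloning the Rc is cheap, here is a minimal standalone sketch. The helper `share_line` is hypothetical and not part of the Reader above; it only illustrates that every clone points at the same allocation and that the only bookkeeping is the reference count:

```rust
use std::rc::Rc;

/// Hands out `n` handles to one line without copying its text.
/// (`share_line` is a hypothetical helper, not part of the answer's `Reader`.)
fn share_line(line: String, n: usize) -> Vec<Rc<String>> {
    let shared = Rc::new(line);
    // Each `Rc::clone` bumps the reference count; the text itself is not copied.
    (0..n).map(|_| Rc::clone(&shared)).collect()
}

fn main() {
    let handles = share_line("repeat 3   some line\n".to_string(), 3);

    // All handles point at the same heap allocation.
    assert!(Rc::ptr_eq(&handles[0], &handles[2]));
    // `shared` inside `share_line` was dropped, so three owners remain.
    assert_eq!(Rc::strong_count(&handles[0]), 3);
    // `Rc<String>` derefs to `String`, so string methods work directly.
    assert!(handles[1].starts_with("repeat "));
}
```

Callers that only read the line can treat an `Rc<String>` much like a `&String`, since it dereferences automatically.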

Note that Rc cannot be shared between multiple threads. If you want to share the strings between threads, you can use Arc instead.
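
As a rough sketch of the Arc variant, assuming the lines are handed to worker threads (the helper name `line_lengths_in_threads` is made up for illustration), the pattern looks the same except that the count is updated atomically:

```rust
use std::sync::Arc;
use std::thread;

/// Sends one shared line to `workers` threads and returns the length each saw.
/// (`line_lengths_in_threads` is a hypothetical helper for illustration.)
fn line_lengths_in_threads(line: Arc<String>, workers: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the `Arc` atomically bumps the count; the text is not copied.
            let line = Arc::clone(&line);
            thread::spawn(move || line.len())
        })
        .collect();

    // Wait for every worker and collect what it computed.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let line = Arc::new("repeat 2    line two and line three\n".to_string());
    let lengths = line_lengths_in_threads(Arc::clone(&line), 4);
    assert_eq!(lengths, vec![line.len(); 4]);
}
```

Swapping `Arc` for `Rc` in this sketch would be rejected by the compiler, because `Rc` does not implement `Send` and so cannot be moved into `thread::spawn`.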
