如何在安全的 Rust 中表达相互递归的数据结构? [英] How do I express mutually recursive data structures in safe Rust?

查看:27
本文介绍了如何在安全的 Rust 中表达相互递归的数据结构?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在 Rust 中实现类似场景图的数据结构.我想要一个等效于用 safe Rust 表示的 C++ 代码:

struct Node{节点*父;//应该是可变的,并且可以为空(没有父级)std::vector<节点*>孩子;虚拟 ~Node(){for(auto it = child.begin(); it != child.end(); ++it){删除它;}}void addNode(Node* newNode){if (newNode->parent){newNode->parent.child.erase(newNode->parent.child.find(newNode));}newNode->parent = this;child.push_back(newNode);}}

我想要的属性:

  • 父级拥有其子级的所有权
  • 节点可以通过某种引用从外部访问
  • 接触一个Node的事件可能会改变整棵树

解决方案

Rust 试图通过禁止您做可能不安全的事情来确保内存安全.由于此分析是在编译时执行的,因此编译器只能对已知安全的操作子集进行推理.

在 Rust 中,您可以轻松地存储对父节点的引用(通过借用父节点,从而防止突变)子节点列表(通过拥有它们,这给了你更多的自由),但不是两者(不使用 unsafe).这对于您的 addNode 实现来说尤其成问题,它需要对给定节点的父节点进行可变访问.但是,如果您将 parent 指针存储为可变引用,那么,由于一次可能只能使用对特定对象的单个可变引用,因此访问父对象的唯一方法是通过子节点,你只能有一个子节点,否则你会有两个对同一个父节点的可变引用.

如果你想避免不安全的代码,有很多选择,但它们都需要一些牺牲.

<小时>

最简单的解决方案是简单地删除 parent 字段.我们可以定义一个单独的数据结构来在遍历树时记住节点的父节点,而不是将其存储在节点本身中.

首先,让我们定义Node:

#[派生(调试)]结构节点{数据:T,孩子:Vec<Node<T>,}实施<T>节点{fn new(data: T) ->节点{节点{数据:数据,孩子:vec![]}}fn add_child(&mut self, child: Node) {self.children.push(child);}}

(我添加了一个 data 字段,因为如果节点没有数据,树不是很有用!)

现在让我们定义另一个结构来在我们导航时跟踪父级:

#[派生(调试)]struct NavigableNode<'a, T:'a>{节点:&'a节点<T>,父级:选项<&'a NavigableNode<'a, T>>,}impl'a,T>NavigableNode<'a,T>{fn child(&self, index: usize) ->NavigableNode{导航节点{节点:&self.node.children[index],父母:一些(自己)}}}实施<T>节点{fnnavigate<'a>(&'a self) ->NavigableNode{NavigableNode { 节点:自我,父:无}}}

如果您在导航时不需要改变树,并且您可以保留父 NavigableNode 对象(这对于递归算法工作正常,但不是如果您想将 NavigableNode 存储在其他一些数据结构中并保留它,则效果很好).第二个限制可以通过使用借用指针以外的东西来存储父对象来缓解;如果你想要最大的通用性,你可以使用 Borrow trait 允许直接值、借用指针、Boxes、Rc

<小时>

现在,让我们谈谈拉链.在函数式编程中,zippers 用于关注"数据结构的特定元素(列表、树、映射等),以便访问该元素需要恒定的时间,同时仍保留该数据结构的所有数据.如果您需要导航树并在导航过程中变异,同时保留向上导航树的能力,那么您可以将树变成拉链并通过拉链执行修改.>

以下是我们为上面定义的 Node 实现拉链的方法:

#[derive(Debug)]struct NodeZipper;{节点:节点,父级:Option>,index_in_parent:使用,}实施<T>NodeZipper<T>{fn child(mut self, index: usize) ->NodeZipper<T>{//从节点的子节点中移除指定的子节点.//NodeZipper 不应该让用户检查它的父节点,//因为我们改变了父母//将焦点节点移出其子节点列表.//我们使用 swap_remove() 来提高效率.让孩子 = self.node.children.swap_remove(index);//返回一个新的 NodeZipper,专注于指定的孩子.节点拉链{节点:孩子,父母:一些(盒子::新(自我)),index_in_parent:索引,}}fn 父母(自我)->NodeZipper<T>{//解构这个 NodeZipper让 NodeZipper { node, parent, index_in_parent } = self;//解构父NodeZipper让 NodeZipper {节点:mut parent_node,父母:parent_parent,index_in_parent: parent_index_in_parent,} = *parent.unwrap();//将此 NodeZipper 的节点插入回其父节点.//由于我们使用了 swap_remove() 来移除孩子,//我们需要做相反的事情.parent_node.children.push(节点);让 len = parent_node.children.len();parent_node.children.swap(index_in_parent, len - 1);//返回一个专注于父节点的新 NodeZipper.节点拉链{节点:parent_node,父母:parent_parent,index_in_parent: parent_index_in_parent,}}fn 完成(mut self)->节点{而让 Some(_) = self.parent {self = self.parent();}自节点}}实施<T>节点{fn 拉链(自我)->NodeZipper<T>{NodeZipper { node: self, parent: None, index_in_parent: 0 }}}

要使用这个拉链,你需要拥有树的根节点的所有权.通过取得节点的所有权,拉链可以移动事物以避免复制或克隆节点.当我们移动一个拉链时,我们实际上是放下旧的拉链并创建一个新的拉链(虽然我们也可以通过改变 self 来做到这一点,但我认为这样更清晰,而且它可以让你链接方法电话).

<小时>

如果以上选项都不令人满意,并且必须绝对将节点的父节点存储在节点中,那么下一个最佳选择是使用Rc>>Rc> 引用父节点,Weak>> 引用子节点.Rc 启用共享所有权,但增加了在运行时执行引用计数的开销.RefCell 启用内部可变性,但会增加开销以在运行时跟踪活动借用.Weak 就像 Rc,但它不会增加引用计数;这用于中断引用循环,这将防止引用计数降至零,从而导致内存泄漏.有关使用 RcWeak 的实现,请参阅 DK. 的答案> 和 RefCell.

I am trying to implement a scenegraph-like data structure in Rust. I would like an equivalent to this C++ code expressed in safe Rust:

struct Node
{
    Node* parent; // should be mutable, and nullable (no parent)
    std::vector<Node*> child;

    virtual ~Node() 
    { 
        for(auto it = child.begin(); it != child.end(); ++it)
        {
            delete *it;
        }
    }

    void addNode(Node* newNode)
    {
        if (newNode->parent)
        {
            newNode->parent.child.erase(newNode->parent.child.find(newNode));
        }
        newNode->parent = this;
        child.push_back(newNode);
    }
}

Properties I want:

  • the parent takes ownership of its children
  • the nodes are accessible from the outside via a reference of some kind
  • events that touch one Node can potentially mutate the whole tree

解决方案

Rust tries to ensure memory safety by forbidding you from doing things that might potentially be unsafe. Since this analysis is performed at compile-time, the compiler can only reason about a subset of manipulations that are known to be safe.

In Rust, you could easily store either a reference to the parent (by borrowing the parent, thus preventing mutation) or the list of child nodes (by owning them, which gives you more freedom), but not both (without using unsafe). This is especially problematic for your implementation of addNode, which requires mutable access to the given node's parent. However, if you store the parent pointer as a mutable reference, then, since only a single mutable reference to a particular object may be usable at a time, the only way to access the parent would be through a child node, and you'd only be able to have a single child node, otherwise you'd have two mutable references to the same parent node.

If you want to avoid unsafe code, there are many alternatives, but they'll all require some sacrifices.


The easiest solution is to simply remove the parent field. We can define a separate data structure to remember the parent of a node while we traverse a tree, rather than storing it in the node itself.

First, let's define Node:

#[derive(Debug)]
struct Node<T> {
    data: T,
    children: Vec<Node<T>>,
}

impl<T> Node<T> {
    fn new(data: T) -> Node<T> {
        Node { data: data, children: vec![] }
    }

    fn add_child(&mut self, child: Node<T>) {
        self.children.push(child);
    }
}

(I added a data field because a tree isn't super useful without data at the nodes!)

Let's now define another struct to track the parent as we navigate:

#[derive(Debug)]
struct NavigableNode<'a, T: 'a> {
    node: &'a Node<T>,
    parent: Option<&'a NavigableNode<'a, T>>,
}

impl<'a, T> NavigableNode<'a, T> {
    fn child(&self, index: usize) -> NavigableNode<T> {
        NavigableNode {
            node: &self.node.children[index],
            parent: Some(self)
        }
    }
}

impl<T> Node<T> {
    fn navigate<'a>(&'a self) -> NavigableNode<T> {
        NavigableNode { node: self, parent: None }
    }
}

This solution works fine if you don't need to mutate the tree as you navigate it and you can keep the parent NavigableNode objects around (which works fine for a recursive algorithm, but doesn't work too well if you want to store a NavigableNode in some other data structure and keep it around). The second restriction can be alleviated by using something other than a borrowed pointer to store the parent; if you want maximum genericity, you can use the Borrow trait to allow direct values, borrowed pointers, Boxes, Rc's, etc.


Now, let's talk about zippers. In functional programming, zippers are used to "focus" on a particular element of a data structure (list, tree, map, etc.) so that accessing that element takes constant time, while still retaining all the data of that data structure. If you need to navigate your tree and mutate it during the navigation, while retaining the ability to navigate up the tree, then you could turn a tree into a zipper and perform the modifications through the zipper.

Here's how we could implement a zipper for the Node defined above:

#[derive(Debug)]
struct NodeZipper<T> {
    node: Node<T>,
    parent: Option<Box<NodeZipper<T>>>,
    index_in_parent: usize,
}

impl<T> NodeZipper<T> {
    fn child(mut self, index: usize) -> NodeZipper<T> {
        // Remove the specified child from the node's children.
        // A NodeZipper shouldn't let its users inspect its parent,
        // since we mutate the parents
        // to move the focused nodes out of their list of children.
        // We use swap_remove() for efficiency.
        let child = self.node.children.swap_remove(index);

        // Return a new NodeZipper focused on the specified child.
        NodeZipper {
            node: child,
            parent: Some(Box::new(self)),
            index_in_parent: index,
        }
    }

    fn parent(self) -> NodeZipper<T> {
        // Destructure this NodeZipper
        let NodeZipper { node, parent, index_in_parent } = self;

        // Destructure the parent NodeZipper
        let NodeZipper {
            node: mut parent_node,
            parent: parent_parent,
            index_in_parent: parent_index_in_parent,
        } = *parent.unwrap();

        // Insert the node of this NodeZipper back in its parent.
        // Since we used swap_remove() to remove the child,
        // we need to do the opposite of that.
        parent_node.children.push(node);
        let len = parent_node.children.len();
        parent_node.children.swap(index_in_parent, len - 1);

        // Return a new NodeZipper focused on the parent.
        NodeZipper {
            node: parent_node,
            parent: parent_parent,
            index_in_parent: parent_index_in_parent,
        }
    }

    fn finish(mut self) -> Node<T> {
        while let Some(_) = self.parent {
            self = self.parent();
        }

        self.node
    }
}

impl<T> Node<T> {
    fn zipper(self) -> NodeZipper<T> {
        NodeZipper { node: self, parent: None, index_in_parent: 0 }
    }
}

To use this zipper, you need to have ownership of the root node of the tree. By taking ownership of the nodes, the zipper can move things around in order to avoid copying or cloning nodes. When we move a zipper, we actually drop the old zipper and create a new one (though we could also do it by mutating self, but I thought it was clearer that way, plus it lets you chain method calls).


If the above options are not satisfactory, and you must absolutely store the parent of a node in a node, then the next best option is to use Rc<RefCell<Node<T>>> to refer to the parent and Weak<RefCell<Node<T>>> to the children. Rc enables shared ownership, but adds overhead to perform reference counting at runtime. RefCell enables interior mutability, but adds overhead to keep track of the active borrows at runtime. Weak is like Rc, but it doesn't increment the reference count; this is used to break reference cycles, which would prevent the reference count from dropping to zero, causing a memory leak. See DK.'s answer for an implementation using Rc, Weak and RefCell.

这篇关于如何在安全的 Rust 中表达相互递归的数据结构?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆