如何在安全Rust中表达相互递归的数据结构? [英] How do I express mutually recursive data structures in safe Rust?

查看:124
本文介绍了如何在安全Rust中表达相互递归的数据结构?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在Rust中实现类似于场景图的数据结构。我想要一个等效于用 safe Rust表示的C ++代码:

I am trying to implement a scenegraph-like data structure in Rust. I would like an equivalent to this C++ code expressed in safe Rust:

struct Node
{
    Node* parent; // should be mutable, and nullable (no parent)
    std::vector<Node*> child;

    virtual ~Node() 
    { 
        for(auto it = child.begin(); it != child.end(); ++it)
        {
            delete *it;
        }
    }

    void addNode(Node* newNode)
    {
        if (newNode->parent)
        {
            newNode->parent.child.erase(newNode->parent.child.find(newNode));
        }
        newNode->parent = this;
        child.push_back(newNode);
    }
}

我想要的属性:


  • 父母拥有其孩子的所有权

  • 可通过某种引用从外部访问节点

  • 碰到一个 Node 的事件可能会改变整棵树

  • the parent takes ownership of its children
  • the nodes are accessible from the outside via a reference of some kind
  • events that touch one Node can potentially mutate the whole tree

推荐答案

Rust试图通过禁止您执行可能不安全的操作来确保内存安全。由于此分析是在编译时执行的,因此编译器只能推断出已知安全的一部分操作。

Rust tries to ensure memory safety by forbidding you from doing things that might potentially be unsafe. Since this analysis is performed at compile-time, the compiler can only reason about a subset of manipulations that are known to be safe.

在Rust中,您可以轻松存储或者对父级的引用(通过借用父级,从而防止突变)子级节点列表(通过拥有它们,这会给您更多自由),但是不是全部(不使用不安全)。这对于 addNode 的实现尤其成问题,因为它需要对给定节点的父级进行可变访问。但是,如果将 parent 指针存储为可变引用,则由于一次只能使用对特定对象的单个可变引用,因此访问该唯一方法父节点将通过一个子节点,并且您只能有一个子节点,否则您将有两个对同一父节点的可变引用。

In Rust, you could easily store either a reference to the parent (by borrowing the parent, thus preventing mutation) or the list of child nodes (by owning them, which gives you more freedom), but not both (without using unsafe). This is especially problematic for your implementation of addNode, which requires mutable access to the given node's parent. However, if you store the parent pointer as a mutable reference, then, since only a single mutable reference to a particular object may be usable at a time, the only way to access the parent would be through a child node, and you'd only be able to have a single child node, otherwise you'd have two mutable references to the same parent node.

如果您想避免使用不安全的代码,可以有很多选择,但是它们都会需要一些牺牲。

If you want to avoid unsafe code, there are many alternatives, but they'll all require some sacrifices.

最简单的解决方案是只删除 parent 字段。我们可以定义一个单独的数据结构,以在遍历树时记住节点的父级,而不是将其存储在节点本身中。

The easiest solution is to simply remove the parent field. We can define a separate data structure to remember the parent of a node while we traverse a tree, rather than storing it in the node itself.

首先,让我们定义 Node

#[derive(Debug)]
struct Node<T> {
    data: T,
    children: Vec<Node<T>>,
}

impl<T> Node<T> {
    fn new(data: T) -> Node<T> {
        Node { data: data, children: vec![] }
    }

    fn add_child(&mut self, child: Node<T>) {
        self.children.push(child);
    }
}

(我添加了数据字段,因为如果没有节点处的数据,树就没有多用!)

(I added a data field because a tree isn't super useful without data at the nodes!)

现在让我们定义另一个结构以在导航时跟踪父级:

Let's now define another struct to track the parent as we navigate:

#[derive(Debug)]
struct NavigableNode<'a, T: 'a> {
    node: &'a Node<T>,
    parent: Option<&'a NavigableNode<'a, T>>,
}

impl<'a, T> NavigableNode<'a, T> {
    fn child(&self, index: usize) -> NavigableNode<T> {
        NavigableNode {
            node: &self.node.children[index],
            parent: Some(self)
        }
    }
}

impl<T> Node<T> {
    fn navigate<'a>(&'a self) -> NavigableNode<T> {
        NavigableNode { node: self, parent: None }
    }
}

如果您不需要在导航树时对其进行变异,并且可以保留父 NavigableNode 对象(对于一个递归算法,但是如果您想在其他数据结构中存储 NavigableNode 并保持不变),效果就不太好。第二种限制可以通过使用除借入指针之外的其他东西来存储父对象来缓解。如果您想获得最大的通用性,则可以使用 借用特征以允许直接值,借用指针, Box es, Rc 等。

This solution works fine if you don't need to mutate the tree as you navigate it and you can keep the parent NavigableNode objects around (which works fine for a recursive algorithm, but doesn't work too well if you want to store a NavigableNode in some other data structure and keep it around). The second restriction can be alleviated by using something other than a borrowed pointer to store the parent; if you want maximum genericity, you can use the Borrow trait to allow direct values, borrowed pointers, Boxes, Rc's, etc.

现在,让我们来谈谈拉链。在函数式编程中,拉链用于关注数据结构的特定元素(列表,树,地图等),以便访问该元素需要花费固定时间,同时仍保留该数据结构的所有数据。如果您需要导航树并在导航过程中对其进行变异,同时又保留了向上导航树的功能,则可以将树变成拉链并通过拉链进行修改。

Now, let's talk about zippers. In functional programming, zippers are used to "focus" on a particular element of a data structure (list, tree, map, etc.) so that accessing that element takes constant time, while still retaining all the data of that data structure. If you need to navigate your tree and mutate it during the navigation, while retaining the ability to navigate up the tree, then you could turn a tree into a zipper and perform the modifications through the zipper.

这是我们为上面定义的 Node 实施拉链的方法:

Here's how we could implement a zipper for the Node defined above:

#[derive(Debug)]
struct NodeZipper<T> {
    node: Node<T>,
    parent: Option<Box<NodeZipper<T>>>,
    index_in_parent: usize,
}

impl<T> NodeZipper<T> {
    fn child(mut self, index: usize) -> NodeZipper<T> {
        // Remove the specified child from the node's children.
        // A NodeZipper shouldn't let its users inspect its parent,
        // since we mutate the parents
        // to move the focused nodes out of their list of children.
        // We use swap_remove() for efficiency.
        let child = self.node.children.swap_remove(index);

        // Return a new NodeZipper focused on the specified child.
        NodeZipper {
            node: child,
            parent: Some(Box::new(self)),
            index_in_parent: index,
        }
    }

    fn parent(self) -> NodeZipper<T> {
        // Destructure this NodeZipper
        let NodeZipper { node, parent, index_in_parent } = self;

        // Destructure the parent NodeZipper
        let NodeZipper {
            node: mut parent_node,
            parent: parent_parent,
            index_in_parent: parent_index_in_parent,
        } = *parent.unwrap();

        // Insert the node of this NodeZipper back in its parent.
        // Since we used swap_remove() to remove the child,
        // we need to do the opposite of that.
        parent_node.children.push(node);
        let len = parent_node.children.len();
        parent_node.children.swap(index_in_parent, len - 1);

        // Return a new NodeZipper focused on the parent.
        NodeZipper {
            node: parent_node,
            parent: parent_parent,
            index_in_parent: parent_index_in_parent,
        }
    }

    fn finish(mut self) -> Node<T> {
        while let Some(_) = self.parent {
            self = self.parent();
        }

        self.node
    }
}

impl<T> Node<T> {
    fn zipper(self) -> NodeZipper<T> {
        NodeZipper { node: self, parent: None, index_in_parent: 0 }
    }
}

要使用此拉链,您需要拥有树的根节点的所有权。通过获得节点的所有权,拉链可以四处移动,以避免复制或克隆节点。当我们移动一个拉链时,我们实际上放下了旧的拉链并创建了一个新的拉链(尽管我们也可以通过对 self 进行突变来做到这一点,但是我认为这样更清楚,加上它可以让您链接方法调用。)

To use this zipper, you need to have ownership of the root node of the tree. By taking ownership of the nodes, the zipper can move things around in order to avoid copying or cloning nodes. When we move a zipper, we actually drop the old zipper and create a new one (though we could also do it by mutating self, but I thought it was clearer that way, plus it lets you chain method calls).

如果上述选项不令人满意,则必须绝对存储父级节点中的一个节点,那么下一个最佳选择是使用 Rc c来引用父节点,使用对孩子来说弱的RefCell< Node< T>> Rc 启用共享所有权,但增加了开销,以便在运行时执行引用计数。 RefCell 启用内部可变性,但增加了开销以在运行时跟踪活动借用。 类似于 Rc ,但是它不会增加引用计数。这用于中断参考周期,这将防止参考计数下降到零,从而导致内存泄漏。 有关使用 Rc 的实现,请参见DK的答案 RefCell

If the above options are not satisfactory, and you must absolutely store the parent of a node in a node, then the next best option is to use Rc<RefCell<Node<T>>> to refer to the parent and Weak<RefCell<Node<T>>> to the children. Rc enables shared ownership, but adds overhead to perform reference counting at runtime. RefCell enables interior mutability, but adds overhead to keep track of the active borrows at runtime. Weak is like Rc, but it doesn't increment the reference count; this is used to break reference cycles, which would prevent the reference count from dropping to zero, causing a memory leak. See DK.'s answer for an implementation using Rc, Weak and RefCell.

这篇关于如何在安全Rust中表达相互递归的数据结构?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆