Getting all leaves of a tree in a sorted order


Problem description

For a tree structure as follows:

public class Node implements Comparable<Node> {
    private List<Node> nodes=new ArrayList<Node>();
    private String name="";
    private List<String> leaves=new ArrayList<String>();
    private Node parent=null;

    public List<Node> getNodes() {
        return nodes;
    }

    public void setNodes(List<Node> nodes) {
        this.nodes = nodes;
    }

    public List<String> getLeaves() {
        return leaves;
    }

    public void setLeaves(List<String> leaves) {
        this.leaves = leaves;
    }

    @Override
    public int compareTo(Node o) {
        return this.getName().compareTo(o.getName());
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Node getParent() {
        return parent;
    }

    public void setParent(Node parent) {
        this.parent = parent;
    }

    public int getDepth() {
        int depth = 0;
        Node parent = this.getParent();
        while (parent != null) {
            depth++;
            parent = parent.getParent();
        }
        return depth;
    }
}

From a node, I want a method that returns all of its distinct direct and indirect leaves in sorted order (in the class above, the strings in the leaves list are the leaves).

The above is a heavily stripped-down data structure for easy testing and demonstration. I have tried the following three approaches.

Approach A: Very slow when the depth is large (~20), since the deepest leaves are traversed several times, once for each of their ancestors, so the same path is traversed multiple times.

    public List<String> getLeavesDeep1() {
        Set<String> leaves = new TreeSet<String>();
        leaves.addAll(getLeaves());
        for (Node node : getNodes()) {
            leaves.addAll(node.getLeavesDeep1());
        }
        return new ArrayList<String>(leaves);
    }

Avg: 12694 ms. Without sort/distinct: Avg: 471 ms

Approach B: A little faster than A. Since there are far fewer nodes than leaves, this applies approach A to the nodes instead, and then collects only the direct leaves of each node.

    private List<Node> getNodesDeep2() {
        Set<Node> nodes = new TreeSet<Node>();
        nodes.addAll(getNodes());
        for (Node node : getNodes()) {
            nodes.addAll(node.getNodesDeep2());
        }
        return new ArrayList<Node>(nodes);
    }

    public List<String> getLeavesDeep2() {
        Set<String> leaves = new TreeSet<String>();
        leaves.addAll(getLeaves());
        for (Node node : getNodesDeep2()) {
            leaves.addAll(node.getLeaves());
        }
        return new ArrayList<String>(leaves);
    }

Avg: 4355 ms. Without sort/distinct: Avg: 2406 ms

Approach C: Avoids TreeSet; uses ArrayLists and sorts and filters just before returning (though this is not the best way to sort/distinct).

    private List<Node> getNodesDeep3() {
        List<Node> nodes = new ArrayList<Node>();
        nodes.addAll(getNodes());
        for (Node node : getNodes()) {
            nodes.addAll(node.getNodesDeep3());
        }
        return new ArrayList<Node>(new TreeSet<Node>(nodes));
    }

    public List<String> getLeavesDeep3() {
        List<String> leaves = new ArrayList<String>();
        leaves.addAll(getLeaves());
        for (Node node : getNodesDeep3()) {
            leaves.addAll(node.getLeaves());
        }
        return new ArrayList<String>(new TreeSet<String>(leaves));
    }

Avg: 4400 ms

I am looking for something faster. I know there are certain tree traversals that can be used, but I would prefer something simpler if it exists.

P.S. There is no use case for searching at the moment. In my real class the times are much higher, approximately 3x those above, because the structure is much more complex and the leaves are not simple strings but POJOs.

The following is the test I used to get the times:

private static final int NODES = 5;
private static final int LEAVES = 25;
private static final int DEPTH = 8;

public void addChildren(Node parent) {
    List<Node> nodes = new ArrayList<Node>();
    List<String> leaves = new ArrayList<String>();
    for (int i = 0; i < LEAVES; i++) {
        leaves.add(String.format("%s_leaf_%s", parent.getName(), i));
    }
    for (int i = 0; i < NODES; i++) {
        Node child = new Node();
        child.setParent(parent);
        child.setName(String.format("%s_%s", parent.getName(), i));
        nodes.add(child);
        if (child.getDepth() < DEPTH) {
            addChildren(child);
        }
    }
    parent.setNodes(nodes);
    parent.setLeaves(leaves);
}

@Test
public void testCase() {
    long start, tot=0;
    long t = 0;
    List<String> leaves;
    Node target = new Node();
    target.setName("Root");
    addChildren(target);
    for (int i = 0; i < 10; i++) {
        start = System.currentTimeMillis();
        // substitute the method being timed here (e.g. getLeavesDeep1)
        leaves = target.getLeavesDeep5();
        t = System.currentTimeMillis() - start;
        tot += t;
        System.out.println(leaves.size() + " " + t);
    }

    System.out.println("Avg: " + (tot / 10));
}

Answers in any language are acceptable, including pseudocode, as long as they don't tightly tie the solution to that language (exception: pure Java code is exempt from the second clause).

Solution

I ran your test and it gave me the following results (I used your version 3, a slightly modified version 3, and a new version):

2441400 8038
...
2441400 7890
Avg: 7872

2441400 4850
...
2441400 3990
Avg: 4165

2441400 980
...
2441400 710
Avg: 786

I first changed

return new ArrayList<String>(new TreeSet<String>(leaves));

to

Collections.sort(leaves);
return leaves;

See the question "Is it faster to add to a collection then sort it, or add to a sorted collection?"

This gave an almost 50% reduction in execution time. Note: the TreeSet removes duplicates; sorting does not.
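Since the question asks for distinct leaves, the duplicates that a plain sort leaves behind could be dropped in a single pass once the list is sorted, because equal strings end up adjacent. The following is only a minimal sketch of that idea, not part of the timings above; the class and method names are hypothetical, and it assumes the list contains no null elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedDistinct {
    // Sorts a copy of the input, then keeps each value only the first time it appears.
    // Assumption: no null elements (Collections.sort would reject them anyway).
    static List<String> sortedDistinct(List<String> input) {
        List<String> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        List<String> result = new ArrayList<>(sorted.size());
        String previous = null;
        for (String s : sorted) {
            // After sorting, duplicates are adjacent, so comparing with the previous element is enough.
            if (!s.equals(previous)) {
                result.add(s);
            }
            previous = s;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> leaves = Arrays.asList("b", "a", "c", "a", "b");
        System.out.println(sortedDistinct(leaves)); // prints [a, b, c]
    }
}
```

Whether this beats a TreeSet for a given data set would need to be measured with the test above.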

I then wrote a new iterative method combining your two methods into one and eliminating recursion altogether. I also got rid of the ArrayLists to avoid resizing and copying, which we don't need because we only iterate and never access by index.

Edit: Using an ArrayList to store the leaves increases the time from 800 ms to about 1400 ms.

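public List<String> getLeavesDeepX()
{
    // Requires java.util.{ArrayList, Collection, Collections, Deque, LinkedList, List}.
    // Worklist of nodes still to visit; replaces the recursive calls.
    final Deque<Node> nodes = new LinkedList<Node>();
    final Collection<String> leaves = new LinkedList<String>();
    //final Collection<String> leaves = new LinkedHashSet<String>(); -- use for removing dupes
    nodes.add(this);
    do
    {
        final Node current = nodes.pop();
        leaves.addAll(current.getLeaves());
        // getNodes() is the child accessor on the Node class from the question
        nodes.addAll(current.getNodes());
    }
    while (!nodes.isEmpty());

    // Copy once into an ArrayList and sort at the end
    final ArrayList<String> result = new ArrayList<String>(leaves);
    Collections.sort(result);
    return result;
}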
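For anyone who wants to try the worklist idea without the full test harness, here is a self-contained sketch. It is an illustration under assumptions rather than the exact method above: SimpleNode is a hypothetical, stripped-down stand-in for the Node class, and an ArrayDeque and an ArrayList are used purely for brevity, so the timings reported in this answer do not apply to it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class IterativeLeaves {
    // Hypothetical minimal node: child nodes plus the leaves attached directly to it.
    static class SimpleNode {
        final List<SimpleNode> nodes = new ArrayList<>();
        final List<String> leaves = new ArrayList<>();
    }

    // Visits every node exactly once using an explicit worklist instead of recursion,
    // gathers all leaves, and sorts once at the end.
    static List<String> collectSorted(SimpleNode root) {
        Deque<SimpleNode> pending = new ArrayDeque<>();
        List<String> result = new ArrayList<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            SimpleNode current = pending.pop();
            result.addAll(current.leaves);
            for (SimpleNode child : current.nodes) {
                pending.push(child);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        SimpleNode root = new SimpleNode();
        root.leaves.add("m");
        SimpleNode child = new SimpleNode();
        child.leaves.add("z");
        child.leaves.add("a");
        root.nodes.add(child);
        System.out.println(collectSorted(root)); // prints [a, m, z]
    }
}
```

The visiting order does not matter here because the result is sorted afterwards.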

I put all results into different lists and compared them at the end:

    System.out.println(Arrays.equals(leaves1.toArray(), leaves2.toArray()));
    System.out.println(Arrays.equals(leaves1.toArray(), leaves3.toArray()));
    System.out.println(Arrays.equals(leaves2.toArray(), leaves3.toArray()));

Output:

true
true
true

So at least on my system it's about a 10-fold increase in speed.

Edit2: Skipping the sort in case 3 brings it down to 140 ms, so about 600 ms is spent comparing and sorting. Any further major improvement needs to be made there.

Edit3: Eliminating recursion also has the benefit that the depth of the tree has less impact on performance. Changing the test tree to 2/2/20 (N/L/D) yields about the same number of leaves (2m), but performs much worse with recursion (>70k), while it is not much slower without (2500, up from 1200).
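If recursion is still preferred for readability, one variant that was not measured in this answer is to pass a single accumulator list down the calls, so that no intermediate lists or sets are built and merged at every level. The sketch below assumes a hypothetical SimpleNode stand-in for the Node class; it remains limited by the call stack on very deep trees.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AccumulatorLeaves {
    // Hypothetical minimal node: child nodes plus the leaves attached directly to it.
    static class SimpleNode {
        final List<SimpleNode> nodes = new ArrayList<>();
        final List<String> leaves = new ArrayList<>();
    }

    // Appends the leaves of the whole subtree to 'out'; each leaf is added exactly once.
    private static void collect(SimpleNode node, List<String> out) {
        out.addAll(node.leaves);
        for (SimpleNode child : node.nodes) {
            collect(child, out);
        }
    }

    // Gathers everything into one list, then sorts a single time.
    static List<String> collectSorted(SimpleNode root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        SimpleNode root = new SimpleNode();
        root.leaves.add("b");
        SimpleNode child = new SimpleNode();
        child.leaves.add("a");
        root.nodes.add(child);
        System.out.println(collectSorted(root)); // prints [a, b]
    }
}
```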
