枚举二叉树的算法改进 [英] Algorithm improvement for enumerating binary trees

查看:106
本文介绍了枚举二叉树的算法改进的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

目前,我可以列举已扎根 未标记使用以下蛮力的Prolog代码生成二叉树.

Currently I can enumerate rooted planar unlabeled binary trees using the following brute-force Prolog code.

e --> u | b | t.
u --> ['[op(u),['], e, [']]'].
b --> ['[op(b),['], e, [','], e, [']]'].
t --> ['number(n)'].

注意:请参见下面的输出清单.

Note: See output listings below.

并使用

es(S) :-
    length(Ls, _),
    phrase(e, Ls),     % or e(Ls,[]),
    atomic_list_concat(Ls,S).

但这是效率低下的蛮力算法.

However this is inefficient brute-force algorithm.

是否有一种更有效的算法来枚举有根的平面未标记二叉树?

Is there a more efficient algorithm for enumerating rooted planar unlabeled binary trees?

注意:可以通过使用前两次迭代中的树,考虑斐波那契数并添加一元分支或二进制分支来生成树,但是这会导致树重复.我自己可以做那个版本,我要寻找的是一种算法,该算法可以在第一次没有重复的情况下以高效的方式生成树.

Note: The trees can be generated by using the trees from the previous two iterations, think Fibonacci numbers, and adding either a unary branch or a binary branch, however this leads to duplicate trees. I myself could do that version, what I am looking for is an algorithm that generates the trees in an efficient manner the first time without duplicates.

注意:二叉树也称为二叉表达式树 K进制树,K <= 2.

Note: A binary tree is also known as a binary expression tree or a K-ary tree with K <=2.

我的M(15)暴力破解版本花了1小时27分钟, 而M(15)的高效版本大约需要2秒钟.

My brute-force version for M(15) took 1 hr and 27 minutes, while the efficient version for M(15) took about 2 seconds.

显然,高效的算法是如此,效率更高,而且我为什么要问这个问题.

Obviously the efficient algorithm is just that, much more efficient and why I asked the question.

具有根未标记的平面二叉树的N个节点的树的数目由Motzkin数给出.请参阅:OEIS A001006

The number of trees that have N nodes for a rooted planar unlabeled binary trees is given by Motzkin numbers. See: OEIS A001006

Nodes  Trees
1      1
2      1
3      2
4      4
5      9

加泰罗尼亚语数字给出了具有根的平面未标记二叉树的N个内部节点的树的数量.有一种更有效的算法,可以使用加泰罗尼亚语数字生成有根的平面二叉树.

The number of trees that have N internal nodes for a rooted planar unlabeled binary tree is given by Catalan numbers. There is a more efficient algorithm for generating rooted planar binary trees using Catalan numbers.

注意:
基于加泰罗尼亚语数字的树的数量不具有一元分支,并且仅计数内部个节点.

Note:
The number of trees based on Catalan numbers do not have unary branches and count only internal nodes.

同时

基于Motzkin数的树数具有一元分支,并计数所有个节点.

the number of trees based on Motzkin numbers do have unary branches and count all nodes.

请参阅:
OEIS A000108
汤姆·戴维斯(Tom Davis)的加泰罗尼亚语

将Prolog列表元素与Motzkin编号相关

See:
OEIS A000108
Catalan Numbers by Tom Davis

% M is Motzkin number, N is number of  list elements passed to atomic_list_concat\2
m_to_n(1,1).
m_to_n(2,3).
m_to_n(M,N) :-
    M > 2,
    N is (M*2)-1.

es_m(M,S) :-
    m_to_n(M,N),
    length(Ls,N),
    e(Ls,[]),
    atomic_list_concat(Ls,S).

暴力版本无效的统计信息

es_c(M,Count) :-
    aggregate_all(count, es_m(M,_), Count).

?- time(es_c(1,Count)).
% 57 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
Count = 1.

?- time(es_c(2,Count)).
% 141 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
Count = 1.

?- time(es_c(3,Count)).
% 571 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
Count = 2.

?- time(es_c(4,Count)).
% 2,740 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
Count = 4.

?- time(es_c(5,Count)).
% 13,780 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)
Count = 9.

?- time(es_c(6,Count)).
% 70,072 inferences, 0.000 CPU in 0.002 seconds (0% CPU, Infinite Lips)
Count = 21.

?- time(es_c(7,Count)).
% 357,358 inferences, 0.016 CPU in 0.012 seconds (136% CPU, 22870912 Lips)
Count = 51.

?- time(es_c(8,Count)).
% 1,824,082 inferences, 0.063 CPU in 0.056 seconds (111% CPU, 29185312 Lips)
Count = 127.

?- time(es_c(9,Count)).
% 9,313,720 inferences, 0.297 CPU in 0.290 seconds (102% CPU, 31372531 Lips)
Count = 323.

?- time(es_c(10,Count)).
% 47,561,878 inferences, 1.469 CPU in 1.467 seconds (100% CPU, 32382555 Lips)
Count = 835.

?- time(es_c(11,Count)).
% 242,896,160 inferences, 7.672 CPU in 7.665 seconds (100% CPU, 31660599 Lips)
Count = 2188.

?- time(es_c(12,Count)).
% 1,240,493,974 inferences, 38.797 CPU in 38.841 seconds (100% CPU, 31974069 Lips)
Count = 5798.

?- time(es_c(13,Count)).
% 6,335,410,822 inferences, 206.047 CPU in 213.116 seconds (97% CPU, 30747425 Lips)
Count = 15511.

?- time(es_c(14,Count)).
% 32,356,235,848 inferences, 1016.156 CPU in 1018.955 seconds (100% CPU, 31841792 Lips)
Count = 41835.

?- time(es_c(15,Count)).
% 165,250,501,417 inferences, 5231.766 CPU in 5268.363 seconds (99% CPU, 31585991 Lips)
Count = 113634.

参考文献

分析组合" 菲利普·弗拉霍莱特(Philippe Flajolet)和罗伯特·塞奇威克(Robert Sedgewick)

References

A free downloadable book as a pdf that might help is "Analytic Combinatorics" by Philippe Flajolet and Robert Sedgewick

另请参见加泰罗尼亚语标记中的参考.

Also see the references in the Catalan tag.

Motzkin数字

<expression> ::= 
      <unary expression>
    | <binary expression>
    | <terminal>

<unary expression> ::= 
    "(u" <expression> ")"

<binary expression> ::= 
    "(b" <expression> " " <expression> ")"

<terminal> ::= 
    "t"

使用David Eisenstat的答案

把这些当作便条或面包屑,以防万一我忘记了很多个月后又需要再次使用.

Using answer by David Eisenstat

Think of these as notes, or breadcrumbs, for me in case I need to use this again in many months after I forget it.

为了测试答案,我使用了 WSL (适用于Windows子系统Linux),安装了Python 3

To test the answer I used WSL (Windows Subsystem for Linux) with Python 3 installed

使用Windows 10,我在目录

Using Windows 10 I created a file named motzkin.py in the directory

C:\Users\Eric\Documents\Prolog

使用Python代码

def ubtrees(n):
    if n == 1:
        yield 't'
    elif n > 1:
        for t in ubtrees(n - 1):
            yield '(u {})'.format(t)
        for i in range(1, n - 1):
            for t1 in ubtrees(i):
                for t2 in ubtrees(n - 1 - i):
                    yield '(b {} {})'.format(t1, t2)

然后在WSL中,我创建了一个指向Windows Prolog目录的符号链接

then in WSL I created a symbolic link to the Windows Prolog directory

$ ln -s "/mnt/c/Users/Eric/Documents/Prolog" /home/eric/Prolog

并更改为WSL Prolog目录

and changed to the WSL Prolog directory

$ cd Prolog

然后启动Python3

then started Python3

~/Prolog$ python3

并导入Python代码

and imported the Python code

>>> import motzkin

并使用ubtrees作为Motzkin数的参数运行以下命令

and ran the following with the argument to ubtrees being the Motzkin number

>>> for value in ubtrees(1):
...   print(value)
...
t

>>> for value in ubtrees(2):
...   print(value)
...
(u t)

>>> for value in ubtrees(3):
...   print(value)
...
(u (u t))
(b t t)

>>> for value in ubtrees(4):
...   print(value)
...
(u (u (u t)))
(u (b t t))
(b t (u t))
(b (u t) t)

>>> for value in ubtrees(5):
...   print(value)
...
(u (u (u (u t))))
(u (u (b t t)))
(u (b t (u t)))
(u (b (u t) t))
(b t (u (u t)))
(b t (b t t))
(b (u t) (u t))
(b (u (u t)) t)
(b (b t t) t)

并检查Motzkin号码

and to check the Motzkin numbers

def m_count(m):
    count = sum(1 for x in ubtrees(m))
    print("Count: ", count)

>>> m_count(1)
Count:  1
>>> m_count(2)
Count:  1
>>> m_count(3)
Count:  2
>>> m_count(4)
Count:  4
>>> m_count(5)
Count:  9
>>> m_count(6)
Count:  21
>>> m_count(7)
Count:  51
>>> m_count(8)
Count:  127
>>> m_count(9)
Count:  323
>>> m_count(10)
Count:  835
>>> m_count(11)
Count:  2188
>>> m_count(12)
Count:  5798
>>> m_count(13)
Count:  15511
>>> m_count(14)
Count:  41835
>>> m_count(15)
Count:  113634

要退出交互式Python,请使用

To exit interactive Python use

quit()

重复说明

我了解Motzkin数的方法是用笔和纸手工枚举树木,并使用向先前的树木M(N-1)添加一元分支和向先前的树木添加二进制分支的方法找到重复的树木M(N-2)棵树.

Notes on duplicates

The way I learned about Motzkin numbers was by hand enumerating the trees with pen and paper and finding a duplicate by using the method of adding a unary branch to the preceding trees M(N-1) and binary branches to the preceding preceding M(N-2) trees.

这棵树是从M(4)个树中为M(5)生成两次的.

This one tree was generated twice for M(5) from M(4) trees

(b (u t) (u t))

第一个,向其中添加一元分支

the first by adding a unary branch to

(b (u t) t)

第二步,向其中添加一元分支

and second by adding a unary branch to

(b t (u t))

这样做之后,我得到了与1,2,4,9,21一起使用的数字序列,该序列与

After doing this I had the sequence of numbers 1,2,4,9,21 which I used with a OEIS search and the top result was A001006 for Motzkin numbers. Once I had the larger list of Motzkin numbers I used the Prolog code to generate the counts for the larger input values and they all agreed. Now you can add OEIS to your programming tool box with a valid example to demonstrate to others.

如果您已读过本文,那么您可能会发现这是一个更大的问题的一部分,该问题首先在Prolog中构建,该系统可以使用术语重写来解决数学表达式直至基本演算,但更重要的是显示所采取的步骤.因此,这成为生成用作测试用例的二进制表达式树的方式的一部分.下一步是能够分别设置一元和二元节点的数量,而不是通过Motzkin数来固定它们.我仅使用Motzkin编号来验证我是否正确生成了组合的子集.现在我有了模式,我可以对其进行修改,以接受一个参数表示一元节点数,另一个参数表示二进制节点数.请参阅:如何使用具有CLP(FD)和多个约束的DCG枚举组合

If you have read this far then you probably see that this is part of a bigger problem which is building a system first in Prolog that can use term rewriting to solve math expressions up to basic calculus but more importantly show the steps taken. So this gets one part of the way toward generating binary expression trees to be used as test cases. The next step is to be able to set individually the number of nodes that are unary and binary instead of having them fixed by the Motzkin number. I only used the Motzkin numbers to verify that I was generating a subset of the combinations correctly. Now that I have the pattern for that I can modify it to accept one argument for the number of unary nodes and one for the binary nodes. See: How to enumerate combinations using DCGs with CLP(FD) and multiple constraints

只有当我被卡住时,我才会问与此有关的问题,所以不要指望看到所有必要的面包屑.

Only when I get stuck will I ask a questions related to this, so don't expect to see all of the necessary bread crumbs.

?- length(Ls, N), phrase(e, Ls).
Ls = ['number(0)'],
N = 1 ;
Ls = ['[op(u),[', 'number(0)', ']]'],
N = 3 ;
Ls = ['[op(u),[', '[op(u),[', 'number(0)', ']]', ']]'],
N = 5 ;
Ls = ['[op(b),[', 'number(0)', ',', 'number(0)', ']]'],
N = 5 ;
Ls = ['[op(u),[', '[op(u),[', '[op(u),[', 'number(0)', ']]', ']]', ']]'],
N = 7 ;
Ls = ['[op(u),[', '[op(b),[', 'number(0)', ',', 'number(0)', ']]', ']]'],
N = 7 ;
Ls = ['[op(b),[', '[op(u),[', 'number(0)', ']]', ',', 'number(0)', ']]'],
N = 7 ;
Ls = ['[op(b),[', 'number(0)', ',', '[op(u),[', 'number(0)', ']]', ']]'],
N = 7 ;


?- es(S).
S = 'number(0)' ;
S = '[op(u),[number(0)]]' ;
S = '[op(u),[[op(u),[number(0)]]]]' ;
S = '[op(b),[number(0),number(0)]]' ;
S = '[op(u),[[op(u),[[op(u),[number(0)]]]]]]' ;
S = '[op(u),[[op(b),[number(0),number(0)]]]]' ;
S = '[op(b),[[op(u),[number(0)]],number(0)]]' ;
S = '[op(b),[number(0),[op(u),[number(0)]]]]' ;
S = '[op(u),[[op(u),[[op(u),[[op(u),[number(0)]]]]]]]]' ;
S = '[op(u),[[op(u),[[op(b),[number(0),number(0)]]]]]]' ;
S = '[op(u),[[op(b),[[op(u),[number(0)]],number(0)]]]]' ;
S = '[op(u),[[op(b),[number(0),[op(u),[number(0)]]]]]]' ;
S = '[op(b),[[op(u),[[op(u),[number(0)]]]],number(0)]]' ;
S = '[op(b),[[op(u),[number(0)]],[op(u),[number(0)]]]]' ;
S = '[op(b),[[op(b),[number(0),number(0)]],number(0)]]' ;
S = '[op(b),[number(0),[op(u),[[op(u),[number(0)]]]]]]' ;
S = '[op(b),[number(0),[op(b),[number(0),number(0)]]]]' ;


?- es_m(1,E).
E = 'number(n)' ;
false.

?- es_m(2,E).
E = '[op(u),[number(n)]]' ;
false.

?- es_m(3,E).
E = '[op(u),[[op(u),[number(n)]]]]' ;
E = '[op(b),[number(n),number(n)]]' ;
false.

?- es_m(4,E).
E = '[op(u),[[op(u),[[op(u),[number(n)]]]]]]' ;
E = '[op(u),[[op(b),[number(n),number(n)]]]]' ;
E = '[op(b),[[op(u),[number(n)]],number(n)]]' ;
E = '[op(b),[number(n),[op(u),[number(n)]]]]' ;
false.

?- es_m(5,E).
E = '[op(u),[[op(u),[[op(u),[[op(u),[number(n)]]]]]]]]' ;
E = '[op(u),[[op(u),[[op(b),[number(n),number(n)]]]]]]' ;
E = '[op(u),[[op(b),[[op(u),[number(n)]],number(n)]]]]' ;
E = '[op(u),[[op(b),[number(n),[op(u),[number(n)]]]]]]' ;
E = '[op(b),[[op(u),[[op(u),[number(n)]]]],number(n)]]' ;
E = '[op(b),[[op(u),[number(n)]],[op(u),[number(n)]]]]' ;
E = '[op(b),[[op(b),[number(n),number(n)]],number(n)]]' ;
E = '[op(b),[number(n),[op(u),[[op(u),[number(n)]]]]]]' ;
E = '[op(b),[number(n),[op(b),[number(n),number(n)]]]]' ;
false.

推荐答案

在Python 3中:

In Python 3:

def ubtrees(n):
    if n == 1:
        yield 't'
    elif n > 1:
        for t in ubtrees(n - 1):
            yield '(u {})'.format(t)
        for i in range(1, n - 1):
            for t1 in ubtrees(i):
                for t2 in ubtrees(n - 1 - i):
                    yield '(b {} {})'.format(t1, t2)

此递归枚举过程与递归相对应

This recursive enumeration procedure corresponds to the recurrence

M_1 = 1
M_n = M_{n-1} + sum_{i=1}^{n-2} M_i M_{n-1-i},

(与 Wikipedia 中给出的重复次数相比,移了一个

which is shifted from the recurrence given in Wikipedia by one.

这篇关于枚举二叉树的算法改进的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆