Algorithm to decide cut-off for collapsing this tree?

Views: 112 Posted: 2020/9/21 3:11:58 Tags: python statistics cluster-analysis bioinformatics collapse

Problem description

I have a Newick tree built by comparing the similarity (Euclidean distance) of Position Weight Matrices (PWMs or PSSMs) of putative DNA regulatory motifs, which are DNA sequences 4-9 bp long.

An interactive version of the tree is up on iTol (here), which you can freely play with - just press "update tree" after setting your parameters:

My specific goal: to collapse the motifs (tips/terminal nodes/leaves) together if their average distance to the nearest parent clade is < X (ETE2 Python package). This is biologically interesting since some of the gene regulatory DNA motifs may be homologous (paralogues or orthologues) with one another. This collapsing can be done via the iTol GUI linked above, e.g. if you choose X = 0.001 then some motifs become collapsed into triangles (motif families).
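
To make the collapsing rule concrete, here is a minimal sketch that does not depend on any tree library. It is not ETE2's API: the `Node` class and the functions `leaf_distances`, `mean_leaf_distance` and `collapse` are hypothetical names introduced for illustration. It also assumes that "average distance to the nearest parent clade" means the mean branch-length distance from a clade's root to its leaves, which may differ from what iTol computes.

```python
class Node:
    """Minimal rooted-tree node (hypothetical; not the ETE2 TreeNode).

    `dist` is the branch length from this node to its parent.
    """

    def __init__(self, name=None, dist=0.0, children=None):
        self.name = name
        self.dist = dist
        self.children = children or []

    def is_leaf(self):
        return not self.children


def leaf_distances(node):
    """Return (leaf_name, distance from `node` down to that leaf) pairs."""
    if node.is_leaf():
        return [(node.name, 0.0)]
    pairs = []
    for child in node.children:
        for name, d in leaf_distances(child):
            pairs.append((name, d + child.dist))
    return pairs


def mean_leaf_distance(node):
    """Mean distance from `node` to the leaves below it (assumed criterion)."""
    pairs = leaf_distances(node)
    return sum(d for _, d in pairs) / len(pairs)


def collapse(node, x):
    """Walk top-down and collapse the largest clades whose mean leaf
    distance is below x. Returns a list of leaf-name lists (families)."""
    if node.is_leaf():
        return [[node.name]]
    if mean_leaf_distance(node) < x:
        return [sorted(name for name, _ in leaf_distances(node))]
    families = []
    for child in node.children:
        families.extend(collapse(child, x))
    return families


# Toy tree, equivalent to the Newick string
# ((A:0.001,B:0.001):0.2,(C:0.05,D:0.06):0.1);
tree = Node(children=[
    Node(dist=0.2, children=[Node("A", 0.001), Node("B", 0.001)]),
    Node(dist=0.1, children=[Node("C", 0.05), Node("D", 0.06)]),
])

families = collapse(tree, 0.01)
print(families)
```

With X = 0.01 only the tight A/B clade collapses into a family, while C and D stay separate; raising X to 0.1 would also merge C and D.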

My question: Could anybody suggest an algorithm that would either output or help visualise which value of X is appropriate for "maximizing the biological or statistical relevance" of the collapsed motifs? Ideally there would be some obvious step change in some property of the tree when plotted against X which suggests to the algorithm a sensible X. Are there any known algorithms/scripts/packages for this? Perhaps the code will plot some statistic against the value of X? I've tried plotting X vs. mean cluster size (matplotlib) but I don't see an obvious "step increase" to inform me which value of X to use:
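
One heuristic that looks for such a step change is a "largest gap" rule, similar in spirit to cutting a dendrogram where the longest stretch of X produces no new merges. The sketch below is an illustration under assumptions, not a validated statistical method: `heights` is assumed to be a list holding, for each internal clade, the value of X at which that clade would collapse, and `scan` and `largest_gap_cutoff` are hypothetical helper names. The parameters `d_from`, `d_to` and `d_step` mirror the script arguments mentioned in the question, but the script's own implementation is not reproduced here.

```python
def scan(heights, d_from, d_to, d_step):
    """For each X from d_from to d_to, count how many clades have a
    collapse height below X. Returns a list of (X, count) rows."""
    n_steps = int(round((d_to - d_from) / d_step))
    rows = []
    for i in range(n_steps + 1):
        x = d_from + i * d_step
        rows.append((x, sum(1 for h in heights if h < x)))
    return rows


def largest_gap_cutoff(heights):
    """Suggest a cut-off at the midpoint of the widest gap between
    consecutive sorted heights. Returns (cutoff, gap_width)."""
    hs = sorted(set(heights))
    if len(hs) < 2:
        raise ValueError("need at least two distinct heights")
    gaps = [(hi - lo, lo, hi) for lo, hi in zip(hs, hs[1:])]
    width, lo, hi = max(gaps)
    return (lo + hi) / 2.0, width


# Made-up collapse heights for six clades (illustrative data only).
heights = [0.001, 0.002, 0.003, 0.05, 0.055, 0.178]

rows = scan(heights, 0.0, 0.2, 0.05)
cutoff, width = largest_gap_cutoff(heights)
print(rows)
print(cutoff, width)
```

The `rows` from `scan` can be passed to matplotlib to plot the count against X; a long flat stretch in that curve corresponds to the gap that `largest_gap_cutoff` reports. On real data the widest gap often sits near the root, so it may be worth restricting `heights` to a biologically plausible range first.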

My code and data: A link to my Python script is [here][8]. I have heavily commented it, and it will generate the tree data and the plot above for you (use the arguments d_from, d_to and d_step to explore the distance cut-offs, X). If you have easy_install and Python, you can install ete2 by executing these two bash commands:

apt-get install python-setuptools python-numpy python-qt4 python-scipy python-mysqldb python-lxml

easy_install -U ete2
