scipy dendrogram to json for d3.js tree visualisation


Question

I am trying to convert the results of scipy hierarchical clustering into JSON for display in d3.js. Here is an example.

The following code produces a dendrogram with 6 branches.

import numpy as np
import pandas as pd
import scipy.spatial
import scipy.cluster

d = {'employee': ['A', 'B', 'C', 'D', 'E', 'F'],
     'skillX': [2, 8, 3, 6, 8, 10],
     'skillY': [8, 15, 6, 9, 7, 10]}

d1 = pd.DataFrame(d)

# Condensed pairwise Euclidean distances between the employees' skill vectors
distMat = scipy.spatial.distance.pdist(np.array(d1[['skillX', 'skillY']]), 'euclidean')
# Single-linkage hierarchical clustering on those distances
clusters = scipy.cluster.hierarchy.linkage(distMat, method='single')
dendo = scipy.cluster.hierarchy.dendrogram(clusters, labels=list(d1.employee), orientation='right')

dendo

My question: how can I represent the data in a JSON file in a format that d3.js understands?

{'name': 'Root1',
 'children': [{'name': 'B'},
              {'name': 'E-D-F-C-A',
               'children': [{'name': 'C-A',
                             'children': [{'name': 'A'},
                                          {'name': 'C'}]
                            }]
              }
             ]
}

The embarrassing truth is that I do not know whether I can extract this information from the dendrogram or from the linkage matrix, or how.

I am thankful for any help I can get.

Edit to clarify

So far, I have tried to use the to_tree method but have difficulty understanding its structure (yes, I read the documentation).

a = scipy.cluster.hierarchy.to_tree(clusters, rd=True)

# With rd=True, to_tree returns (root, nodelist); a[1] is the list of all nodes
for x in a[1]:
    if not x.is_leaf():
        print(x.get_left().get_id(), x.get_right().get_id(), x.get_count())

Recommended answer

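You can do this in three steps:

  1. Recursively construct a nested dictionary that represents the tree returned by Scipy's to_tree method.
  2. Iterate through the nested dictionary to label each internal node with the leaves in its subtree.
  3. dump the resulting nested dictionary to JSON and load it into d3.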

Construct a nested dictionary representing the dendrogram

For the first step, it is important to call to_tree with rd=False so that only the root of the dendrogram is returned. From that root, you can construct the nested dictionary as follows:

# Create a nested dictionary from the ClusterNode's returned by SciPy
def add_node(node, parent ):
    # First create the new node and append it to its parent's children
    newNode = dict( node_id=node.id, children=[] )
    parent["children"].append( newNode )

    # Recursively add the current node's children
    if node.left: add_node( node.left, newNode )
    if node.right: add_node( node.right, newNode )

T = scipy.cluster.hierarchy.to_tree( clusters , rd=False )
d3Dendro = dict(children=[], name="Root1")
add_node( T, d3Dendro )
# Output: => {'name': 'Root1', 'children': [{'node_id': 10, 'children': [{'node_id': 1, 'children': []}, {'node_id': 9, 'children': [{'node_id': 6, 'children': [{'node_id': 0, 'children': []}, {'node_id': 2, 'children': []}]}, {'node_id': 8, 'children': [{'node_id': 5, 'children': []}, {'node_id': 7, 'children': [{'node_id': 3, 'children': []}, {'node_id': 4, 'children': []}]}]}]}]}]}

The basic idea is to start with a node that is not in the dendrogram and that will serve as the root of the whole dendrogram. Then we recursively add left and right children to this dictionary until we reach the leaves. At this point, we do not have labels for the nodes, so I'm just labeling nodes by their ClusterNode ID.
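To make the ClusterNode ids less opaque, here is a small sketch (not part of the original answer) of how each row of the linkage matrix creates a new node. It assumes the six points from the question; the names points, Z and merges are chosen only for illustration. SciPy numbers the leaves 0..n-1 and gives the cluster formed by row i the id n + i, so the root returned by to_tree has id 2n - 2.

```python
import numpy as np
import scipy.cluster.hierarchy as hier
import scipy.spatial.distance as dist

# Same six points as in the question (skillX, skillY for employees A..F)
points = np.array([[2, 8], [8, 15], [3, 6], [6, 9], [8, 7], [10, 10]])
Z = hier.linkage(dist.pdist(points, 'euclidean'), method='single')

n = len(points)  # number of leaves; leaf ids are 0..n-1
root, nodes = hier.to_tree(Z, rd=True)

# Row i of the linkage matrix creates the cluster with id n + i
merges = []
for i, (left, right, height, count) in enumerate(Z):
    merges.append((n + i, int(left), int(right), int(count)))
    print(f"node {n + i}: merges {int(left)} and {int(right)} "
          f"at distance {height:.2f} ({int(count)} leaves)")

# The root is the cluster created by the last merge
print("root id:", root.get_id())
```

Reading the printed rows next to the nested dictionary above shows where ids such as 6 or 10 come from: they are internal clusters, not rows of the original data.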

Label the dendrogram

Next, we need to use the node_ids to label the dendrogram. The comments should be enough to explain how this works.

from functools import reduce  # reduce is no longer a builtin in Python 3

# Map leaf ids (0..n-1) to their labels. id2name is not defined in the
# original snippet; it is assumed here to follow the row order of d1.
id2name = dict(enumerate(d1.employee))

# Label each node with the names of each leaf in its subtree
def label_tree(n):
    # If the node is a leaf, then we have its name
    if len(n["children"]) == 0:
        leafNames = [id2name[n["node_id"]]]

    # If not, flatten all the leaves in the node's subtree
    else:
        leafNames = reduce(lambda ls, c: ls + label_tree(c), n["children"], [])

    # Delete the node id since we don't need it anymore and
    # it makes for cleaner JSON
    del n["node_id"]

    # Labeling convention: "-"-separated leaf names
    n["name"] = "-".join(sorted(map(str, leafNames)))

    return leafNames

label_tree(d3Dendro["children"][0])
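As an alternative sketch (not the method above), the build and label steps can be folded into a single recursive pass by asking each ClusterNode for its leaves with pre_order(). The helper name to_d3 and the names list are assumptions made for illustration. Unlike the version above, leaves get no "children" key here, which is closer to the format asked for in the question.

```python
import numpy as np
import scipy.cluster.hierarchy as hier
import scipy.spatial.distance as dist

def to_d3(node, leaf_names):
    """Return a d3-style dict for a ClusterNode, labeled in one pass."""
    if node.is_leaf():
        return {"name": str(leaf_names[node.get_id()])}
    # pre_order() returns the ids of all leaves under this node
    leaves = [str(leaf_names[i]) for i in node.pre_order()]
    return {
        "name": "-".join(sorted(leaves)),
        "children": [to_d3(node.get_left(), leaf_names),
                     to_d3(node.get_right(), leaf_names)],
    }

# Same data as in the question
names = ['A', 'B', 'C', 'D', 'E', 'F']
points = np.array([[2, 8], [8, 15], [3, 6], [6, 9], [8, 7], [10, 10]])
Z = hier.linkage(dist.pdist(points, 'euclidean'), method='single')

# Wrap the dendrogram root in the extra "Root1" node, as above
tree = {"name": "Root1", "children": [to_d3(hier.to_tree(Z), names)]}
print(tree["children"][0]["name"])
```

Calling pre_order() at every internal node repeats some work, which is negligible for a handful of leaves but worth keeping in mind for large trees.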

Dump to JSON and load into D3

Finally, after the dendrogram has been labeled, we just need to output it to JSON and load it into D3. I'm pasting the Python code that dumps it to JSON here for completeness.

import json
# Output to JSON; the with-block closes the file once it is written
with open("d3-dendrogram.json", "w") as f:
    json.dump(d3Dendro, f, sort_keys=True, indent=4)
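Before handing the file to D3, it can help to check that the JSON round-trips and still has the expected name/children shape. The sketch below uses a small hand-written tree (sample) and a hypothetical helper leaf_names, both assumed for illustration rather than taken from the answer.

```python
import json

# Hypothetical small tree in the same {"name", "children"} shape
sample = {"name": "Root1", "children": [
    {"name": "A-C", "children": [{"name": "A", "children": []},
                                  {"name": "C", "children": []}]},
    {"name": "B", "children": []},
]}

def leaf_names(node):
    """Collect the names of all leaves (nodes without children)."""
    kids = node.get("children", [])
    if not kids:
        return [node["name"]]
    return [name for kid in kids for name in leaf_names(kid)]

# Serialize with the same options as above, then parse it back
text = json.dumps(sample, sort_keys=True, indent=4)
restored = json.loads(text)
print(leaf_names(restored))
```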

Output

I created Scipy and D3 versions of the dendrogram below. For the D3 version, I simply plugged the JSON file I output ('d3-dendrogram.json') into this Gist.

SciPy dendrogram

D3 dendrogram
