Mapping arrays from csv file for tree hierarchy visualization using NodeJS


Problem description


I'm trying to show the relationship between SQL tables in visualization. I have three columns in a csv sheet (columns: Target, Source, JoinSource).

Column Target has a table name in each cell, say A1, A2, A3, ..., An.

Column Source has arrays with multiple elements. The elements have an index as a prefix. A sample array would look like: (I've changed the actual data to dummy data and all these elements are actually SQL tables)

[P1 Apple, P2 Mango, P2.1 Pluto, P3.1.1 Earth... P10 Red, P10.1 Blue, P10.1.1 Copper]

The structure of column JoinSource is similar to Source but with different elements. A sample array from JoinSource would look like:

[P3 Orange, P2.2 Charlie, P1.1 Mushroom, P7 Cyclone, P7.1 Hurricane.... P10.2 Typhoon]

Every table has an alphanumeric prefix. The letter P is just an arbitrary marker used for simplicity, so we can safely ignore it.

The numerical prefixes 1, 2, 2.1, 10.1.1 denote the relationship between tables. If it's a whole number then it is directly connected to the table in column Target. If there's a decimal then it is directly connected to the table either in Source or JoinSource.

Put simply, A1 is the parent table: P1 Apple is the child of A1, and P1.1 Mushroom is the child of P1 Apple.

Similarly, P10 Red is the child of A1; P10.1 Blue is the child of P10 Red; P10.1.1 Copper is the child of P10.1 Blue.

The parent/child relationship depends on the number of decimal places in the index. If there's no decimal it's straightforward: the table is a child of the Target table. If there's one decimal then it's connected to the table whose prefix is the same whole number; if there are two decimals then it's connected to the table whose prefix is the same whole number plus the first decimal.
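
The rule above amounts to dropping the last dot-separated segment of an index. A minimal sketch of that idea, where `parentOf` is a hypothetical helper name and `null` is an assumed convention meaning "attach directly to the Target table":

```javascript
// Hypothetical helper: derive the parent index from a dotted index.
// "P10.1.1" -> "P10.1", "P10.1" -> "P10", "P10" -> null (child of Target).
function parentOf(index) {
  const lastDot = index.lastIndexOf('.');
  return lastDot === -1 ? null : index.slice(0, lastDot);
}

console.log(parentOf('P10.1.1')); // P10.1
console.log(parentOf('P10'));     // null
```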


I hope the above explanation is clear. Now I need to use some logic in NodeJS (for loops, if statements, etc.) to build the parent-child mapping of the tables. Any help is much appreciated.


The data from csv sheet would look like this.

|------------|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
|    Target  |                           Source                                 |           JoinSource                                                                                                |
|------------|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
| Fenugreek  | P8 Sirocco, P8.1 Merlin, P9.1 Cancun, P10.1 Force, P11.2 Torque  | P1 Tiger, P2 Lion, P3 Train, P4 Giraffe, P5 Bear, P6 Javelin, P7 Mingo, P8 Mavue, P9 Violet, P10 Jupiter, P11 Pluto |
|------------|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------| 
| Chernobyl  | P1 Moro, P2 Cher, P2.1 Rona, P2.2 Mason, P3 Tonga, P4.1 Nagatom  | P1.1 Eba, P2.3 Van, P3.1 Gomin, P4 Evaum, P4.2 Hun                                                                  |
|------------|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|


One thing to note from the above table: there is a P1 in the first row and another P1 in the second row. These two are different. Each row is independent of the others, and each row gets its own visualization.
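
Because each row stands alone, one possible approach is to process rows one at a time and merge Source and JoinSource into a single flat list per row. The sketch below assumes a csv parser has already produced an object with string fields `Target`, `Source` and `JoinSource`; `rowToNodes` and the `'ROOT'` id are hypothetical names chosen for illustration. It also assumes no index repeats within a row, which the first sample row violates (P8 appears in both columns), so that case would need a rule of its own.

```javascript
// Sketch: turn one parsed csv row into a flat {id, parentId, name} list.
// Assumption: entries look like "P2.1 Rona" and indexes are unique per row.
function rowToNodes(row) {
  const entries = `${row.Source}, ${row.JoinSource}`
    .split(',')
    .map(s => s.trim())
    .filter(s => s.length > 0);

  // The Target table becomes the root of this row's tree.
  const nodes = [{ id: 'ROOT', parentId: null, name: row.Target }];
  for (const entry of entries) {
    const space = entry.indexOf(' ');
    if (space === -1) continue;            // skip entries without a name
    const id = entry.slice(0, space);      // e.g. "P2.1"
    const name = entry.slice(space + 1);   // e.g. "Rona"
    const lastDot = id.lastIndexOf('.');
    // Whole-number indexes hang off the row's Target table.
    const parentId = lastDot === -1 ? 'ROOT' : id.slice(0, lastDot);
    nodes.push({ id, parentId, name });
  }
  return nodes;
}

const nodes = rowToNodes({
  Target: 'Chernobyl',
  Source: 'P1 Moro, P2 Cher, P2.1 Rona, P2.2 Mason, P3 Tonga, P4.1 Nagatom',
  JoinSource: 'P1.1 Eba, P2.3 Van, P3.1 Gomin, P4 Evaum, P4.2 Hun',
});
console.log(nodes.length); // 12
```

In this row P4.1 Nagatom comes from Source while its parent P4 Evaum comes from JoinSource, which is why the two columns are merged before the parents are resolved.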

I also need the table names in the visualization, not the indexes. The indexes are only for mapping purposes. For instance, nodes in the tree diagram should read Apple, Pluto, Earth, etc., not P1, P2.1, P3.1.1.


The final visualization output should be something like this.

Solution

Note: your original sample data had missing nodes in the tree hierarchy, which I've filled in manually.
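
One way to spot such gaps before building the tree is to check that every ancestor of a dotted index is present. A sketch, with `findMissingParents` as a hypothetical helper name, run on the indexes from the sample Source array in the question:

```javascript
// Hypothetical helper: list ancestor indexes that are referenced but absent.
function findMissingParents(ids) {
  const present = new Set(ids);
  const missing = new Set();
  for (const id of ids) {
    let current = id;
    // Walk up the chain: "P3.1.1" -> "P3.1" -> "P3".
    while (current.includes('.')) {
      current = current.slice(0, current.lastIndexOf('.'));
      if (!present.has(current)) missing.add(current);
    }
  }
  return [...missing];
}

// Indexes from the sample Source array in the question.
const sampleIds = ['P1', 'P2', 'P2.1', 'P3.1.1', 'P10', 'P10.1', 'P10.1.1'];
console.log(findMissingParents(sampleIds)); // [ 'P3.1', 'P3' ]
```

The two reported indexes correspond to the P3 and P3.1 entries that were added by hand to the demo data.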

I've transformed the data into the form {id, parentId, name}, so that P3.1.1 emits {id: "P3.1.1", parentId: "P3.1", name: "Earth"}. This can be fed to d3.stratify, which will build the hierarchy for you. I also added a {id: "P", name: "Target"} for the root node.
d3-stratify:
https://github.com/d3/d3-hierarchy/blob/master/README.md#stratify
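
If the mapping has to run in plain NodeJS without d3, the same linking step can be sketched by hand. This is not d3.stratify itself, only an illustration of the idea; `buildTree` is a hypothetical name, and the nested {name, children} output shape is an assumption chosen because it is a common input shape for tree renderers.

```javascript
// Sketch of a stratify-style step written without d3:
// link flat {id, parentId, name} records into nested {name, children} nodes.
function buildTree(records) {
  const byId = new Map(
    records.map(r => [r.id, { name: r.name, children: [] }])
  );
  let root = null;
  for (const r of records) {
    const node = byId.get(r.id);
    if (r.parentId === undefined || r.parentId === null) {
      root = node;                         // the record without a parent
    } else {
      const parent = byId.get(r.parentId);
      if (!parent) throw new Error(`missing parent: ${r.parentId}`);
      parent.children.push(node);
    }
  }
  return root;
}

const tree = buildTree([
  { id: 'P', name: 'Target' },
  { id: 'P3', parentId: 'P', name: 'PewPewPew' },
  { id: 'P3.1', parentId: 'P3', name: 'Chopper' },
  { id: 'P3.1.1', parentId: 'P3.1', name: 'Earth' },
]);
console.log(JSON.stringify(tree, null, 2));
```

Only the names end up in the nested output, which matches the requirement that the diagram shows table names rather than indexes.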
demo

To build the SVG tree I used the d3-hierarchy demo with a few adjustments: label now extracts the name property by default instead of id, and the dimensions are embedded directly in the function.

It uses the d3 Tidy Tree demo to render a d3-hierarchy.

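// Strip the surrounding brackets, split into "P<index> <name>" entries,
// and capture [id, parentId, name] from each entry.
const data = `[P1 Apple, P2 Mango, P2.1 Pluto, P3.1.1 Earth, P10 Red, P10.1 Blue, P10.1.1 Copper, P3 PewPewPew, P3.1 Chopper, P3.2 Twenty, P3.1.2 Two]`
  .slice(1, -1).split(', ')
  .map(x => x.match(/^((.*?)(?:\.)?(?:\d*)?) (.*)$/).slice(1))
  .map(([id, parentId, name]) => ({id, parentId, name}));

// Root node: every whole-number index (P1, P2, ...) resolves to parentId "P".
data.push({id: 'P', name: 'Target'});

// d3.stratify builds the hierarchy from the id/parentId pairs.
document.body.appendChild(graph(d3.stratify()(data)));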

function graph(root, {
  label = d => d.data.name, 
  highlight = () => false,
  marginLeft = 40
} = {}) {
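  // Layout constants, declared locally instead of leaking as implicit globals.
  const width = 500;
  const dx = 12;
  const dy = 120;
  const treeLink = d3.linkHorizontal().x(d => d.y).y(d => d.x);
  const tree = d3.tree().nodeSize([dx, dy]);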
  root = tree(root);

  let x0 = Infinity;
  let x1 = -x0;
  root.each(d => {
    if (d.x > x1) x1 = d.x;
    if (d.x < x0) x0 = d.x;
  });

  const svg = d3.create("svg")
      .attr("viewBox", [0, 0, width, x1 - x0 + dx * 2])
      .style("overflow", "visible");
  
  const g = svg.append("g")
      .attr("font-family", "sans-serif")
      .attr("font-size", 10)
      .attr("transform", `translate(${marginLeft},${dx - x0})`);
    
  const link = g.append("g")
    .attr("fill", "none")
    .attr("stroke", "#555")
    .attr("stroke-opacity", 0.4)
    .attr("stroke-width", 1.5)
  .selectAll("path")
    .data(root.links())
    .join("path")
      .attr("stroke", d => highlight(d.source) && highlight(d.target) ? "red" : null)
      .attr("stroke-opacity", d => highlight(d.source) && highlight(d.target) ? 1 : null)
      .attr("d", treeLink);
  
  const node = g.append("g")
      .attr("stroke-linejoin", "round")
      .attr("stroke-width", 3)
    .selectAll("g")
    .data(root.descendants())
    .join("g")
      .attr("transform", d => `translate(${d.y},${d.x})`);

  node.append("circle")
      .attr("fill", d => highlight(d) ? "red" : d.children ? "#555" : "#999")
      .attr("r", 2.5);

  node.append("text")
      .attr("fill", d => highlight(d) ? "red" : null)
      .attr("dy", "0.31em")
      .attr("x", d => d.children ? -6 : 6)
      .attr("text-anchor", d => d.children ? "end" : "start")
      .text(label)
    .clone(true).lower()
      .attr("stroke", "white");
  
  return svg.node();
}

<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/5.16.0/d3.js" integrity="sha256-LHLWSn9RC2p119R1eT2pO3Om+Ir2G0kTZOJmWQ2//pw=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3-array/1.2.2/d3-array.js" integrity="sha256-flJtpBHeLvoTQmeFnm0UuGrCFMGQbK6yrLhaNHyX8kk=" crossorigin="anonymous"></script>
