Does it Make Sense to Map a Graph Data-structure into a Relational Database?

Question

Specifically, a multigraph.

A colleague suggested this and I'm completely baffled.

Any insights on this?

Recommended answer

It's pretty straightforward to store a graph in a database: you have a table for nodes, and a table for edges, which acts as a many-to-many relationship table between the nodes table and itself. Like this:

create table node (
  id integer primary key
);

create table edge (
  start_id integer references node,
  end_id integer references node,
  primary key (start_id, end_id)
);

However, there are a couple of sticky points about storing a graph this way.

Firstly, the edges in this scheme are naturally directed - the start and end are distinct. If your edges are undirected, then you will either have to be careful in writing queries, or store two entries in the table for each edge, one in each direction (and then be careful writing queries!). If you store a single edge, I would suggest normalising the stored form - perhaps always consider the node with the lower ID to be the start (and add a check constraint to the table to enforce this). You could have a genuinely unordered representation by not having the edges refer to the nodes, but rather having a join table between them, but that doesn't seem like a great idea to me.
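
The normalised single-row form described above could be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes SQLite driven through Python's standard `sqlite3` module, and the helper names `open_undirected_graph`, `add_edge` and `neighbours` are made up for the example. Note that the strict `start_id < end_id` check also rules out self-loops; `<=` would allow them.

```python
import sqlite3


def open_undirected_graph():
    """Create an in-memory database for an undirected graph (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
    conn.executescript("""
        create table node (
          id integer primary key
        );
        create table edge (
          start_id integer not null references node,
          end_id integer not null references node,
          primary key (start_id, end_id),
          check (start_id < end_id)  -- canonical form: lower ID is always the start
        );
    """)
    return conn


def add_edge(conn, a, b):
    """Store the undirected edge {a, b}, whichever order the caller passes."""
    low, high = min(a, b), max(a, b)
    conn.execute("insert into edge (start_id, end_id) values (?, ?)", (low, high))


def neighbours(conn, node_id):
    """Return the sorted IDs adjacent to node_id, looking at both columns."""
    rows = conn.execute(
        """
        select end_id from edge where start_id = ?
        union
        select start_id from edge where end_id = ?
        """,
        (node_id, node_id),
    )
    return sorted(row[0] for row in rows)
```

Even with a canonical form, every query still has to look at both columns, which is the "be careful writing queries" part; the constraint only guarantees that each undirected edge is stored once.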

Secondly, the schema above has no way to represent a multigraph. You can extend it easily enough to do so; if edges between a given pair of nodes are indistinguishable, the simplest thing would be to add a count to each edge row, saying how many edges there are between the referred-to nodes. If they are distinguishable, then you will need to add something to the edge table to allow them to be distinguished - an autogenerated edge ID might be the simplest thing.
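
As a rough sketch of the distinguishable-edges variant, the edge table can drop the `(start_id, end_id)` primary key and get its own ID instead. The example assumes SQLite via Python's `sqlite3` module, where an `integer primary key` column is filled in automatically when omitted; the helper names and the `edge_endpoints` index are illustrative only.

```python
import sqlite3


def open_multigraph():
    """Create an in-memory database for a directed multigraph (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        create table node (
          id integer primary key
        );
        create table edge (
          id integer primary key,  -- assigned by SQLite when not supplied
          start_id integer not null references node,
          end_id integer not null references node
        );
        -- Non-unique index, so parallel edges are allowed but lookups stay cheap.
        create index edge_endpoints on edge (start_id, end_id);
    """)
    return conn


def add_edge(conn, start_id, end_id):
    """Insert one edge and return its generated edge ID."""
    cur = conn.execute(
        "insert into edge (start_id, end_id) values (?, ?)", (start_id, end_id)
    )
    return cur.lastrowid


def edge_count(conn, start_id, end_id):
    """Count the parallel edges running from start_id to end_id."""
    row = conn.execute(
        "select count(*) from edge where start_id = ? and end_id = ?",
        (start_id, end_id),
    ).fetchone()
    return row[0]
```

With this layout the count-column variant is not needed separately: the multiplicity of an edge is just the number of rows sharing its endpoints.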

However, even having sorted out the storage, you have the problem of working with the graph. If you want to do all of your processing on objects in memory, and the database is purely for storage, then no problem. But if you want to do queries on the graph in the database, then you'll have to figure out how to do them in SQL, which doesn't have any inbuilt support for graphs, and whose basic operations aren't easily adapted to work with graphs. It can be done, especially if you have a database with recursive SQL support (PostgreSQL, Firebird, some of the proprietary databases), but it takes some thought. If you want to do this, my suggestion would be to post further questions about the specific queries.
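
To give a flavour of what a recursive query looks like, here is a minimal sketch of a reachability query over the `node`/`edge` schema from the start of this answer. It assumes SQLite (which also supports `with recursive`) through Python's `sqlite3` module; `open_graph` and `reachable_from` are hypothetical helpers, and the exact syntax may differ in other databases.

```python
import sqlite3


def open_graph(edges):
    """Build an in-memory directed graph from (start_id, end_id) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table node (
          id integer primary key
        );
        create table edge (
          start_id integer references node,
          end_id integer references node,
          primary key (start_id, end_id)
        );
    """)
    node_ids = sorted({n for edge in edges for n in edge})
    conn.executemany("insert into node (id) values (?)", [(n,) for n in node_ids])
    conn.executemany("insert into edge (start_id, end_id) values (?, ?)", edges)
    return conn


def reachable_from(conn, start_id):
    """Return the sorted IDs reachable from start_id, including start_id itself."""
    rows = conn.execute(
        """
        with recursive reachable (id) as (
          select ?
          union  -- union (not union all) drops repeats, so cycles terminate
          select edge.end_id
          from edge
          join reachable on edge.start_id = reachable.id
        )
        select id from reachable
        """,
        (start_id,),
    )
    return sorted(row[0] for row in rows)
```

For example, `reachable_from(open_graph([(1, 2), (2, 3), (3, 1), (4, 5)]), 1)` walks the cycle between nodes 1, 2 and 3 once and never reaches 4 or 5. Shortest paths, weights or depth limits need extra columns in the recursive part, which is where the "some thought" comes in.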
