What are the options for storing hierarchical data in a relational database?
Good Overviews
Generally speaking you're making a decision between fast read times (for example, nested set) or fast write times (adjacency list). Usually you end up with a combination of the options below that best fit your needs. The following provides some in depth reading:
- One more Nested Intervals vs. Adjacency List comparison: the best comparison of Adjacency List, Materialized Path, Nested Set and Nested Interval I've found.
- Models for hierarchical data: slides with good explanations of tradeoffs and example usage
- Representing hierarchies in MySQL: very good overview of Nested Set in particular
- Hierarchical data in RDBMSs (http://troels.arvin.dk/db/rdbms/links/#hierarchical): the most comprehensive and well organized set of links I've seen, but not much in the way of explanation
Options
Ones I am aware of and general features:
- Adjacency List:
- Columns: ID, ParentID
- Easy to implement.
- Cheap node moves, inserts, and deletes.
- Expensive to find level (can store as a computed column), ancestry & descendants (Bridge Table combined with level column can solve), path (Lineage Column can solve).
- Use Common Table Expressions in those databases that support them to traverse.
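
As a rough illustration of traversing an Adjacency List with a Common Table Expression, here is a minimal sketch using Python's built-in sqlite3 module. The table name `node`, its columns, and the sample rows are assumptions made for the example, not part of the answer above; other databases spell the recursive `WITH` clause slightly differently.

```python
import sqlite3

# Hypothetical schema: one row per node, each pointing at its parent.
adj_conn = sqlite3.connect(":memory:")
adj_conn.executescript("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),
        name      TEXT NOT NULL
    );
    INSERT INTO node (id, parent_id, name) VALUES
        (1, NULL, 'root'),
        (2, 1,    'a'),
        (3, 1,    'b'),
        (4, 2,    'a1');
""")

def adjacency_descendants(conn, node_id):
    """Return (id, name, level) for every descendant of node_id."""
    return conn.execute("""
        WITH RECURSIVE subtree(id, name, level) AS (
            -- anchor row: the starting node at level 0
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            -- recursive step: children of rows already in the subtree
            SELECT n.id, n.name, s.level + 1
            FROM node AS n
            JOIN subtree AS s ON n.parent_id = s.id
        )
        SELECT id, name, level FROM subtree
        WHERE level > 0
        ORDER BY level, id
    """, (node_id,)).fetchall()

print(adjacency_descendants(adj_conn, 1))
```

The level is computed during the walk here, which is one way around the "expensive to find level" cost when the level is not stored.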
- Nested Set (a.k.a. Modified Preorder Tree Traversal; see http://en.wikipedia.org/wiki/Nested_set_model)
- Popularized by Joe Celko in numerous articles and his book Trees and Hierarchies in SQL for Smarties
- Columns: Left, Right
- Cheap level, ancestry, descendants
- Volatile encoding - moves, inserts, deletes more expensive.
- Requires a specific sort order (e.g. created). So sorting all descendants in a different order requires additional work.
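
To show why level, ancestry and descendants are cheap in a Nested Set, the following sketch stores a small tree with hand-assigned Left/Right values and queries it with plain range comparisons. The table name `nested_node`, the column names `lft`/`rgt`, and the sample numbering are assumptions for illustration only.

```python
import sqlite3

# Hypothetical tree, numbered by a preorder walk:
#   root(1,8) -> a(2,5) -> a1(3,4)
#             -> b(6,7)
ns_conn = sqlite3.connect(":memory:")
ns_conn.executescript("""
    CREATE TABLE nested_node (
        name TEXT PRIMARY KEY,
        lft  INTEGER NOT NULL,
        rgt  INTEGER NOT NULL
    );
    INSERT INTO nested_node (name, lft, rgt) VALUES
        ('root', 1, 8), ('a', 2, 5), ('a1', 3, 4), ('b', 6, 7);
""")

def nested_descendants(conn, name):
    """Descendants lie strictly inside the node's (lft, rgt) interval."""
    rows = conn.execute("""
        SELECT d.name
        FROM nested_node AS d
        JOIN nested_node AS p ON d.lft > p.lft AND d.rgt < p.rgt
        WHERE p.name = ?
        ORDER BY d.lft
    """, (name,))
    return [r[0] for r in rows]

def nested_ancestors(conn, name):
    """Ancestors are the nodes whose interval encloses the node's interval."""
    rows = conn.execute("""
        SELECT a.name
        FROM nested_node AS a
        JOIN nested_node AS n ON a.lft < n.lft AND a.rgt > n.rgt
        WHERE n.name = ?
        ORDER BY a.lft
    """, (name,))
    return [r[0] for r in rows]

print(nested_descendants(ns_conn, 'root'))
print(nested_ancestors(ns_conn, 'a1'))
```

The level of a node is simply the number of its ancestors. The sketch does not cover inserts: adding a node means renumbering every Left/Right value to its right, which is the volatility noted above.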
- Nested Intervals
- Like nested set, but with real/float/decimal so that the encoding isn't volatile (inexpensive move/insert/delete)
- Have to deal with real/float/decimal representation issues
- A more complex matrix encoding variant adds the benefit of ancestor encoding, like materialized path for "free"
- Bridge Table (a.k.a. Closure Table: some good ideas about how to use triggers for maintaining this approach)
- Columns: ancestor, descendant
- Stands apart from table it describes.
- Can include some nodes in more than one hierarchy.
- Cheap ancestry and descendants (albeit without information about their order)
- Needs to be combined with another option for complete knowledge of a hierarchy.
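
A Bridge (Closure) Table could be maintained along the following lines. This is only a sketch under stated assumptions: the table name `closure`, the helper `closure_add_node`, and the convention of storing a self-referencing row for every node are choices made for the example. The answer above only specifies the ancestor and descendant columns.

```python
import sqlite3

cl_conn = sqlite3.connect(":memory:")
cl_conn.execute("""
    CREATE TABLE closure (
        ancestor   INTEGER NOT NULL,
        descendant INTEGER NOT NULL,
        PRIMARY KEY (ancestor, descendant)
    )
""")

def closure_add_node(conn, node_id, parent_id=None):
    """Insert a leaf: one self row plus one row per ancestor of the parent."""
    conn.execute("INSERT INTO closure VALUES (?, ?)", (node_id, node_id))
    if parent_id is not None:
        # The parent's own self row makes the parent itself an ancestor too.
        conn.execute("""
            INSERT INTO closure (ancestor, descendant)
            SELECT ancestor, ? FROM closure WHERE descendant = ?
        """, (node_id, parent_id))

def closure_descendants(conn, node_id):
    rows = conn.execute(
        "SELECT descendant FROM closure "
        "WHERE ancestor = ? AND descendant <> ? ORDER BY descendant",
        (node_id, node_id))
    return [r[0] for r in rows]

def closure_ancestors(conn, node_id):
    rows = conn.execute(
        "SELECT ancestor FROM closure "
        "WHERE descendant = ? AND ancestor <> ? ORDER BY ancestor",
        (node_id, node_id))
    return [r[0] for r in rows]

# Hypothetical tree: 1 -> (2 -> 4), 3
closure_add_node(cl_conn, 1)
closure_add_node(cl_conn, 2, parent_id=1)
closure_add_node(cl_conn, 3, parent_id=1)
closure_add_node(cl_conn, 4, parent_id=2)

print(closure_descendants(cl_conn, 1))
print(closure_ancestors(cl_conn, 4))
```

Both lookups are single indexed scans, but the rows come back ordered by id only, not by tree position. That is why this option is usually paired with a level column or another encoding.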
- Flat Table
- A modification of the Adjacency List that adds a Level and Rank (e.g. ordering) column to each record.
- Expensive move and delete
- Cheap ancestry and descendants
- Good Use: threaded discussion - forums / blog comments
- Lineage Column (a.k.a. Materialized Path, Path Enumeration)
- Column: lineage (e.g. /parent/child/grandchild/etc...)
- Limit to how deep the hierarchy can be.
- Descendants cheap (e.g. LEFT(lineage, #) = '/enumerated/path')
- Ancestry tricky (database specific queries)
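
The prefix test for a Lineage Column can be sketched as follows with Python's sqlite3 module. SQLite has no `LEFT` function, so `substr` stands in for it here. The table name `path_node` and the trailing-slash convention (which stops `/root/a/` from matching `/root/ab/`) are assumptions for the example, not something the answer prescribes.

```python
import sqlite3

mp_conn = sqlite3.connect(":memory:")
mp_conn.executescript("""
    CREATE TABLE path_node (lineage TEXT PRIMARY KEY);
    INSERT INTO path_node (lineage) VALUES
        ('/root/'), ('/root/a/'), ('/root/a/a1/'), ('/root/ab/');
""")

def path_descendants(conn, path):
    """Rows whose lineage starts with the given path (the node itself excluded)."""
    rows = conn.execute(
        "SELECT lineage FROM path_node "
        "WHERE substr(lineage, 1, length(?)) = ? AND lineage <> ? "
        "ORDER BY lineage",
        (path, path, path))
    return [r[0] for r in rows]

def path_ancestors(conn, path):
    """Rows whose lineage is a proper prefix of the given path."""
    rows = conn.execute(
        "SELECT lineage FROM path_node "
        "WHERE substr(?, 1, length(lineage)) = lineage AND lineage <> ? "
        "ORDER BY length(lineage)",
        (path, path))
    return [r[0] for r in rows]

print(path_descendants(mp_conn, '/root/a/'))
print(path_ancestors(mp_conn, '/root/a/a1/'))
```

The ancestor query reverses the prefix test, comparing the stored lineage against the start of the supplied path. String function names vary between databases, which is one reason ancestry queries end up database specific.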
- Multiple lineage columns
- Columns: one for each lineage level, referring to all the parents up to the root; levels below the item's own level are set to NULL
- Limit to how deep the hierarchy can be
- Cheap ancestors, descendants, level
- Cheap insert, delete, move of the leaves
- Expensive insert, delete, move of the internal nodes
Database Specific Notes
MySQL
Oracle
- Use CONNECT BY to traverse Adjacency Lists
PostgreSQL
- ltree datatype for Materialized Path
SQL Server
- General summary
- 2008 offers the HierarchyId data type, which appears to help with the Lineage Column approach and to expand the depth that can be represented.
This is the kind of question that is still interesting even after all big three vendors implemented a recursive WITH clause. I'd suggest that different readers would be pleased with different answers.
- Comprehensive list of references by Troels Arvin.
- For lack of competition, Joe Celko's introductory textbook "Trees and Hierarchies in SQL for Smarties" can indeed be considered a classic.
- Review of various tree encodings with emphasis on nested intervals.