Database schema for hierarchical groups

Question

I'm working on a database design for a group hierarchy that will serve as the foundation of a larger system. Each group can contain other groups, and also 'devices' as leaf objects (nothing goes below a device).

The database being used is MS SQL 2005. (Though working in MS SQL 2000 would be a bonus; a solution requiring MS SQL 2008 is unfortunately not feasible at this time).

There are different types of groups, and these need to be dynamic and definable at run-time by users. For example, group types might be "customer", "account", "city", "building", or "floor", and each type is going to have a different set of attributes, definable by the user. There will also be business rules applied, e.g. a "floor" can only be contained underneath a "building" group, and again, these are definable at run-time.

A lot of the application functionality comes from running reports based on these groups, so there needs to be a relatively fast way to get a list of all devices contained within a certain group (and all sub-groups).

Storing groups using the modified pre-order tree traversal technique has the upside that it is fast, but the downside that it is fairly complex and fragile: if external users/applications modify the database, there is the potential for complete breakage. We're also implementing an ORM layer, and this method seems to complicate using relations in most ORM libraries.

Using common table expressions and a "standard" id/parentid groups relation seems to be a powerful way to avoid running multiple recursive queries. Is there any downside to this method?
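To make the id/parentid idea concrete, here is a minimal sketch of a recursive CTE that collects every device under a group and all of its sub-groups. It uses Python's built-in sqlite3 module only so that it runs anywhere; the table and column names (group_node, device, parent_id, group_id) and the sample rows are illustrative assumptions, not part of the actual design. SQL Server 2005 supports the same query shape, but there the CTE is introduced with plain WITH rather than WITH RECURSIVE.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES group_node(id),  -- NULL for a root group
    name      TEXT NOT NULL
);
CREATE TABLE device (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES group_node(id),
    name     TEXT NOT NULL
);
-- Hypothetical sample hierarchy: customer > building > floors.
INSERT INTO group_node VALUES
    (1, NULL, 'Customer A'),
    (2, 1,    'Building 1'),
    (3, 2,    'Floor 1'),
    (4, 2,    'Floor 2'),
    (5, NULL, 'Customer B');
INSERT INTO device VALUES
    (10, 3, 'sensor-1'),
    (11, 4, 'sensor-2'),
    (12, 2, 'lobby-cam'),
    (13, 5, 'other-device');
""")


def devices_under(conn, group_id):
    """Return names of all devices in group_id and every sub-group below it."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM group_node WHERE id = ?      -- anchor: the start group
            UNION ALL
            SELECT g.id                                 -- step: children of found groups
            FROM group_node AS g
            JOIN subtree AS s ON g.parent_id = s.id
        )
        SELECT d.name
        FROM device AS d
        JOIN subtree AS s ON d.group_id = s.id
        ORDER BY d.name
        """,
        (group_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(devices_under(conn, 1))  # ['lobby-cam', 'sensor-1', 'sensor-2']
```

The whole subtree comes back from a single statement, so the application does not issue one query per level of the hierarchy.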

As for attributes, what is the best way to store them? A long, narrow table that relates back to the group? Should a common attribute, like "name", be stored in the groups table instead of the attributes table (a lot of the time, the name will be all that is required for display)?
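For reference, the layout being asked about could look roughly like the sketch below: the common name column lives on the group row, and everything user-defined goes into a long, narrow attribute table keyed by group. It again uses Python's sqlite3 as a stand-in, and all table names, column names, and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES group_node(id),
    name      TEXT NOT NULL            -- common attribute kept on the group row
);
-- Long, narrow table: one row per (group, attribute) pair.
CREATE TABLE group_attribute (
    group_id   INTEGER NOT NULL REFERENCES group_node(id),
    attr_name  TEXT    NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (group_id, attr_name)
);
-- Hypothetical sample data.
INSERT INTO group_node VALUES (1, NULL, 'Acme HQ'), (2, 1, 'Floor 3');
INSERT INTO group_attribute VALUES
    (1, 'city',          'Oslo'),
    (1, 'account_no',    'A-17'),
    (2, 'floor_area_m2', '420');
""")


def display_names(conn):
    """Listing groups by name needs no join against the attribute table."""
    return [name for (name,) in
            conn.execute("SELECT name FROM group_node ORDER BY id")]


def attributes_of(conn, group_id):
    """Fetch the user-defined attributes of one group as a dict."""
    rows = conn.execute(
        "SELECT attr_name, attr_value FROM group_attribute WHERE group_id = ?",
        (group_id,),
    )
    return dict(rows.fetchall())


print(display_names(conn))
print(attributes_of(conn, 1))
```

With this split, screens that only show names read one table, and the attribute table is touched only when a group's details are opened.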

Are there going to be performance issues using this method (let's assume a high average of 2000 groups with an average of 6 attributes each, and an average of 10 concurrent users, on a reasonable piece of hardware, e.g. a quad-core Xeon at 2 GHz with 4 GB of RAM, discounting any other processes)?

Feel free to suggest a completely different schema than what I've outlined here. I was just trying to illustrate the issues I'm concerned about.

Solution

I'd recommend you actually construct the easiest-to-maintain way (the "standard" parent/child setup) and run at least some basic benchmarks on it.

You'd be surprised what a database engine can do with the proper indexing, especially if your dataset can fit into memory.
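A basic benchmark of the parent/child setup might look like the sketch below. It builds a random tree of 2000 groups (the figure from the question), indexes the parent column, and times one subtree query. It runs on SQLite through Python's sqlite3, so the timing only indicates the order of magnitude and says nothing definite about SQL Server 2005; the table, index, and function names are assumptions.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible tree shape
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES group_node(id)
);
-- The index that lets each recursion step find children without a full scan.
CREATE INDEX ix_group_node_parent ON group_node(parent_id);
""")

N_GROUPS = 2000
# Group 1 is the root; every other group picks a parent with a smaller id,
# which guarantees a single tree with no cycles.
rows = [(1, None)] + [(i, random.randint(1, i - 1)) for i in range(2, N_GROUPS + 1)]
conn.executemany("INSERT INTO group_node VALUES (?, ?)", rows)

SUBTREE_COUNT = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM group_node WHERE id = ?
    UNION ALL
    SELECT g.id FROM group_node AS g JOIN subtree AS s ON g.parent_id = s.id
)
SELECT COUNT(*) FROM subtree
"""


def subtree_size(conn, group_id):
    """Count group_id plus all of its descendants."""
    return conn.execute(SUBTREE_COUNT, (group_id,)).fetchone()[0]


start = time.perf_counter()
size = subtree_size(conn, 1)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"{size} groups in subtree, {elapsed_ms:.2f} ms")
```

Running the same measurement with and without the index, and with realistic device counts, on the real target server is what would actually settle the question.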

Assuming 6 attributes per group, 2000 groups, and 30 bytes/attribute, you're talking 6 x 2000 x 30 bytes = 360 KB x expected items/group -- figure 400 KB. If you expect to have 1000 items/group, you're only looking at about 400 MB of data -- that'll fit in memory without a problem, and databases are fast at joins when all the data is in memory.
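The estimate above can be reproduced with a few lines of arithmetic. The constants are the ones stated in the answer; the function name is just for illustration, and the result ignores row and index overhead, which is presumably why the answer rounds up to 400.

```python
ATTRS_PER_GROUP = 6
GROUPS = 2000
BYTES_PER_ATTR = 30


def estimated_bytes(items_per_group=1):
    """Raw attribute payload, following the answer's back-of-envelope formula."""
    return ATTRS_PER_GROUP * GROUPS * BYTES_PER_ATTR * items_per_group


base = estimated_bytes()          # 360_000 bytes, i.e. 360 KB
scaled = estimated_bytes(1000)    # 360_000_000 bytes, i.e. 360 MB

print(f"{base / 1e3:.0f} KB per item/group, {scaled / 1e6:.0f} MB at 1000 items/group")
```

Either way the total sits well below the 4 GB of RAM mentioned in the question.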
