Database Design: track a vast number of attributes for each user. So much so, that I will likely run out of columns (row storage space)

Question

I'd appreciate some opinions on a concern I have.

I have a [User] table in my database, with the basic stuff you'd expect, like username, password, etc...

This application requires that I track a vast number of attributes for each user. So much so, that I will likely run out of columns (row storage space).

I'm tempted to add a UserProperties table with UserID, PropertyKey and PropertyValue columns. This approach fits well with the requirements.
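For concreteness, the table described above might look like the following sketch. It uses Python's built-in sqlite3 module instead of SQL Server purely so that it runs anywhere. The column types, the composite primary key, and the `get_properties` helper are assumptions for illustration, not part of the original design. SQLite's `WITHOUT ROWID` option stores rows in primary-key order, which is the closest SQLite analogue to a clustered index that starts with UserID.

```python
import sqlite3

def create_schema(conn):
    # Column types and the (UserID, PropertyKey) key are assumed for illustration.
    conn.executescript("""
        CREATE TABLE [User] (
            UserID   INTEGER PRIMARY KEY,
            Username TEXT NOT NULL UNIQUE
        );
        CREATE TABLE UserProperties (
            UserID        INTEGER NOT NULL REFERENCES [User](UserID),
            PropertyKey   TEXT    NOT NULL,
            PropertyValue TEXT,
            PRIMARY KEY (UserID, PropertyKey)
        ) WITHOUT ROWID;
    """)

def get_properties(conn, user_id):
    """Return all properties of one user as a dict (hypothetical helper)."""
    rows = conn.execute(
        "SELECT PropertyKey, PropertyValue FROM UserProperties WHERE UserID = ?",
        (user_id,),
    )
    return dict(rows)

conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO [User] (UserID, Username) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO UserProperties VALUES (?, ?, ?)",
    [(1, "country", "NZ"), (1, "newsletter", "yes")],
)
print(get_properties(conn, 1))
```

Fetching every property of a single user is one range scan on the key, which is the access pattern this layout handles well.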

My concern is that if each user has say 100 properties, when the database has a million users in it, we'll have 100,000,000 property rows.

I would think that with a clustered index on the UserID, that access will still be screaming fast, and you are really storing about the same amount of data as you would with the mega-columns approach.

Any ideas or thoughts on performance concerns? Ideas for a better DB design?

Update:

I have been toying around with the possibilities, and one thing keeps bothering me. I need to query on some of these attributes pretty frequently, and worse yet, these queries could involve finding all users who match criteria on as many as 10 of these attributes at the same time.
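To make the concern concrete, here is a rough sketch of what "find all users matching several attributes at once" could look like against a key/value table. It again uses Python's sqlite3 for portability. The `find_users` helper and the sample data are hypothetical, and the query only handles equality criteria. It relies on the assumed (UserID, PropertyKey) primary key so that counting matched rows per user is the same as counting matched criteria.

```python
import sqlite3

def find_users(conn, criteria):
    """Return sorted UserIDs whose properties equal every key/value in criteria."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    # One (key, value) test per criterion; a user qualifies only if all of them hit.
    pairs = " OR ".join("(PropertyKey = ? AND PropertyValue = ?)" for _ in criteria)
    sql = (
        "SELECT UserID FROM UserProperties "
        f"WHERE {pairs} "
        "GROUP BY UserID "
        "HAVING COUNT(*) = ? "
        "ORDER BY UserID"
    )
    params = [x for item in criteria.items() for x in item] + [len(criteria)]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserProperties (
        UserID        INTEGER NOT NULL,
        PropertyKey   TEXT    NOT NULL,
        PropertyValue TEXT,
        PRIMARY KEY (UserID, PropertyKey)
    )
""")
conn.executemany(
    "INSERT INTO UserProperties VALUES (?, ?, ?)",
    [
        (1, "country", "NZ"), (1, "plan", "pro"),
        (2, "country", "NZ"), (2, "plan", "free"),
        (3, "country", "AU"), (3, "plan", "pro"),
    ],
)
print(find_users(conn, {"country": "NZ", "plan": "pro"}))
```

With ten criteria the WHERE clause grows to ten key/value tests, and range or inequality conditions would need a different shape again (for example one self-join per attribute), which is the kind of awkwardness that motivates the change of direction below.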

As a result, I am now leaning towards the mega-column approach, but possibly splitting the data off into one (or more) separate tables, forming a one-to-one relationship keyed on the UserID.
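A minimal sketch of that split, assuming two made-up groupings (`UserProfile` and `UserPreferences`) and made-up columns; none of these names come from the actual application. Each side table uses UserID as both its primary key and its foreign key, which is what makes the relationship one-to-one. The example uses Python's sqlite3 so it is runnable as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [User] (
        UserID   INTEGER PRIMARY KEY,
        Username TEXT NOT NULL UNIQUE
    );
    -- Hypothetical attribute groups, each keyed one-to-one on UserID.
    CREATE TABLE UserProfile (
        UserID  INTEGER PRIMARY KEY REFERENCES [User](UserID),
        Country TEXT,
        Age     INTEGER
    );
    CREATE TABLE UserPreferences (
        UserID     INTEGER PRIMARY KEY REFERENCES [User](UserID),
        Newsletter INTEGER NOT NULL DEFAULT 0,
        Theme      TEXT
    );
""")
people = [
    (1, "alice", "NZ", 34, 1),
    (2, "bob",   "NZ", 17, 1),
    (3, "carol", "AU", 40, 1),
    (4, "dave",  "NZ", 50, 0),
]
for uid, name, country, age, newsletter in people:
    conn.execute("INSERT INTO [User] VALUES (?, ?)", (uid, name))
    conn.execute("INSERT INTO UserProfile VALUES (?, ?, ?)", (uid, country, age))
    conn.execute(
        "INSERT INTO UserPreferences (UserID, Newsletter) VALUES (?, ?)",
        (uid, newsletter),
    )

def adult_subscribers(conn, country):
    """Usernames in `country`, aged 18 or over, who receive the newsletter."""
    rows = conn.execute(
        "SELECT u.Username FROM [User] u "
        "JOIN UserProfile p     ON p.UserID = u.UserID "
        "JOIN UserPreferences f ON f.UserID = u.UserID "
        "WHERE p.Country = ? AND p.Age >= 18 AND f.Newsletter = 1 "
        "ORDER BY u.Username",
        (country,),
    )
    return [r[0] for r in rows]

print(adult_subscribers(conn, "NZ"))
```

Multi-attribute criteria become ordinary typed column predicates, at the cost of one join per side table that a query touches.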

I'm using LinqToSql, and while I think tables with this many columns are inelegant, given all the challenges and trade-offs I think this is probably the right choice. I am still eager to hear other opinions.

Recommended answer

What you're describing is an Entity-Attribute-Value database, which is often used for exactly the situation you describe: sparse data tied to a single entity.

An E-A-V table is easy to search. The problem isn't finding rows, it's finding related rows.

Having different tables for different entities provides domain modeling, but it also provides a weak form of metadata. In E-A-V there are no such abstractions. (The Java analogy to E-A-V would be declaring that all functions' formal arguments were of type Object -- so you'd get no type-checking.)
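The loss of type-checking is easy to demonstrate. In the sketch below (Python's sqlite3, with a made-up 'age' property and made-up values), every value lives in a single text column, so nothing stops a non-numeric age from being stored, and a naive comparison is done character by character and not numerically. The helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserProperties (
        UserID        INTEGER NOT NULL,
        PropertyKey   TEXT    NOT NULL,
        PropertyValue TEXT,
        PRIMARY KEY (UserID, PropertyKey)
    )
""")
# The table happily accepts 'unknown' as an age: there is no per-attribute type.
conn.executemany(
    "INSERT INTO UserProperties VALUES (?, 'age', ?)",
    [(1, "9"), (2, "18"), (3, "100"), (4, "unknown")],
)

def adults_text_compare(conn):
    """Naive filter: compares the stored strings, not numbers."""
    rows = conn.execute(
        "SELECT UserID FROM UserProperties "
        "WHERE PropertyKey = 'age' AND PropertyValue >= '18' "
        "ORDER BY UserID"
    )
    return [r[0] for r in rows]

def adults_cast_compare(conn):
    """Casts each value first; in SQLite non-numeric text casts to 0."""
    rows = conn.execute(
        "SELECT UserID FROM UserProperties "
        "WHERE PropertyKey = 'age' AND CAST(PropertyValue AS INTEGER) >= 18 "
        "ORDER BY UserID"
    )
    return [r[0] for r in rows]

print(adults_text_compare(conn))  # [1, 2, 4]: '9' and 'unknown' sort after '18', '100' before it
print(adults_cast_compare(conn))  # [2, 3]
```

Every query against the key/value table has to carry this kind of per-attribute knowledge itself, whereas a typed Age column would let the schema carry it.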

Wikipedia has a very good article on E-A-V, but read it now -- it's mostly the work of one author, and is slated for "improvement".
