What is the best database structure for this scenario?


Question

I have a database that holds real estate MLS (Multiple Listing Service) data. Currently, I have a single table that holds all the listing attributes (price, address, sqft, etc.). There are several different property types (residential, commercial, rental, income, land, etc.). The property types share a majority of the attributes, but each has a few that are unique to it.

My problem is that the shared attributes number more than 250 fields, which seems like too many to have in a single table. My thought is I could break them out into an EAV (Entity-Attribute-Value) format, but I've read many bad things about that, and it would make running queries a real pain, as any of the 250 fields could be searched on. If I were to go that route, I'd literally have to pull all the data out of the EAV table, grouped by listing id, merge it on the application side, then run my query against the in-memory object collection. This does not seem very efficient either.

I am looking for some ideas or recommendations on which way to proceed. Perhaps the 250+ field table is the only way to proceed.

Just as a note, I'm using SQL Server 2012, .NET 4.5 with Entity Framework 5 and C#, and the data is passed to an ASP.NET web application via a WCF service.

Thanks in advance.

Recommended answer

Let's consider the pros and cons of the alternatives:

One table for all listings + attributes:


  1. Very wide table - hard to view the model and schema definitions and the table data
  2. One query with no joins required to retrieve all data on listing(s)
  3. Requires schema + model change for each new attribute.
  4. Efficient if you always load all the attributes and most items have values for most of the attributes.
  5. Example LINQ query according to attributes:



context.Listings.Where(l => l.PricePerMonthInUsd < 10e3 && l.SquareMeters >= 200)
    .ToList();



One table for all listings, one table for attribute types, and one for (listing ID + attribute ID +) values (EAV):


  1. Listing table is narrow
  2. Efficient if data is very sparse (most attributes don't have values for most items)
  3. Requires fetching all data from the values table - one additional query (or one join, although that would waste bandwidth, since the basic listing table data is fetched again for every attribute value row)
  4. Does not require schema + model changes for new attributes
  5. If you want type safe access to attributes via code, you'll need custom code generation based on attribute types table
  6. Example LINQ query according to attributes:



var listingIds = context.AttributeValues.Where(v =>
                    v.AttributeTypeId == PricePerMonthInUsdId && v.Value < 10e3)
                .Select(v => v.ListingId)
                .Intersect(context.AttributeValues.Where(v =>
                    v.AttributeTypeId == SquareMetersId && v.Value >= 200)
                .Select(v => v.ListingId)).ToList();

or: (compare performance on actual DB)

var listingIds = context.AttributeValues.Where(v =>
                    v.AttributeTypeId == PricePerMonthInUsdId && v.Value < 10e3)
                .Select(v => v.ListingId).ToList();

listingIds = context.AttributeValues.Where(v =>
                listingIds.Contains(v.ListingId)
                && v.AttributeTypeId == SquareMetersId
                && v.Value >= 200)
            .Select(v => v.ListingId).ToList();

and then:

var listings = context.Listings.Where(l => listingIds.Contains(l.ListingId)).ToList();
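
The EAV queries above do not show the shape of the value rows. As a minimal sketch, assuming a hypothetical AttributeValue class with a numeric Value and made-up attribute type IDs, the same intersection can be tried against an in-memory collection before comparing the variants on the actual database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical EAV row: one value of one attribute for one listing.
public class AttributeValue
{
    public int ListingId { get; set; }
    public int AttributeTypeId { get; set; }
    public double Value { get; set; }
}

public static class EavSearch
{
    // Assumed IDs; in a real schema they would come from the attribute types table.
    public const int PricePerMonthInUsdId = 1;
    public const int SquareMetersId = 2;

    // Returns the IDs of listings priced below maxPrice with at least minArea square meters.
    public static List<int> FindListingIds(
        List<AttributeValue> values, double maxPrice, double minArea)
    {
        var cheap = values
            .Where(v => v.AttributeTypeId == PricePerMonthInUsdId && v.Value < maxPrice)
            .Select(v => v.ListingId);
        var large = values
            .Where(v => v.AttributeTypeId == SquareMetersId && v.Value >= minArea)
            .Select(v => v.ListingId);

        // A listing qualifies only if it appears in both filtered sets.
        return cheap.Intersect(large).OrderBy(id => id).ToList();
    }
}
```

Each additional search criterion adds one more filtered set to intersect, which is why EAV queries grow quickly when many of the 250 attributes are searchable.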



Compromise option - one table per group of attributes, including the values (assuming you can divide the attributes into groups):


  1. Multiple medium width tables
  2. Efficient if data is sparse per group (e.g. garden related attributes are all null for listings without gardens, so you don't add a row to the garden related table for them)
  3. Requires one query with multiple joins (bandwidth not wasted in join, since group tables are 1:0..1 with listing table, not 1:many)
  4. Requires schema + model changes for new attributes
  5. Makes viewing the schema/model simpler - if you can divide the attributes into groups of 10, you'll have 25 tables with 11 columns each instead of another 250 columns on the listing table
  6. LINQ query is somewhere between the above two examples.
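
As a rough illustration of what such an in-between query might look like, here is a sketch against in-memory collections. The GardenAttributes group table and its GardenSquareMeters column are hypothetical names chosen for the example; only the listing columns come from the query shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Core listing table: only the attributes that every listing has.
public class Listing
{
    public int ListingId { get; set; }
    public double PricePerMonthInUsd { get; set; }
}

// Hypothetical group table, 1:0..1 with Listing: a row exists only for listings with a garden.
public class GardenAttributes
{
    public int ListingId { get; set; }
    public double GardenSquareMeters { get; set; }
}

public static class GroupTableSearch
{
    // Listings below maxPrice whose garden is at least minGardenArea square meters.
    public static List<int> FindListingIds(
        List<Listing> listings, List<GardenAttributes> gardens,
        double maxPrice, double minGardenArea)
    {
        // The inner join drops listings that have no garden row at all.
        var query = from l in listings
                    join g in gardens on l.ListingId equals g.ListingId
                    where l.PricePerMonthInUsd < maxPrice
                          && g.GardenSquareMeters >= minGardenArea
                    select l.ListingId;
        return query.OrderBy(id => id).ToList();
    }
}
```

A search touches only the group tables whose attributes it filters on, so the number of joins depends on the criteria rather than on the total number of groups.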


Consider the pros and cons according to your specific statistics (regarding sparseness) and requirements/maintainability plan (e.g. How often are attribute types added/changed?) and decide.
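
One way to get the sparseness statistics is to measure, per attribute, the fraction of listings that actually have a value. The sketch below assumes the rows have already been loaded as dictionaries of nullable values; the SparsenessStats name and the row representation are illustrative assumptions, not part of any existing model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SparsenessStats
{
    // For each attribute name, the fraction of rows (0.0 to 1.0) holding a non-null value.
    // Attributes that are null in every row do not appear in the result.
    public static Dictionary<string, double> FillRates(
        List<Dictionary<string, double?>> rows)
    {
        return rows
            .SelectMany(row => row.Where(kv => kv.Value.HasValue).Select(kv => kv.Key))
            .GroupBy(name => name)
            .ToDictionary(g => g.Key, g => (double)g.Count() / rows.Count);
    }
}
```

Attributes with a fill rate close to 1 are natural candidates for the main listing table, while groups of attributes that are mostly empty may fit better in a separate group table or an EAV table.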
