如何设计数据库来存储属性,通过同义词选择属性 [英] How do I design a database to store properties, selecting attributes by synonyms

查看:143
本文介绍了如何设计数据库来存储属性,通过同义词选择属性的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在设计一个房地产应用的数据库。这证明比我预想的更多的参与(也许我比较复杂的事情)。



这些问题本质上是由于存在:




  • 同义词
    例如,平面,公寓和阁楼可能都是基本相同的,相同类型的财产

  • 属性(不同的属性类型具有不同的属性)
    例如,公寓可以是底层或顶层等



<对于不同的属性类型,我已经结束了一个相当(无意)精细的分类树。树节点是属性类型的实际实例。



我想创建一个数据库,以便我不仅可以查询任何同义词,还可以查询属性。



所以例如,查询(在伪SQL中):



SELECT * from properties where synonym = (底层,花园);



应该返回公寓列表,这些公寓是一楼,有花园。 / p>

有人可以帮助我如何设计数据库模式,以便允许上述类型的查询?



最后但并非最不重要的是,我将使用MySQl或PostgreSQL作为后端数据库,但是如果可能的话,更喜欢这种方法是无关紧要的。

解决方案

我将采用不同的方法来处理您的归因方案。而不是将不同的归因视为同义词,我将它们视为重叠的,或更具体地说,嵌套的属性描述。这将处理您的商业案例,同时承认Mike Sherrill所做的精明观察。



这是一个快速的ERD草图:





由一种非常快速的数据字典的方式:



PROPERTY 是一块房地产。



CATEGORY 是描述属性的集合。这个表的要点比任何其他的更多的是属性的组织者。它可以包括诸如财产类型,所有权结构,浴室数量以及其他可能感兴趣的内容。



ATTRIBUTE 是一种特定的兴趣品质。注意这种实体类型的渐进关系。稍后我会多处理。要点是属性可以是更一般的或更具体的,一些属性可以被看作是其他属性的改进。



描述符是与该特定房地产相关联的PROPERTY和属性的交集。



那么这应该如何帮助?



关键是如何工作。如果您使用嵌套集模型,那么您可以处理更多或更少的具体归因和搜索条件。请考虑一个潜在的CATEGORY及其相关属性的下图:





在此示例中,CATEGORY是属性类型。从图中可以看出,在这个类别中有一个属性的分层细分。图中的每个框都是ATTRIBUTE中的一个记录。包含其他框的框具有子属性。在另一个盒子里面的盒子里有一个FK到他们的包装盒子里,等等。



这样你可以说我想找一个房顶。然后,您可以找到相关的DESCRIPTOR的PROPERTY记录,这些记录指向阁楼属性。那很简单但是,如果你的搜索空白了,那么这个方法的好处就在于你可以放松你的标准:让我们把属性层次结构放到下一个比顶楼更不具体的东西。在我的例子中,这将是高层。现在你再次尝试搜索,你可能会更好运气。



这样一个系统使您能够在放松时按照每个归因类别的要求进行特定的设置其他人足够开始获得搜索匹配。这真的是房地产经纪人的工作是不是吗?帮助客户做出必要的妥协,以找到最符合他们最重要的标准?



处理嵌套集



这种方法的唯一棘手的部分是如何处理嵌套集。有很多方法可以做到这一点,其中许多方法已经在其他地方彻底记录。我自己就像访问号码技术,特别是对于相对静态的数据集。这样可以很容易地为某些特定的ATTRIBUTE或其任何子项找到匹配项,而无需在SQL中执行任何异常操作。



编辑:那么如何这个工作?



OP问你如何处理像卧室数量以及查询的样子?让我们举一个例子来说明:





上面显示了CATEGORY卧室数的嵌套集。我还将访问号码添加到图表中。请注意访问号码的工作方式,特别是注意任何给定属性值的左(绿色)和右(红色)数字包含任何从属属性的左右访问号。例如,2+卧室的左右数字分别为6和15。属于2+卧室的每个属性都有左右的数字落在此范围内。



那么你将如何查询具有给定描述符的属性?假设我们想找到两个或更多卧室的所有物业。这样一个查询的SQL可能看起来像这样:

 从PROPERTY P $ b $中选择P. * 
b内部连接DESCRIPTOR D
on P.id = D.property_id
内部连接ATTRIBUTE A
on D.attribute_id = A.id
其中A.left> =(select来自ATTRIBUTE X
的X.left,其中X.name ='2+卧室')
和A.right <=(从ATTRIBUTE Y
中选择Y.right其中Y.name =' 2+卧室')

请注意,上述查询与您实际使用的有所不同。例如,您可能会使用其int身份密钥而不是其字符串名称查找过滤属性。但是,我以为我会以清晰的方式离开它的主要观点,这是通过查找特定相关属性而不是进行过滤,但是对于属于您的过滤器中的任何相关属性>范围。



如果要对多个属性进行过滤,则只需在where子句中添加更多子条。


I am designing a database for a real estate application. It is proving to be more involved than I had anticipated (maybe I am overcomplicating things).

The problems essentially are due to the presence of:

  • synonyms For example the terms: flat, apartment and penthouse may all refer to essentially, the same type of property
  • attributes (different property types have different attributes) For example an apartment could be a ground floor or top floor etc

I have ended up with a rather (unintentionally) elaborate classification tree for different property types. The tree nodes are the actual instances of property types.

I want to create a database so that I can query using not only any of the synonyms, but also the attributes.

So for example, the query (in pseudo SQL):

SELECT * from properties where synonym="flat" and attribute IN ('ground floor', 'garden');

should return a list of apartments|flats that are either ground floor AND have a garden.

Can someone help me with how to design the database schema so as to allow the kind of querying described above?

Last but not the least, I will be using either MySQl or PostgreSQL as the backend database, but would prefer the approach to be db agnostic - if possible.

解决方案

I would take a different approach to your attribution scheme. Rather than treating different attributions as synonyms, I would treat them as overlapping, or more specifically, nested descriptions of a property. This would handle your business case while at the same time acknowledging the astute observation made by Mike Sherrill.

Here is a quick ERD sketch:

By way of a very fast data dictionary:

PROPERTY is a piece of real estate.

CATEGORY is a collection of descriptive attributes. The point of this table is more as an organizer of attributes than anything else. It could include things like "type of property", "ownership structure", "number of bathrooms", and whatever else might be of interest.

ATTRIBUTE is a specific quality of interest. Note the involuted relationship on this entity type. I'll deal more with that later. The main point is that attributes can be more general or more specific and some attributes can be seen as refinements of other attributes.

DESCRIPTOR is the intersection of a PROPERTY and the ATTRIBUTEs that have been associated with that particular piece of real estate.

So how is this supposed to help?

The key is how attributes work. If you use a nested set model, then you can address more or less specific attribution and searching criteria. Consider the following diagram of one potential CATEGORY with its associated ATTRIBUTEs:

In this example the CATEGORY is "type of property". You can see from the diagram that there is a hierarchical breakdown of attributes in this category. Each box in the diagram is a record in ATTRIBUTE. Boxes that contain other boxes have child attributes. Boxes that are inside another box have an FK to their containing box and so forth.

In this way, you could say "I want to find a property that is a Penthouse". You can then find PROPERTY records with a related DESCRIPTOR that points at the "Penthouse" ATTRIBUTE. That is pretty easy. But what if your search comes up empty?

The advantage of this approach is that you can then loosen your criteria by saying, "let's go up the attribution hierarchy to the next less-specific thing than penthouse". In my example, that would be "Highrise". Now you try your search again and you might have better luck.

A system like this gives you the ability to be as specific as you want in each category of attribution while relaxing the others far enough to start getting search hits. This is really what a real estate agent's work is about is it not? Helping the client to make the necessary compromises to find the best fit to their most important criteria?

Handling Nested Sets

The only tricky part of this approach is how to handle the nested sets. There are lots of ways to do this, many of which have been thoroughly documented elsewhere. I myself like the visitation number technique, especially for relatively static data sets. This makes it very easy to find matches for some given ATTRIBUTE or any of its children without having to do anything exotic in your SQL.

EDIT: So How Does This Work?

OP asked how do you handle things like number of bedrooms and what do the queries look like? Let's take another example for illustration:

The above shows the nested sets for the CATEGORY "Number of Bedrooms". I've also added the visitation numbers to the diagram. Note the way the visitation numbers work, in particular, note that the left (green) and right (red) numbers for any given attribute value contain the left and right visitation numbers for any subordinate attributes. For example, "2+ Bedrooms" has left and right numbers 6 and 15 respectively. Every attribute that falls under "2+ Bedrooms" has left and right numbers that fall within this range.

So how would you query for a properties with a given descriptor? Let's say we want to find all properties with two or more bedrooms. The SQL for such a query might look something like this:

select P.* 
from PROPERTY P
  inner join DESCRIPTOR D
    on P.id = D.property_id
  inner join ATTRIBUTE A
    on D.attribute_id = A.id
where A.left >= (select X.left from ATTRIBUTE X
                 where X.name = '2+ Bedrooms')
  and A.right <= (select Y.right from ATTRIBUTE Y
                  where Y.name = '2+ Bedrooms')

Note that the above query is a little different that what you might actually use. For example, you'd probably look up the filtering attribute using its int identity key instead of its string name. However, I thought I'd leave it as shown for clarity around the main point, which is you filter by looking not for a specific related attribute, but for any related attributes that fall within your filter range.

If you wanted to filter on multiple attributes, then just add more sub-clauses to your where clause.

这篇关于如何设计数据库来存储属性,通过同义词选择属性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆