How would you model data variables variance on common scheme? SQL


Problem description



I was thinking about some stuff lately and I was wondering what would be the RIGHT way to do something like the following scenario (I'm sure it is a quite common thing for DB guys to do something like it).

Let's say you have a products table, something like this (MySQL):

CREATE TABLE `products` (
  `id` int(11) NOT NULL auto_increment,
  `product_name` varchar(255) default NULL,
  `product_description` text,
  KEY `id` (`id`),
  KEY `product_name` (`product_name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Nothing out of the ordinary here. Now let's say that there is a hierarchy of categories in a different table, and a separate table that binds products to categories in a many-to-many relationship, so that each product belongs to some kind of category (I'll omit those, because that's not the issue here).

Now comes the interesting part - what IF each of the categories mandates an additional set of variables for its product items? For example, products in the computer monitors category must have an LCD/CRT enum field, a screen size enum, etc., while some other category, let's say ice creams, has other variables such as flavor varchar, shelf storage time int, etc.

The problem lies in that all products have a common set of variables (id, name, description and so on), but there are additional variables which are not consistent from category to category. Still, all products should share the common set, because in the end they all belong to the products group, so one can query for example SELECT * FROM products ORDER BY company_id (trivial example, maybe not representative, but you get the picture).

Now, I see several potential resolutions:
- generate a separate table for each product category and store products there with the appropriate additional variables - stupid and not query friendly

- the products table stays the same with the common variables, and for each category a separate table holds the additional variables, bound to the products table with a JOIN - normalized, but with query performance and clarity issues - how would one filter products by category (1st table - products) and add a filter on an extra variable (e.g. 17" LCD monitors)? It would require SQL JOIN trickery

- the products table stays the same, with one more column of type text that holds, for example, JSON data with the additional variables - compact and neat, but you can't filter on those variables with SQL

I know I'm missing something quite obvious and simple here - I'm a bit rusty on the normalization techniques :)


edit: I searched around Stack Overflow before asking this question, without success. However, after I posted the question I clicked on one of my tags, 'normalization', and found several similar questions, which led me to look up 'generalization specialization relational design'. The point of the story is that this must be the first time in my internet life that tags were actually useful in search. However, I would still like to hear from you guys and your opinions.
edit2: The problem with approach no. 2 is that I expect somewhere around ~1000 specializations. There is a hierarchy (1-4 levels deep) of categories, and the end nodes add specialized variables - they accumulate to the order of ~1000, so it would be a bit impractical to add specialized tables to join with.
edit3: Due to the high attribute volatility in my case, the "entity attribute value" model that was suggested looks like the way to go. Here come the query nightmares! Thanks guys.

Solution

How many product types do you expect? Do they each have their own application logic?

You can do a generalized model called the "entity attribute value" model, but it has a LOT of pitfalls when you're trying to deal with specific properties of a product. Simple search queries turn into real nightmares at times. The basic idea is that you have a table that holds the product ID, property name (or ID into a properties table), and the value. You can also add in tables to hold templates for each product type. So one set of tables would tell you for any given product what properties it can have (possibly along with valid value ranges) and another set of tables would tell you for any individual product what the values are.
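To make the shape of such a model concrete, here is a minimal sketch, not a definitive design. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the table name product_properties, its columns, and the sample rows are all hypothetical. It shows why even the "17 inch LCD monitor" filter from the question needs one self-join per filtered property:

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    product_name TEXT
);
-- Hypothetical EAV table: one row per (product, property) pair.
CREATE TABLE product_properties (
    product_id INTEGER NOT NULL REFERENCES products(id),
    property_name TEXT NOT NULL,
    value TEXT,  -- every value is stored as text, whatever its real type
    PRIMARY KEY (product_id, property_name)
);
INSERT INTO products VALUES
    (1, 'Monitor A'), (2, 'Monitor B'), (3, 'Vanilla tub');
INSERT INTO product_properties VALUES
    (1, 'panel_type', 'LCD'), (1, 'screen_size', '17'),
    (2, 'panel_type', 'CRT'), (2, 'screen_size', '17'),
    (3, 'flavor', 'vanilla'), (3, 'shelf_days', '90');
""")

# Each property used in the filter needs its own join against the same table,
# and numeric comparisons need a cast because values are stored as text.
rows = conn.execute("""
    SELECT p.id, p.product_name
    FROM products AS p
    JOIN product_properties AS panel
      ON panel.product_id = p.id AND panel.property_name = 'panel_type'
    JOIN product_properties AS sz
      ON sz.product_id = p.id AND sz.property_name = 'screen_size'
    WHERE panel.value = 'LCD'
      AND CAST(sz.value AS INTEGER) = 17
    ORDER BY p.id
""").fetchall()
print(rows)
```

With two filtered properties the query already carries two joins and a cast; every further property adds another join, which is where the "nightmares" tend to start.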

I would caution strongly against using this model though, since it seems like a really slick idea until you have to actually implement it.

If your number of product types is reasonably limited, I'd go with your second solution - one main product table with base attributes and then additional tables for each specific type of product.
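A rough sketch of that second solution follows, again using Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL. The subtype tables monitor_details and ice_cream_details, their columns, and the sample rows are assumptions made up for illustration:

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    product_name TEXT,
    product_description TEXT
);
-- Hypothetical subtype tables: one per product type, keyed by the product id.
CREATE TABLE monitor_details (
    product_id INTEGER PRIMARY KEY REFERENCES products(id),
    panel_type TEXT CHECK (panel_type IN ('LCD', 'CRT')),
    screen_size INTEGER
);
CREATE TABLE ice_cream_details (
    product_id INTEGER PRIMARY KEY REFERENCES products(id),
    flavor TEXT,
    shelf_days INTEGER
);
INSERT INTO products VALUES
    (1, 'Monitor A', NULL), (2, 'Monitor B', NULL), (3, 'Vanilla tub', NULL);
INSERT INTO monitor_details VALUES (1, 'LCD', 17), (2, 'CRT', 17);
INSERT INTO ice_cream_details VALUES (3, 'vanilla', 90);
""")

# Type-specific filter: a single join, with properly typed columns.
lcd_17 = conn.execute("""
    SELECT p.id, p.product_name
    FROM products AS p
    JOIN monitor_details AS m ON m.product_id = p.id
    WHERE m.panel_type = 'LCD' AND m.screen_size = 17
""").fetchall()

# Queries over the common columns never touch the subtype tables.
all_names = [row[0] for row in conn.execute(
    "SELECT product_name FROM products ORDER BY product_name")]
print(lcd_17, all_names)
```

The trade-off is the one raised in the question's second edit: each new product type means a new table, which stays manageable for a handful of types but not for something on the order of 1000.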
