Performance: joining tables vs. large table with redundant data


Question

Let's say I have a bunch of products. Each product has an id, a price, and a long description made up of multiple paragraphs. Each product also has multiple sku numbers that represent different sizes and colors.

To clarify: product_id 1 has 3 skus and product_id 2 has 5 skus. All of the skus in product 1 share the same price and description. Product 2 has a different price and description than product 1, and all of product 2's skus share product 2's price and description.

I could have one large table with a separate record for each sku. The records would have redundant fields like the long description and price.

Or I could have two tables: one named "products" with product_id, price, and description, and one named "skus" with product_id, sku, color, and size. I would then join the tables on the product_id column.

$query = "SELECT * FROM skus LEFT OUTER JOIN products ON skus.product_id=products.product_id WHERE color='green'";

$query = "SELECT * FROM master_table WHERE color='green'";
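To make the two layouts concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question; the column types, the sample rows, and the choice of SQLite are assumptions made only for illustration, since the question does not name its database. It shows that both queries return the same rows for the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 1: one wide table; price and description repeat for every sku
    CREATE TABLE master_table (
        sku TEXT PRIMARY KEY,
        product_id INTEGER,
        price REAL,
        long_description TEXT,
        color TEXT,
        size TEXT
    );
    -- Option 2: price and description stored once per product
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        price REAL,
        long_description TEXT
    );
    CREATE TABLE skus (
        sku TEXT PRIMARY KEY,
        product_id INTEGER REFERENCES products(product_id),
        color TEXT,
        size TEXT
    );
""")

# Made-up sample data: product 1 has 3 skus, product 2 has 5 skus.
products = [
    (1, 9.99, "Product 1: several paragraphs of text"),
    (2, 19.99, "Product 2: several paragraphs of text"),
]
skus = [
    ("A-1", 1, "green", "S"), ("A-2", 1, "red", "M"), ("A-3", 1, "green", "L"),
    ("B-1", 2, "blue", "S"), ("B-2", 2, "green", "M"), ("B-3", 2, "red", "L"),
    ("B-4", 2, "blue", "XL"), ("B-5", 2, "green", "XL"),
]
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", products)
conn.executemany("INSERT INTO skus VALUES (?, ?, ?, ?)", skus)

# Fill the wide table from the normalised tables so both hold the same facts.
conn.execute("""
    INSERT INTO master_table
    SELECT s.sku, s.product_id, p.price, p.long_description, s.color, s.size
    FROM skus s JOIN products p ON p.product_id = s.product_id
""")

joined = conn.execute("""
    SELECT s.sku, p.price, p.long_description
    FROM skus s LEFT OUTER JOIN products p ON s.product_id = p.product_id
    WHERE s.color = 'green'
    ORDER BY s.sku
""").fetchall()

flat = conn.execute("""
    SELECT sku, price, long_description
    FROM master_table
    WHERE color = 'green'
    ORDER BY sku
""").fetchall()

print(joined == flat, len(joined))
```

The wide table stores each description once per sku, so a change to a product's price or description has to touch every one of its sku rows; the two-table layout changes one row.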

This is a simplified version of my setup. In the end there will be a lot more columns and a lot of products. Which method would have better performance?

So to be more specific: let's say I want to run a LIKE search on the long_description column across all of the skus. I am trying to compare having one table with 5000 long_description values and 5000 skus vs. OUTER JOINing two tables, one with 1000 long_description records and the other with 5000 skus.

Recommended answer

It depends on how those tables are used. To get a definitive answer, you should build both and compare them using representative data sets and system usage.
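One way to run that comparison is sketched below, again with Python's sqlite3 and timeit. It builds the sizes mentioned in the question (1000 products, 5000 skus) and times the LIKE search against both layouts. The filler text, the search term "waterproof", and the `build` helper are hypothetical, and an in-memory SQLite database says little about a production server; the point is only the shape of the measurement, which should be repeated on the real database with real data.

```python
import sqlite3
import timeit

def build(n_products=1000, skus_per_product=5):
    """Create both layouts in one in-memory database with identical data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (
            product_id INTEGER PRIMARY KEY, price REAL, long_description TEXT);
        CREATE TABLE skus (
            sku TEXT PRIMARY KEY, product_id INTEGER, color TEXT, size TEXT);
        CREATE TABLE master_table (
            sku TEXT PRIMARY KEY, product_id INTEGER, price REAL,
            long_description TEXT, color TEXT, size TEXT);
    """)
    colors = ("green", "red", "blue")
    # Every tenth product mentions the (made-up) search term.
    products = [
        (pid, 10.0 + pid,
         f"Description of product {pid}. " + "filler text " * 50
         + ("waterproof" if pid % 10 == 0 else ""))
        for pid in range(1, n_products + 1)
    ]
    skus = [
        (f"{pid}-{n}", pid, colors[n % 3], "M")
        for pid in range(1, n_products + 1)
        for n in range(skus_per_product)
    ]
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", products)
    conn.executemany("INSERT INTO skus VALUES (?, ?, ?, ?)", skus)
    conn.execute("""
        INSERT INTO master_table
        SELECT s.sku, s.product_id, p.price, p.long_description, s.color, s.size
        FROM skus s JOIN products p ON p.product_id = s.product_id
    """)
    conn.commit()
    return conn

JOIN_QUERY = """
    SELECT s.sku FROM skus s
    LEFT OUTER JOIN products p ON s.product_id = p.product_id
    WHERE p.long_description LIKE '%waterproof%'
"""
FLAT_QUERY = """
    SELECT sku FROM master_table
    WHERE long_description LIKE '%waterproof%'
"""

conn = build()
join_rows = conn.execute(JOIN_QUERY).fetchall()
flat_rows = conn.execute(FLAT_QUERY).fetchall()

# Time 20 runs of each query; absolute numbers depend on the machine.
join_time = timeit.timeit(lambda: conn.execute(JOIN_QUERY).fetchall(), number=20)
flat_time = timeit.timeit(lambda: conn.execute(FLAT_QUERY).fetchall(), number=20)
print(f"join: {join_time:.4f}s  flat: {flat_time:.4f}s  rows: {len(join_rows)}")
```

No expected timings are given here on purpose: which layout wins depends on the engine, indexes, row sizes, and caching, which is exactly why the answer recommends measuring.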

The normal approach is to denormalise data only to combat specific performance problems that you are actually having. So in this case my advice would be to default to joining across two tables, and to denormalise to a single table only if you have a performance problem and find that denormalisation fixes it.
