大表性能与带有某些联接的小表性能 [英] A big table performance vs a small table with some joins
问题描述
我想知道我是否有一个大表(50列和5千万条记录),我想使用选择查询,以及是否将我的大表拆分为一个较小的表(20列和5千万条记录),并且有一些联接到一些小表(约5列)中,我想使用相同的选择,从速度上讲,哪种方式更好?
I want to know if I have a big table (50 columns and 50 millions records) and I want to use select query, and if I split my big table to a smaller table (20 columns and 50 millions records) with some joins in some small tables (about 5 columns) and I want to use the same select, which of these manners is better in terms of speed?
例如:
具有巨大表的第一种情况:
First case with a huge table:
select username,password,nationalAdd from baseTBL
第二种情况,表较小,并且有一些联接:
Second case with a smaller table and some joins:
select username,password,nationalAdd from baseTBL
inner join dependTBL1 on baseTBL.user_id=dependTBL1.id
inner join dependTBL2 on baseTBL.national_id=dependTBL2.id
推荐答案
通常的答案是:没有性能上的好处垂直分区。另外,很难从水平分区中获得性能上的好处。
Usually the answer is: no performance benefit to Vertical partitioning. Also, it is rare to get a performance benefit from Horizontal partitioning.
在这里我将介绍垂直分区。
I'll address Vertical here.
- 不要跨列展开数组。例如:家庭/工作/手机号码;多个地址。最好与另一个表建立1:1:1的关系。
- 如果您存在一些很少存在的列,请考虑对其进行垂直分区,但要保留行而不是使用
NULLs
。 - 如果您要让
WHERE
子句在之后触及两个表分区,这确实对性能不利。将所有搜索条件保留在主分区中。 - 如果某列中包含很多重复的值,并且这些值都是冗长的,请考虑对这些值进行规范化。 (这很像垂直分区,但它被认为是不同的东西。)
- 不要过度规范化。示例:将city + state + postal_code + country分解为一个
Location
表,而不是4个表。
- Don't splay an array across columns. Examples: home/work/cell phone numbers; multiple addresses. Better to have a 1:many relationship to another table.
- If you have some columns that are "rarely" present, consider vertical partitioning for them, but leave out the rows instead of having
NULLs
. - If you will be having a
WHERE
clause touching both tables after ther partitioning, this is really bad for performance. Keep all the search criteria in the main partition. - If some column has a lot of repeated values, and the values are 'lengthy', consider 'normalizing' those values. (This works a lot like vertical partitioning, but it is considered a different thingie.)
- Don't over-normalize. Example: break out city+state+postal_code+country into a single
Location
table, not 4 tables.
底线:您的问题太含糊,无法为您提供适用于您的模式的答案。
Bottom line: Your question is too vague to give you the answer that applies to your schema.
这篇关于大表性能与带有某些联接的小表性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!