MySQL Partitioning / Sharding / Splitting - which way to go?


Problem description

We have an InnoDB database that is about 70 GB and we expect it to grow to several hundred GB in the next 2 to 3 years. About 60 % of the data belong to a single table. Currently the database is working quite well as we have a server with 64 GB of RAM, so almost the whole database fits into memory, but we’re concerned about the future when the amount of data will be considerably larger. Right now we’re considering some way of splitting up the tables (especially the one that accounts for the biggest part of the data) and I’m now wondering, what would be the best way to do it.

The options I know of at the moment are

  • Using the MySQL partitioning that ships with version 5.1
  • Using some third-party library that encapsulates the partitioning of the data (e.g. Hibernate Shards)
  • Implementing it ourselves inside our application (see the sketch below this list)
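
As an illustration of the third option, here is a minimal sketch of what application-managed sharding could look like in a Java/JDBC setting. The ShardRouter class, the numeric customer ID used as the shard key, the modulo routing rule and the orders table are hypothetical examples, not part of the actual schema described in the question.

    // Hypothetical sketch of application-managed sharding (option 3).
    // Assumes a numeric customer ID as the shard key and one JDBC DataSource
    // per physical MySQL server; table and column names are made up.
    import javax.sql.DataSource;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.List;

    public class ShardRouter {

        private final List<DataSource> shards; // one entry per MySQL server

        public ShardRouter(List<DataSource> shards) {
            this.shards = shards;
        }

        // All rows belonging to one customer map to the same shard, so
        // single-customer lookups never touch more than one server.
        private DataSource shardFor(long customerId) {
            return shards.get((int) (customerId % shards.size()));
        }

        public long countOrders(long customerId) throws SQLException {
            String sql = "SELECT COUNT(*) FROM orders WHERE customer_id = ?";
            try (Connection con = shardFor(customerId).getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setLong(1, customerId);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return rs.getLong(1);
                }
            }
        }
    }

A library such as Hibernate Shards essentially hides this kind of routing behind the Hibernate API instead of leaving it in application code, while MySQL 5.1 partitioning keeps everything on one server and splits the table internally.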

Our application is built on J2EE and EJB 2.1 (hopefully we’re switching to EJB 3 some day).

What would you recommend?

EDIT (2011-02-11):
Just an update: Currently the size of the database is 380 GB, the data size of our "big" table is 220 GB and the size of its index is 36 GB. So while the whole table does not fit in memory any more, the index does.
The system is still performing fine (still on the same hardware) and we're still thinking about partitioning the data.

EDIT (2014-06-04): One more update: The size of the whole database is 1.5 TB, the size of our "big" table is 1.1 TB. We upgraded our server to a 4 processor machine (Intel Xeon E7450) with 128 GB RAM. The system is still performing fine. What we're planning to do next is putting our big table on a separate database server (we've already done the necessary changes in our software) while simultaneously upgrading to new hardware with 256 GB RAM.

This setup is supposed to last for two years. Then we will either have to finally start implementing a sharding solution or just buy servers with 1 TB of RAM which should keep us going for some time.

EDIT (2016-01-18):

We have since put our big table in its own database on a separate server. Currently the size of this database is about 1.9 TB; the size of the other database (with all tables except for the "big" one) is 1.1 TB.

Current hardware setup:

  • HP ProLiant DL 580
  • 4 x Intel(R) Xeon(R) CPU E7-4830
  • 256 GB RAM

Performance is fine with this setup.

Recommended answer

If you think you're going to be IO/memory bound, I don't think partitioning is going to be helpful. As usual, benchmarking first will help you figure out the best direction. If you don't have spare servers with 64GB of memory kicking around, you can always ask your vendor for a 'demo unit'.

I would lean towards sharding if you don't expect single-query aggregate reporting. I'm assuming you'd shard the whole database and not just your big table: it's best to keep entire entities together. Well, if your model splits nicely, anyway.
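
The single-query aggregate reporting mentioned above is exactly what becomes harder once the data is sharded: a report that used to be one GROUP BY on a single server turns into one query per shard plus a merge step in the application. Here is a minimal sketch of that fan-out, reusing the same hypothetical per-shard DataSources and orders table as the earlier sketch.

    // Hypothetical sketch: aggregate reporting across shards.
    // The same GROUP BY is sent to every shard and the partial results
    // are merged in the application.
    import javax.sql.DataSource;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class CrossShardReport {

        // Total order amount per country, summed over all shards.
        public Map<String, Long> revenueByCountry(List<DataSource> shards) throws SQLException {
            String sql = "SELECT country, SUM(amount) FROM orders GROUP BY country";
            Map<String, Long> totals = new HashMap<>();
            for (DataSource shard : shards) {
                try (Connection con = shard.getConnection();
                     PreparedStatement ps = con.prepareStatement(sql);
                     ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        totals.merge(rs.getString(1), rs.getLong(2), Long::sum);
                    }
                }
            }
            return totals;
        }
    }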
