为多个数据库列建立索引的成本是多少? [英] What is the cost of indexing multiple db columns?

查看:122
本文介绍了为多个数据库列建立索引的成本是多少?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在编写一个带有索引3列的MySQL表的应用程序.我担心在表达到大量记录之后,保存新记录的时间会很慢.请告知如何最好地处理列的索引.

I'm writing an app with a MySQL table that indexes 3 columns. I'm concerned that after the table reaches a significant amount of records, the time to save a new record will be slow. Please inform how best to approach the indexing of columns.

更新

我正在索引一个point_value, user_id和event_id,都必须 用于面向客户的目的.为 例如打棒球得分 按玩家ID和游戏ID.什么会 插入约200个新的成本 在桌子放满后记录一天 连续两个赛季的记录,例如72,000 运行,并在5个季节后,也许 25万条记录?仅适用于 插图,但我期望 在25至200条记录之间插入 一天.

I am indexing a point_value, the user_id, and an event_id, all required for client-facing purposes. For an instance such as scoring baseball runs by player id and game id. What would be the cost of inserting about 200 new records a day, after the table holds records for two seasons, say 72,000 runs, and after 5 seasons, maybe a quarter million records? Only for illustration, but I'm expecting to insert between 25 and 200 records a day.

推荐答案

为最合理的索引(最好是显而易见的,例如CUSTOMERS表中的customer ID列).

Index what seems the most logical (that should hopefully be obvious, for example, a customer ID column in the CUSTOMERS table).

然后运行您的应用程序并定期收集统计信息,以查看数据库的性能. DB2上的RUNSTATS是一个示例,我希望MySQL也有类似的工具.

Then run your application and collect statistics periodically to see how the database is performing. RUNSTATS on DB2 is one example, I would hope MySQL has a similar tool.

当您发现一些经常运行的查询进行全表扫描(或者由于其他原因花费的时间太长)时,并且只有这样,您应该添加更多索引.优化每月一次在午夜运行的查询的效果不好,因此它可以在12:05而不是12:07结束.但是,将面向客户的查询时间从5秒减少到2秒是一个巨大的进步(这仍然太慢,如果可能的话,面向客户的查询应该在1秒以内).

When you find some oft-run queries doing full table scans (or taking too long for other reasons), then, and only then, should you add more indexes. It does little good to optimise a once-a-month-run-at-midnight query so it can finish at 12:05 instead of 12:07. However, it's a huge improvement to reduce a customer-facing query from 5 seconds down to 2 seconds (that's still too slow, customer-facing queries should be sub-second if possible).

更多索引往往会降低插入速度并加快查询速度.因此,这始终是一种平衡行为.这就是为什么您仅在针对问题的特定响应中添加索引.任何其他事情都是过早的优化,应该避免.

More indexes tend to slow down inserts and speed up queries. So it's always a balancing act. That's why you only add indexes in specific response to a problem. Anything else is premature optimization and should be avoided.

此外,请定期重新访问已有的索引,以查看是否仍需要它们.可能是导致您添加这些索引的查询不再足够频繁地运行以保证其安全.

In addition, revisit the indexes you already have periodically to see if they're still needed. It may be that the queries that caused you to add those indexes are no longer run often enough to warrant it.

说实话,除非您计划存储非常多的行,否则我不认为对表上的三列建立索引会给您带来麻烦:-)-索引非常有效.

To be honest, I don't believe indexing three columns on a table will cause you to suffer unless you plan on storing really huge numbers of rows :-) - indexing is pretty efficient.

编辑后显示以下内容:

我正在索引point_valueuser_idevent_id,这些都是面向客户端的目的.例如,根据球员ID和游戏ID对棒球计分.在一张桌子拥有两个赛季的记录(例如72,000次运行)并且在五个季节之后,可能有25万条记录之后,每天插入约200条新记录的成本是多少?仅出于说明目的,但我希望每天插入25到200条记录.

I am indexing a point_value, the user_id, and an event_id, all required for client-facing purposes. For an instance such as scoring baseball runs by player id and game id. What would be the cost of inserting about 200 new records a day, after the table holds records for two seasons, say 72,000 runs, and after 5 seasons, maybe a quarter million records? Only for illustration, but I'm expecting to insert between 25 and 200 records a day.

我的回答是,每天200条记录对于数据库来说是一个很小的价值,对于这三个索引,您绝对不需要担心.

My response is that 200 records a day is an extremely small value for a database, you definitely won't have anything to worry about with those three indexes.

就在本周,我将价值一天的交易导入到一个工作的数据库表中,其中包含210万条记录(我们每天从25台不同的机器上至少每秒获得一笔交易).而且它具有四个单独的组合键,比您的三个单独键强度更高.

Just this week, I imported a days worth of transactions into one of our database tables at work and it contained 2.1 million records (we get at least one transaction per second across the entire day from 25 separate machines). And it has four separate composite keys which is somewhat more intensive than your three individual keys.

现在可以肯定的是,它位于DB2数据库上,但是我无法想象IBM比MySQL的人好得多,因为MySQL只能处理少于0.01%的DB2负载.

Now granted, that's on a DB2 database but I can't imagine IBM are so much better than the MySQL people that MySQL can only handle less than 0.01% of the DB2 load.

这篇关于为多个数据库列建立索引的成本是多少?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆