如何在大数据库中优化数据库这个查询? [英] How to optimize database this query in large database?

查看:210
本文介绍了如何在大数据库中优化数据库这个查询?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

查询

Query

SELECT id FROM `user_tmp` 
WHERE  `code` = '9s5xs1sy' 
  AND  `go` NOT REGEXP 'http://www.xxxx.example.com/aflam/|http://xx.example.com|http://www.xxxxx..example.com/aflam/|http://www.xxxxxx.example.com/v/|http://www.xxxxxx.example.com/vb/'  
  AND check='done'  
  AND  `dataip` <1319992460
ORDER BY id DESC 
LIMIT 50



返回:

MySQL returns:

Showing rows 0 - 29 ( 50 total, Query took 21.3102 sec) [id: 2622270 - 2602288]

查询耗时21.3102秒

if i remove

if i remove

AND dataip <1319992460

AND dataip < 1319992460

MySQL returns

MySQL返回

Query took 0.0859 sec

查询耗时0.0859秒

如果没有数据,MySQL返回

MySQL returned an empty result set (i.e. zero rows). ( Query took 21.7332 sec )

Query took 21.7332 sec 

查询耗时21.7332秒

解释计划:

Explain plan:

  SQL query: Explain SELECT * FROM `user_tmp` WHERE `code` = '93mhco3s5y' AND `too` NOT REGEXP 'http://www.10neen.com/aflam/|http://3ltool.com|http://www.10neen.com/aflam/|http://www.10neen.com/v/|http://www.m1-w3d.com/vb/' and checkopen='2010' and `dataip` <1319992460 ORDER BY id DESC LIMIT 50;
    Rows: 1
    id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
    1   SIMPLE  user_tmp    index   NULL    PRIMARY     4   NULL    50  Using where

使用数据库的示例


CREATE TABLE IF NOT EXISTS user_tmp id int(9)NOT NULL
AUTO_INCREMENT, ip text NOT NULL, dataip bigint(20)NOT NULL,
ref text NOT NULL, click int(20)NOT NULL, code text NOT
NULL, code> too text NOT NULL, name text NOT NULL, b $ b text NOT NULL, contry text NOT NULL, vOperation text NOT NULL,
vBrowser text NOT NULL, iconOperation text NOT NULL,

iconBrowser text NOT NULL,

CREATE TABLE IF NOT EXISTS user_tmp ( id int(9) NOT NULL AUTO_INCREMENT, ip text NOT NULL, dataip bigint(20) NOT NULL, ref text NOT NULL, click int(20) NOT NULL, code text NOT NULL, too text NOT NULL, name text NOT NULL, checkopen text NOT NULL, contry text NOT NULL, vOperation text NOT NULL, vBrowser text NOT NULL, iconOperation text NOT NULL,
iconBrowser text NOT NULL,



  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=4653425 ;

-

INSERT INTO `user_tmp` (`id`, `ip`, `dataip`, `ref`, `click`, `code`, `too`, `name`, `checkopen`, `contry`, `vOperation`, `vBrowser`, `iconOperation`, `iconBrowser`) VALUES
(1, '54.125.78.84', 1319506641, 'http://xxxx.example.com/vb/showthread.php%D8%AA%D8%AD%D9%85%D9%8A%D9%84-%D8%A7%D8%BA%D9%86%D9%8A%D8%A9-%D8%A7%D9%84%D8%A8%D9%88%D9%85-giovanni-marradi-lovers-rendezvous-3cd-1999-a-155712.html', 0, '4mxxxxx5', 'http://www.xxx.example.com/aflam/', 'xxxxe', '2010', 'US', 'Linux', 'Chrome 12.0.742 ', 'linux.png', 'chrome.png');

我想要正确的方式来执行查询和优化数据库

I want the correct way to do the query and optimize database

推荐答案

我将假设你使用的是mysql< = 5.1

I will assume you are using mysql <= 5.1

上面的答案分为两个基本类别:
1.您使用的列类型错误
2.您需要索引

The answers above fall into two basic categories: 1. You are using the wrong column type 2. You need indexes

我将处理每个都与性能相关最终我会提出你的问题:

I will deal with each as both are relevant for performance which is ultimately what I take your questions to be about:


  1. 列类型
    bigint / int或int / char对于dataip问题基本上与你的问题无关。基本问题与索引策略有关。但是,当整体考虑性能时,您使用MyISAM作为此表的引擎的事实导致我询问您是否真的需要文本列类型。如果你有短(小于255个)字符列,那么将它们固定长度的列将很可能提高性能。请注意,如果任何一列的长度可变(varchar,text等),那么这不值得更改任何列。

  1. Column Types The difference between bigint/int or int/char for the dataip question is basically not relevant to your issue. The fundamental issue has more to do with index strategy. However when considering performance holistically, the fact that you are using MyISAM as your engine for this table leads me to ask if you really need "text" column types. If you have short (less than 255 say) character columns, then making them fixed length columns will most likely increase performance. Keep in mind that if any one column is of variable length (varchar, text, etc) then this is not worth changing any of them.

垂直分区
这里要记住的事实是,即使你只是请求 id 列从磁盘IO和内存的角度你得到整行回。由于这么多行是文本,这可能意味着大量的数据。不用于查找用户或不经常访问的任何这些行可以移动到另一个表中,其中外键具有保持关系1:1的唯一键。

Vertical Partitioning The fact to keep in mind here is that even though you are only requesting the id column from the standpoint of disk IO and memory you are getting the entire row back. Since so many of the rows are text, this could mean a massive amount of data. Any of these rows that are not used for lookups of users or are not often accessed could be moved into another table where the foreign key has a unique key placed on it keeping the relationship 1:1.

索引策略
最可能的问题是如上所述的索引。添加AND dataip < 1319992460条件导致当前情况的原因是它强制执行全表扫描。

Index Strategy Most likely the problem is simply indexing as is noted above. The reason that your current situation is caused by adding the "AND dataip <1319992460" condition is that it forces a full table scan.

如上所述,将where子句中的所有列放在单个复合索引中将有所帮助。

As stated above placing all the columns in the where clause in a single, composite index will help. The order of the columns in the index will no matter so long as all of them appear in the where clause.

但是,对于其他查询,顺序可能非常重要。一个快速示例是由(colA,colB)构成的索引。具有where colA ='foo'的查询将使用此索引。但是具有where colB ='bar'的查询将不会,因为colB不是索引定义中最左侧的列。因此,如果有其他查询在某些组合中使用这些列,那么值得最小化在表上创建的索引的数量。这是b / c每个索引增加写入的成本,并使用磁盘空间。写入是昂贵的b / c必需的磁盘活动。不要让它们更贵。

However, the order could matter a great deal for other queries. A quick example would be an index made of (colA, colB). A query with "where colA = 'foo'" will use this index. But a query with "where colB = 'bar'" will not because colB is not the left most column in the index definition. So, if you have other queries that use these columns in some combination it is worth minimizing the number of indexes created on the table. This is b/c every index increases the cost of a write and uses disk space. Writes are expensive b/c of necessary disk activity. Don't make them more expensive.

这篇关于如何在大数据库中优化数据库这个查询?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆