MySQL InnoDB performance optimization and indexing

Problem description

I have two databases and need to link information between two large tables (more than 3M rows each, and growing continuously). The first database has a table 'pages' that stores various information about web pages, including the URL of each one. The 'URL' column is a varchar(512) and has no index.

The second database has a table 'urlHops' defined as:

CREATE TABLE urlHops (
  dest varchar(512) NOT NULL,
  src varchar(512) DEFAULT NULL,
  timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY dest_key (dest),
  KEY src_key (src)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Now I basically need to run queries like this one efficiently:

select p.id, p.URL from db1.pages p, db2.urlHops u where u.src = p.URL and u.dest = ?

At first I thought of adding an index on pages(URL). But it is a very long column, and I already issue a lot of INSERTs and UPDATEs on the same table (far more than the number of SELECTs that would use this index).

Other possible solutions I have thought of:
- Add a column to 'pages' that stores the MD5 hash of the URL, and index it. That way I could query by the MD5 of the URL and benefit from an index on a much smaller column.
- Add another table that contains only the page id and the page URL, with both columns indexed. This may be a waste of space, though, and its only advantage is that it would not slow down the inserts and updates I run on 'pages'.
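The first option above (an indexed hash column) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the column name url_md5, the index name pages_url_md5, and the helper functions are hypothetical, not part of the original schema. In MySQL the 16-byte digest would typically go in a BINARY(16) column.

```python
import hashlib
import sqlite3


def url_hash(url: str) -> bytes:
    """Return the 16-byte MD5 digest of a URL, used only as a compact lookup key."""
    return hashlib.md5(url.encode("utf-8")).digest()


# In-memory SQLite database standing in for the two MySQL databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (
        id      INTEGER PRIMARY KEY,
        URL     TEXT NOT NULL,
        url_md5 BLOB NOT NULL          -- hypothetical column: MD5 digest of URL
    );
    CREATE INDEX pages_url_md5 ON pages (url_md5);

    CREATE TABLE urlHops (
        dest TEXT NOT NULL,
        src  TEXT
    );
    CREATE INDEX dest_key ON urlHops (dest);
""")


def insert_page(url: str) -> int:
    """Insert a page together with its URL digest and return the new id."""
    cur = conn.execute(
        "INSERT INTO pages (URL, url_md5) VALUES (?, ?)",
        (url, url_hash(url)),
    )
    return cur.lastrowid


def pages_leading_to(dest: str) -> list:
    """Find (id, URL) of pages whose URL appears as src of a hop to dest."""
    srcs = [
        row[0]
        for row in conn.execute(
            "SELECT src FROM urlHops WHERE dest = ? AND src IS NOT NULL",
            (dest,),
        )
    ]
    found = []
    for src in srcs:
        # Look up by the short indexed digest; the extra URL comparison
        # guards against two different URLs sharing the same digest.
        found.extend(
            conn.execute(
                "SELECT id, URL FROM pages WHERE url_md5 = ? AND URL = ?",
                (url_hash(src), src),
            ).fetchall()
        )
    return sorted(found)
```

The trade-off is that every INSERT or UPDATE of a URL must also compute and store the digest, but the index it maintains covers 16 bytes per row instead of up to 512 characters.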

I don't want to slow down the inserts and updates, but I also want to be able to query by URL efficiently. Any advice? My primary concern is performance; wasting some disk space is not a problem if needed.

Thanks, regards,

Davide

Recommended answer

If pages and URLs have a 1-to-1 relationship and that table has a unique id (a primary key?), you could store that id in the src and dest fields of the urlHops table instead of the full URL.

This would make indexing and joins much more efficient, since the indexed values would be short ids rather than strings of up to 512 characters.
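A minimal sketch of that suggestion, again using Python's sqlite3 with an in-memory database as a stand-in for MySQL. It assumes every URL that appears in a hop already has a row in 'pages'. The column names src_id and dest_id and the helper functions are hypothetical and chosen for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (
        id  INTEGER PRIMARY KEY,
        URL TEXT NOT NULL
    );

    -- urlHops stores page ids instead of full URLs (hypothetical columns).
    CREATE TABLE urlHops (
        dest_id   INTEGER NOT NULL,
        src_id    INTEGER,
        timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX dest_key ON urlHops (dest_id);
    CREATE INDEX src_key  ON urlHops (src_id);
""")


def add_page(url: str) -> int:
    """Insert a page and return its id."""
    return conn.execute("INSERT INTO pages (URL) VALUES (?)", (url,)).lastrowid


def add_hop(dest_id: int, src_id=None) -> None:
    """Record a hop from src_id to dest_id (src_id may be unknown)."""
    conn.execute(
        "INSERT INTO urlHops (dest_id, src_id) VALUES (?, ?)",
        (dest_id, src_id),
    )


def sources_of(dest_id: int) -> list:
    """Return (id, URL) of every page that hops to dest_id.

    The join runs on the primary key of pages, so no index on the
    long URL column is required.
    """
    return conn.execute(
        """
        SELECT p.id, p.URL
        FROM urlHops u
        JOIN pages p ON p.id = u.src_id
        WHERE u.dest_id = ?
        ORDER BY p.id
        """,
        (dest_id,),
    ).fetchall()
```

With this layout the 'pages' table needs no additional index for the join, so its INSERT and UPDATE cost stays as it is; the cost moves to resolving a URL to its id once, when a hop is recorded.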
