范围查询的有效数据模型 [英] Efficient data model for range queries
问题描述
我正在努力考虑一种有效的模型来描述IPv4地址数据.我希望能够在MySQL内的数据集上执行"whois"类型查找.目前我有这个:
I'm struggling to think of an efficient model to describe IPv4 address data. I want to be able to perform a 'whois' type lookup on a dataset within MySQL. Currently I have this:
CREATE TABLE inetnum (
`from_ip` int(11) unsigned NOT NULL,
`to_ip` int(11) unsigned NOT NULL,
`netname` varchar(40) default NULL,
`ip_txt` varchar(60) default NULL,
`descr` varchar(60) default NULL,
`country` varchar(2) default NULL,
`recurse_limit` int(11) NOT NULL default '0',
`unexpected` int(11) NOT NULL default '0',
`rir` enum('APNIC','AFRINIC','ARIN','RIPE','LACNIC') NOT NULL default 'RIPE',
PRIMARY KEY (`from_ip`,`to_ip`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii;
我想这样查询:
SELECT *
FROM inetnum
WHERE INET_ATON('192.168.0.1') BETWEEN from_ip AND to_ip;
但是由于地址范围的上限和下限位于不同的字段中,因此会导致全表扫描:
But because the upper and lower bounds of the address range are held in different fields, this results in a full table scan:
mysql> EXPLAIN SELECT * FROM `inetnum` WHERE INET_ATON('192.168.0.1') BETWEEN from_ip AND to_ip;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | inetnum | ALL | NULL | NULL | NULL | NULL | 3800440 | Using where |
+----+-------------+---------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
(并且我确定有人会指出-不是因为INET_ATON函数-使用文字整数没有区别,使用< = to_ip AND> = from_ip也没有区别).
(and as I'm sure someone will try to point out - not it's not because of the INET_ATON function - using a literal integer makes no difference, nor does using <=to_ip AND >=from_ip).
当前正在MySQL 5.0.67上运行.我只有有限的范围来更改/升级DBMS.
This is currently running on MySQL 5.0.67. I only have limited scope for changing/upgrading the DBMS.
推荐答案
I found a solution (using spatial data types) here on Stack overflow - but note that the solution is not the accepted answer - it's the one from Quassnoi
请投票以重复结束我的问题.
Please vote to close my question as a duplicate.
但是对于任何在家中尝试此操作的人来说,由于我已经有了一个数据表,这会带来额外的复杂性,因此我使用的配方略有不同:
But for anyone trying this at home - there was an additional complication as I already had a table of data - hence I'm using a slightly different recipe:
mysql> alter table inetnum add column netrange linestring;
Query OK, 3800440 rows affected (22.41 sec)
Records: 3800440 Duplicates: 0 Warnings: 0
mysql> create spatial index rangelookup on inetnum(netrange);
ERROR 1252 (42000): All parts of a SPATIAL index must be NOT NULL
mysql> UPDATE inetnum
-> SET netrange=GeomFromText(CONCAT('LINESTRING(', from_ip, ' -1, ', to_ip, ' 1)'))
-> ;
Query OK, 3800440 rows affected (57.42 sec)
Rows matched: 3800440 Changed: 3800440 Warnings: 0
mysql> create spatial index rangelookup on inetnum(netrange);
ERROR 1252 (42000): All parts of a SPATIAL index must be NOT NULL
mysql> alter table inetnum modify netrange linestring not null;
Query OK, 3800440 rows affected (35.84 sec)
Records: 3800440 Duplicates: 0 Warnings: 0
mysql> create spatial index rangelookup on inetnum(netrange);
Query OK, 3800440 rows affected (1 min 19.69 sec)
Records: 3800440 Duplicates: 0 Warnings: 0
mysql> SELECT COUNT(*)
-> FROM inetnum
-> WHERE INET_ATON('88.104.22.241') BETWEEN from_ip AND to_ip;
+----------+
| COUNT(*) |
+----------+
| 3 |
+----------+
1 row in set (1.19 sec)
mysql> SELECT COUNT(*)
-> FROM inetnum
-> WHERE MBRCONTAINS(netrange, GEOMFROMTEXT(CONCAT('POINT(', INET_ATON('88.104.22.241'), ' 0)')));
+----------+
| COUNT(*) |
+----------+
| 10 |
+----------+
1 row in set (0.06 sec)
这篇关于范围查询的有效数据模型的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!