Summary of MySQL detail records matching by IP address ranges - mySQL Jedi Knight required

Question

So, I have to draw upon all the powers of the greatest mySQL minds that SO has to offer. I have to summarize detail records based on the IP address in each record. Here's the scenario:

In short, we have consortiums that want to know: "Which schools within my consortium watched which videos how many times?" In SQL terms, it amounts to COUNTing the detail records, grouped by the IP range each record falls into.

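  1. We have several university consortiums, each with a handful of member schools.
  2. Each school within a consortium uses different IP ranges to access the videos we serve to those schools.
  3. IP ranges are specified with wildcards, so each school has specified something like "100.200.35.x, 100.201.x.x, 100.202.39.50, etc.", with an average of 10 or 15 ranges per school.
  4. The raw text log files to be summarized are already in a database (one row per log entry) and contain the actual IP address that accessed the video file.
  5. There are hundreds of millions of detail records, so I fully expect this to be a slow process that runs for quite a while.
  6. A PHP script could "explode" the wildcards into the individual IPs they represent, but I fear that this would end up being the final answer and might take weeks to run.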

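(For simplicity, I refer only to the video filename that was accessed and count its log entries, but in reality all the details such as start/stop/duration etc. are there and will eventually be part of this solution.)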

Consortium records look something like this (all table designs except the log details are open to suggestion):

| id|consortium   |
| 10|Ivy League   |
| 20|California   |

And School/IP records something like this:

|  id|school     |consortium_id|
| 101|Harvard    |10           |
| 102|Yale       |10           |
| 103|UCLA       |20           |
| 104|Berkeley   |20           |

| id|school_id|ip_range         |
|  1| 101     |100.200.x.x      |
|  2| 101     |100.201.65.x     |
|  3| 101     |100.202.39.50    |
|  4| 101     |100.202.39.51    |
|  5| 101     |100.200.x.x      |
|  6| 101     |100.201.65.x     |
|  7| 101     |100.202.39.50    |

And detail records something like this:

|session     |ip_address     |filename          |
|560554790925|100.202.390.500|history101.mp4    |
|406417611526|43.22.90.5     |newsreel.mp4      |
|650423700223|100.202.39.50  |history101.mp4    |
|650423700223|100.202.50.12  |science101.mp4    |
|513057324209|100.202.39.56  |history101.mp4    |

I like to think I'm pretty handy with mySQL, but this one is stretching it, and I am hoping that there's a spectacular function or set of steps that someone might offer.

Recommended answer

With your existing data structure, you could do string matching as follows (but it's not very efficient):

SELECT   schools.school, detail.filename, COUNT(*)
FROM     schools
    JOIN ipranges ON schools.id = ipranges.school_id
    JOIN detail   ON detail.ip_address LIKE REPLACE(ipranges.ip_range, 'x', '%')
WHERE    schools.consortium_id = ?
GROUP BY schools.school, detail.filename
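
To make the string-matching idea concrete outside the database, here is a minimal Python sketch of what `LIKE REPLACE(ip_range, 'x', '%')` does for each pair of range and log row. The helper names `range_to_regex` and `ip_matches` are made up for illustration, and the sample values are taken from the question's tables; this is not part of the answer's SQL.

```python
import re


def range_to_regex(ip_range):
    """Turn a wildcard range such as '100.200.x.x' into a compiled regex.

    Mirrors REPLACE(ip_range, 'x', '%') followed by LIKE: every 'x' becomes
    "match any run of characters"; everything else must match literally.
    """
    literal_parts = [re.escape(part) for part in ip_range.split("x")]
    return re.compile(".*".join(literal_parts))


def ip_matches(ip_address, ip_range):
    # fullmatch anchors both ends, as LIKE does when the pattern has no
    # leading or trailing '%'.
    return range_to_regex(ip_range).fullmatch(ip_address) is not None


# Illustrative values taken from the question's sample tables.
ranges = ["100.200.x.x", "100.201.65.x", "100.202.39.50"]
addresses = ["100.202.39.50", "43.22.90.5", "100.202.39.56"]

# Keep every address that matches at least one range.
matched = [ip for ip in addresses if any(ip_matches(ip, r) for r in ranges)]
print(matched)
```

Every range has to be tested as a string pattern against every log row, which is probably why the answer describes this approach as not very efficient.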

A better way would be to store your IP ranges as a network address and prefix length:

ALTER TABLE ipranges
  ADD COLUMN network INT UNSIGNED,
  ADD COLUMN prefix  TINYINT;
UPDATE ipranges SET
  network = INET_ATON(REPLACE(ip_range, 'x', '0')),
  prefix  = 32 - 8*(CHAR_LENGTH(ip_range) - CHAR_LENGTH(REPLACE(ip_range, 'x', '')));
ALTER TABLE ipranges
  DROP COLUMN ip_range;

ALTER TABLE detail
  ADD COLUMN ip_address_new INT UNSIGNED;
UPDATE detail SET
  ip_address_new = INET_ATON(ip_address);
ALTER TABLE detail
  DROP COLUMN ip_address,
  CHANGE ip_address_new ip_address INT UNSIGNED;
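
The arithmetic in the `UPDATE ipranges` statement can be checked by hand with a small Python sketch. It assumes, as the prefix formula implicitly does, that wildcards only appear as trailing octets (`100.200.x.x`, never `100.x.5.x`); the function name `wildcard_to_network` is hypothetical, and `ipaddress.IPv4Address` stands in for `INET_ATON`.

```python
import ipaddress
from typing import Tuple


def wildcard_to_network(ip_range: str) -> Tuple[int, int]:
    """Convert a range like '100.201.65.x' to (network integer, prefix length)."""
    octets = ip_range.split(".")
    wildcards = octets.count("x")
    # Assumption: wildcards are trailing octets only, otherwise the range
    # cannot be expressed as a single network/prefix pair.
    if len(octets) != 4 or octets[4 - wildcards:] != ["x"] * wildcards:
        raise ValueError("unsupported range: " + ip_range)
    # Same arithmetic as the UPDATE: each 'x' removes 8 bits from the prefix.
    prefix = 32 - 8 * wildcards
    # Replace each 'x' with 0, then convert the dotted quad to an integer,
    # which is what INET_ATON(REPLACE(ip_range, 'x', '0')) does.
    dotted = ".".join("0" if octet == "x" else octet for octet in octets)
    network = int(ipaddress.IPv4Address(dotted))
    return network, prefix


for sample in ("100.200.x.x", "100.201.65.x", "100.202.39.50"):
    print(sample, wildcard_to_network(sample))
```

A range with a wildcard in the middle raises `ValueError` here; in the SQL such a row would silently get a prefix that does not describe it, so it may be worth checking the existing data for that case before dropping the `ip_range` column.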

Then it would merely be a case of performing some bit comparisons:

SELECT   schools.school, detail.filename, COUNT(*)
FROM     schools
    JOIN ipranges ON schools.id = ipranges.school_id
    JOIN detail   ON detail.ip_address & ~((1 << 32 - ipranges.prefix) - 1)
                   = ipranges.network
WHERE    schools.consortium_id = ?
GROUP BY schools.school, detail.filename
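
As a rough illustration of the bit comparison in the join condition, the following Python sketch applies the same mask to a few in-memory rows and counts matches per school and filename. The rows are illustrative values loosely based on the question's tables (the `100.200.7.9` entry is invented to exercise a wildcard range), and `in_network` is a hypothetical helper name.

```python
import ipaddress
from collections import Counter


def to_int(dotted):
    # Equivalent of INET_ATON for a valid dotted-quad address.
    return int(ipaddress.IPv4Address(dotted))


def in_network(ip_int, network, prefix):
    # Same mask as the SQL: clear the low (32 - prefix) host bits, then
    # compare what is left with the stored network address.
    mask = ~((1 << (32 - prefix)) - 1) & 0xFFFFFFFF
    return (ip_int & mask) == network


# (school, network, prefix) rows, as the converted ipranges table would hold.
ranges = [
    ("Harvard", to_int("100.200.0.0"), 16),
    ("Harvard", to_int("100.201.65.0"), 24),
    ("Harvard", to_int("100.202.39.50"), 32),
]

# (ip_address, filename) rows standing in for the detail table.
details = [
    ("43.22.90.5", "newsreel.mp4"),
    ("100.202.39.50", "history101.mp4"),
    ("100.202.50.12", "science101.mp4"),
    ("100.200.7.9", "history101.mp4"),  # invented row inside 100.200.x.x
]

# Nested loops play the role of the JOIN; Counter plays the role of
# GROUP BY school, filename with COUNT(*).
counts = Counter(
    (school, filename)
    for ip, filename in details
    for school, network, prefix in ranges
    if in_network(to_int(ip), network, prefix)
)
print(counts)
```

As in the SQL join, a log row is counted once per matching range, so overlapping or duplicated range rows for the same school would inflate the totals.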
