分组和更新大型数据库表 [英] Grouping and update large database table

查看:40
本文介绍了分组和更新大型数据库表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含 1.61 亿行的大表.表结构为entryid(varchar), dater(date), region(char), detail(varchar), views(int),前四列一起作为主键.我想做以下处理:

I have a huge table containing 161 million rows. The table structure is entryid(varchar), dater(date), region(char), detail(varchar), views(int), and the first four columns are together as the primary key. I want to do the following processing:

  1. detail 包括 URL,例如 www.google.comwww.google.cawww.google.ukwww.facebook.comfacebook.com 等.我想对来自同一域的这些行进行分组,例如,使用 LIKE "%.google.%" 对于 googleLIKE "%.facebook.%" OR LIKE "facebook.%" 对于 facebook 已经足够好了.但是如果更新行,可能会有重复的键,如何使用ON DUPLICATE KEY UPDATE来总结views?除了 LIKE 之外,还有没有更快的方法?

  1. detail includes URL such as www.google.com, www.google.ca, www.google.uk, www.facebook.com, facebook.com, etc. I want to group those rows from the same domain, for example, using LIKE "%.google.%" for google and LIKE "%.facebook.%" OR LIKE "facebook.%" for facebook are good enough. But if update the rows, there might be duplicate key, how to use ON DUPLICATE KEY UPDATE to sum the views? And is there a faster way other than LIKE?

我想删除 region 列并对行进行分组并对 views 求和并插入到新表中.如何在一个命令中做到这一点?

I want to remove the region column and group the rows and sum the views and insert into a new table. How to do that in one command?

我想按月对行进行分组(并更新 views).

I want to group the rows (and update the views) by month.

推荐答案

试试这个,对于 MYSQL,假设 每一 行,detail 字段看起来像 www.something.com.

Try this, for MYSQL, assuming that for every row, the detail field looks like www.something.com.

SELECT
   MONTH(dater),
   SUBSTRING_INDEX(SUBSTRING_INDEX(detail, '.', 2), -1) as url,
   sum(views)
from table
group by 
   MONTH(dater),
   SUBSTRING_INDEX(SUBSTRING_INDEX(detail, '.', 2), -1)

或者您可以使用子查询使其看起来更简洁:

Or you could use a subquery to make it look a bit cleaner:

SELECT mth, url, sum(views) from
(
SELECT
   MONTH(dater) as mth,
   SUBSTRING_INDEX(SUBSTRING_INDEX(detail, '.', 2), -1) as url,
   views
from table) subq
group by 
   mth, url

这篇关于分组和更新大型数据库表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆