High Volume Updating Counters Mysql


Question

I have a site which contains a lot of counter tables: favorites, follows, upvotes, downvotes, and so on.

Now at each click I am updating/inserting one row in those tables.

Favorites table:

user_id|obj_id|time

Follows table:

user_id|obj_id|time

At each click I have to do an update/insert, i.e. 1 click = 1 ajax request to the server = 1 update/insert. This can cause a lot of calls to the database.
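
The per-click write described above can be done as a single upsert statement. A minimal sketch, using an in-memory SQLite database as a stand-in for MySQL (SQLite's `ON CONFLICT ... DO UPDATE` plays the role of MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`; the table layout follows the question):

```python
import sqlite3
import time

# In-memory SQLite stands in for MySQL here; in MySQL the equivalent clause
# is INSERT ... ON DUPLICATE KEY UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE favorites (
        user_id INTEGER NOT NULL,
        obj_id  INTEGER NOT NULL,
        time    INTEGER NOT NULL,
        PRIMARY KEY (user_id, obj_id)
    )
""")

def record_click(user_id, obj_id):
    """One ajax request = one upsert: insert the row, or refresh its timestamp."""
    conn.execute(
        """INSERT INTO favorites (user_id, obj_id, time)
           VALUES (?, ?, ?)
           ON CONFLICT (user_id, obj_id) DO UPDATE SET time = excluded.time""",
        (user_id, obj_id, int(time.time())),
    )
    conn.commit()  # this commit is the per-click cost the question is about

record_click(1, 42)
record_click(1, 42)  # same (user, obj) pair again: updates, does not duplicate
print(conn.execute("SELECT COUNT(*) FROM favorites").fetchone()[0])  # → 1
```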

Doing the maths (as far as I know) and taking a low estimate:

1 click = 1 request
1 million clicks = 1 million requests
10 million clicks/day

= 10 million requests/day
≈ 417k requests/hour
≈ 6,944 requests/min
≈ 115 requests/sec
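
The ~115 requests/sec figure that the answer below builds on corresponds to roughly 10 million clicks per day; a quick arithmetic check:

```python
def rates(clicks_per_day):
    """Break a daily click volume into hourly / per-minute / per-second rates."""
    per_hour = clicks_per_day / 24
    per_min = per_hour / 60
    per_sec = per_min / 60
    return per_hour, per_min, per_sec

per_hour, per_min, per_sec = rates(10_000_000)
print(f"{per_hour:,.0f}/hour  {per_min:,.0f}/min  {per_sec:,.1f}/sec")
# → 416,667/hour  6,944/min  115.7/sec
```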

Is there any better way I can do this?

Answer

Have you split the client off onto a separate machine from the server? That is a first, minor, step in scaling.

Do you have replication, with read-only queries sent to the slaves? That can allow for unlimited read scaling. (But it does not address the UPDATE question, other than to lighten the load on the master.)
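
The read/write split can live in a small routing layer. A minimal sketch, where the connection objects are placeholder strings rather than real MySQL connections:

```python
import itertools

class ReplicatedPool:
    """Route writes to the master and spread read-only queries across slaves.
    The "connections" here are placeholders for real MySQL connections."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def conn_for(self, sql):
        # Crude classification: only plain SELECTs are safe on a replica.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master

pool = ReplicatedPool(master="db-master", slaves=["db-slave-1", "db-slave-2"])
print(pool.conn_for("SELECT COUNT(*) FROM favorites"))  # → db-slave-1
print(pool.conn_for("UPDATE favorites SET time = 0"))   # → db-master
```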

115 IOPs will pretty much saturate a single spinning disk. innodb_flush_log_at_trx_commit defaults to 1, which leads to at least 1 IOP per transaction. Some temporary solutions (until your traffic grows by another 10x)...

SSDs -- maybe 1000 IOPs.

Batch the updates (as mentioned by @N. B.). This cuts the number of "flushes" by 100x.
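
A sketch of the batching idea: accumulate clicks in memory and flush every N of them as one statement instead of N statements. The flush callback and the threshold of 100 are illustrative choices, not part of the original answer:

```python
from collections import Counter

pending = Counter()   # (user_id, obj_id) -> click count accumulated so far
FLUSH_EVERY = 100     # illustrative threshold; a timer works too

def click(user_id, obj_id, flush):
    """Hot path: count in memory; hit the database only once per batch."""
    pending[(user_id, obj_id)] += 1
    if sum(pending.values()) >= FLUSH_EVERY:
        # In MySQL this would be one multi-row INSERT ... ON DUPLICATE KEY UPDATE.
        flush(list(pending.items()))
        pending.clear()

flushes = []
for i in range(250):
    click(i % 5, 42, flushes.append)
print(len(flushes))  # → 2  (flushed at clicks 100 and 200; 50 still pending)
```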

innodb_flush_log_at_trx_commit = 2 -- to virtually eliminate the flushes (at some loss of durability).
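
In my.cnf terms this is a one-line change; the comments summarize the documented trade-off:

```ini
# /etc/mysql/my.cnf (excerpt)
[mysqld]
# 1 (default): flush the redo log to disk at every commit -- safest, ~1 IOP per txn.
# 2: write at commit but flush to disk only about once per second; a server
#    crash can lose up to roughly the last second of transactions.
innodb_flush_log_at_trx_commit = 2
```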

But -- even if you can do the UPDATEs fast enough, don't you also need to read the values? That is, there will be contention. How many SELECTs on the same table are you doing? 100/sec might be OK; 1000/sec might cause so much interference that it won't work.

How big is the table? For any of this to work, it needs to be small enough to be cached all the time.

Redis is another approach -- capture the updates there, then continually pull out the accumulated counts and do the needed UPDATEs in MySQL.
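
The capture/drain cycle this describes can be sketched as follows. A plain dict emulates the in-memory store here so the example is self-contained; against real Redis the capture would be `HINCRBY` and the drain `HGETALL` + `DEL`:

```python
counters = {}  # stands in for a Redis hash

def capture(key):
    """Hot path: one in-memory increment per click, no database I/O."""
    counters[key] = counters.get(key, 0) + 1

def drain():
    """Worker path: take everything accumulated so far and reset the store."""
    batch = dict(counters)
    counters.clear()
    return batch  # caller applies: UPDATE ... SET count = count + n per key

for _ in range(3):
    capture("upvotes:obj:42")
capture("favorites:obj:7")

batch = drain()
print(batch)     # → {'upvotes:obj:42': 3, 'favorites:obj:7': 1}
print(counters)  # → {}
```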

Sharding -- this is where you split the data up across multiple machines. Splitting on a hash or lookup (or a combination of the two) of the user id is common. The UPDATE then needs to figure out which machine to update, and perform the action there. If you have 10 shards (machines), you can sustain nearly 10 times the update rate. Ultimately, this is the only way all the heavy hitters can handle 100M+ users and billions of queries per day.
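
The "figure out which machine" step for hash-based sharding is a small, stable routing function. A sketch, with hypothetical shard names:

```python
import zlib

def shard_for(user_id, shards):
    """Stable hash routing: the same user always lands on the same shard,
    so all of that user's counter rows live on one machine."""
    return shards[zlib.crc32(str(user_id).encode()) % len(shards)]

shards = [f"db-shard-{i}" for i in range(10)]
target = shard_for(12345, shards)  # connect to this machine, then UPDATE there
print(target in shards)  # → True
```

Note that changing the shard count remaps most users, which is why lookup tables (or a hash-to-bucket indirection) are often combined with hashing, as the answer suggests.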

PARTITIONing is not likely to help. The partition pruning code is not yet efficient enough to avoid too much overhead for such tiny queries.

