How quickly can you atomically increment a value on the Firebase Realtime Database?

Question

firebaser here

When I recently tweeted about the new increment() operator in the Firebase Realtime Database, a teammate asked how fast increment() is.

I've been wondering the same: how fast can you increment a value with increment(1)? And how does that compare to using a transaction to increment a value?

Recommended answer

TL;DR

I tested these cases:

  1. Increment a value with a transaction call:

    ref.transaction(function(value) {
      return (value || 0) + 1;
    });

  2. Increment a value with the new increment operator:

    ref.set(admin.database.ServerValue.increment(1));
    

    The fact that increment is faster won't be a surprise, but... by how much?

    Results:

    • With transactions I was able to increment a value about 60-70 times per second.
    • With the increment operator, I was able to increment a value about 200-300 times per second.

    I ran the test on my 2016 MacBook Pro, wrapping the above in a simple Node.js script that uses the client-side Node SDK. The wrapping script for the operations was really basic as well:

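    // Fire one operation every 100 ms (10 per second).
    const timer = setInterval(function() {
      // ... the increment or transaction from above ...
    }, 100);
    
    // Stop after one minute and end the process.
    setTimeout(function() {
      clearInterval(timer);
      process.exit(1);
    }, 60000);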
    

    So: increment the value 10 times per second, and stop doing that after 1 minute. I then spawned instances of this process with this script:

    for instance in {1..10}
    do
      node increment.js &
    done
    

    So this would run 10 parallel processes with the increment operator, each increasing the value 10 times per second, for a total of 100 increments per second. I then changed the number of instances until the "increments per second" reached a peak.

    I then wrote a small script on jsbin to listen for the value, and determine the number of increments per second with a simple low-pass, moving-average filter. I had some trouble here, so I am not sure the calculations are completely correct. Given my test results they were close enough, but if anyone feels like writing a better observer: be my guest. :)
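The original jsbin observer isn't shown, so the following is only a minimal sketch of how such an observer could estimate increments per second. It assumes an exponential moving average as the low-pass filter, which may differ from the filter actually used; `createRateObserver`, `observe`, and `alpha` are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper (not the original jsbin code): estimates increments
// per second from successive samples of the counter value.
function createRateObserver(alpha = 0.2) {
  let last = null;      // previous { time, value } sample
  let smoothed = null;  // filtered rate in increments per second

  return function observe(timeMs, value) {
    if (last === null) {
      // First sample: nothing to compare against yet.
      last = { time: timeMs, value };
      return null;
    }
    if (timeMs <= last.time) {
      // Ignore samples that do not advance the clock.
      return smoothed;
    }
    const elapsedSeconds = (timeMs - last.time) / 1000;
    const rate = (value - last.value) / elapsedSeconds;
    // Exponential moving average, assumed here as a simple low-pass filter.
    smoothed = smoothed === null ? rate : alpha * rate + (1 - alpha) * smoothed;
    last = { time: timeMs, value };
    return smoothed;
  };
}
```

In a real observer, something like `observe(Date.now(), snapshot.val())` would presumably be called from a value listener on the counter node. A smaller `alpha` smooths more heavily but reacts more slowly to changes in throughput.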

    Notes on the test:

    1. I kept increasing the number of processes until the "increments per second" seemed to max out, but I noticed that this coincided with my laptop fans going full speed. So it's likely that I didn't find the true maximum throughput of the server-side operation, but rather that of the combination of my test environment and the server. It is quite possible (and in fact likely) that you will get different results when you try to reproduce this test, although the increment throughput should always be significantly higher than the transaction throughput. No matter what results you get: please share them. :)

    2. I used the client-side Node.js SDK, as it was easiest to get working. Using different SDKs may give slightly different results, although I expect the primary SDKs (iOS, Android, and Web) to be quite close to what I got.

    3. Two different teammates immediately asked whether I'd run this on a single node, or whether I was incrementing multiple values in parallel. Incrementing multiple values in parallel might show whether there's a system-wide throughput bottleneck or whether it is node-specific (which I expect).

    4. As said already: my test harness is nothing special, but my jsbin observer code is especially suspect. Kudos if anyone feels like coding up a better observer on the same data.

    How the transaction and increment operator work under the hood

    To understand the performance difference between transaction and increment, it really helps to know how these operations work under the hood. For the Firebase Realtime Database, "under the hood" means the commands and responses that are sent between the clients and server over the WebSocket connection.

    Transactions in Firebase use a compare-and-set approach. Whenever we start a transaction like the one above, the client takes a guess at the current value of the node. If it has never seen the node before, that guess is null. It calls our transaction handler with that guess, and our code then returns the new value. The client sends the guess and the new value to the server, which performs a compare-and-set operation: if the guess is correct, it sets the new value. If the guess is wrong, the server rejects the operation and returns the actual current value to the client.
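The compare-and-set exchange described above can be sketched with an in-memory stand-in for the server. This is an illustration under stated assumptions, not Firebase's actual implementation: `FakeServer`, `runTransaction`, and the `maxRetries` default are hypothetical, and real clients talk to the server over a socket rather than through a method call.

```javascript
// Hypothetical in-memory stand-in for the server: it only accepts a write
// when the client's guess matches the value it currently holds.
class FakeServer {
  constructor() {
    this.value = null;
  }

  compareAndSet(guess, newValue) {
    if (this.value === guess) {
      this.value = newValue;
      return { ack: true, value: newValue };
    }
    // Wrong guess: reject and report the actual current value.
    return { ack: false, value: this.value };
  }
}

// Hypothetical client-side loop: call the handler with the current guess,
// send (guess, newValue), and retry with the server's value after a nack.
function runTransaction(server, handler, maxRetries = 25) {
  let guess = null; // a client that has never seen the node guesses null
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const response = server.compareAndSet(guess, handler(guess));
    if (response.ack) {
      return { value: response.value, attempts: attempt };
    }
    guess = response.value;
  }
  throw new Error('maxretry');
}

// The same handler as in the transaction example at the top of the answer.
const increment = (value) => (value || 0) + 1;
```

On an empty node, `runTransaction(new FakeServer(), increment)` succeeds on the first attempt. If the node already holds a value, the first attempt is rejected and the second one succeeds.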

    In a perfect scenario, the initial guess is correct, and the value is immediately written to disk on the server (and after that, sent out to all listeners). In a flow chart that'd look like this:

                Client            Server
    
                   +                   +
     transaction() |                   |
                   |                   |
            null   |                   |
         +---<-----+                   |
         |         |                   |
         +--->-----+                   |
             1     |     (null, 1)     |
                   +--------->---------+
                   |                   |
                   +---------<---------+
                   |     (ack, 1)      |
                   |                   |
                   v                   v
    

    But if the node already has a value on the server, it rejects the write, sends back the actual value, and the client tries again:

                Client            Server
    
                   +                   +
     transaction() |                   |
                   |                   |
            null   |                   |
         +---<-----+                   |
         |         |                   |
         +--->-----+                   |
             1     |                   |
                   |     (null, 1)     |
                   +--------->---------+
                   |                   |
                   +---------<---------+
                   |     (nack, 2)     |
                   |                   |
             2     |                   |
         +---<-----+                   |
         |         |                   |
         +--->-----+                   |
             3     |      (2, 3)       |
                   +--------->---------+
                   |                   |
                   +---------<---------+
                   |      (ack, 3)     |
                   |                   |
                   |                   |
                   v                   v
    

    This isn't too bad: it's just one extra roundtrip. Even if Firebase had used pessimistic locking, it would have needed that roundtrip, so we didn't lose anything.

    The problem starts if multiple clients are modifying the same value concurrently. This introduces so-called contention on the node, which looks like this:

                Client            Server                Client
                   +                   +                   +
     transaction() |                   |                   |
                   |                   |                   | transaction()
            null   |                   |                   |
         +---<-----+                   |                   |  null
         |         |                   |                   +--->----+
         +--->-----+                   |                   |        |
             1     |                   |                   +---<----+ 
                   |     (null, 1)     |                   |   1
                   +--------->---------+    (null, 1)      |
                   |                   |---------<---------+
                   +---------<---------+                   |
                   |     (nack, 2)     |--------->---------+
                   |                   |     (nack, 2)     |
             2     |                   |                   |
         +---<-----+                   |                   |   2
         |         |                   |                   |--->----+
         +--->-----+                   |                   |        |
             3     |      (2, 3)       |                   |---<----+ 
                   +--------->---------+                   |   3
                   |                   |                   |
                   +---------<---------+                   |
                   |      (ack, 3)     |       (2, 3)      |
                   |                   |---------<---------+
                   |                   |                   |
                   |                   |--------->---------+
                   |                   |    (nack, 3)      |
                   |                   |                   |   3
                   |                   |                   |--->----+
                   |                   |                   |        |
                   |                   |                   |---<----+ 
                   |                   |                   |   4
                   |                   |       (3, 4)      |
                   |                   |---------<---------+
                   |                   |                   |
                   |                   |--------->---------+
                   |                   |     (ack, 4)      |
                   |                   |                   |
                   v                   v                   v
    

    TODO: Update the above chart so that the operations on the server don't overlap.

    The second client had to retry its operation once more, because the server-side value had been modified between its first and second try. The more clients write to this location, the more likely retries become. The Firebase client performs those retries automatically, but after a number of retries it gives up and raises an Error: maxretry exception to the application.

    This is the reason I could only increment a counter about 60-70 times per second: with more writes than that, there was too much contention on the node.
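To get a feel for how quickly contention adds up, here is a deliberately simplified model, not a measurement. It assumes a worst case in which all clients act in lockstep rounds and start from the same guess, so only one write per round can succeed; `countTransactionRoundtrips` is a hypothetical name.

```javascript
// Simplified, hypothetical contention model: `clientCount` clients each try
// to increment the same node once, all sending in lockstep rounds.
function countTransactionRoundtrips(clientCount) {
  let serverValue = 0;
  let roundtrips = 0;
  // Every client starts with the same guess for the current value.
  let pending = Array.from({ length: clientCount }, () => ({ guess: 0 }));

  while (pending.length > 0) {
    const retry = [];
    for (const client of pending) {
      roundtrips++; // one request/response pair with the server
      if (client.guess === serverValue) {
        serverValue = client.guess + 1; // ack: the write is applied
      } else {
        retry.push({ guess: serverValue }); // nack: retry with actual value
      }
    }
    pending = retry;
  }
  return { serverValue, roundtrips };
}
```

In this model, 10 clients need 10 + 9 + ... + 1 = 55 roundtrips to apply 10 increments. Real traffic is not perfectly synchronized, so actual retry counts will differ, but the model shows why the wasted work grows faster than the number of writers.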

    An increment operation is atomic by nature. You're telling the database: whatever the current value is, make it x higher. This means that the client never has to know the current value of the node, and so it also can't guess wrong. It simply tells the server what to do.
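A minimal sketch of that idea, again with an in-memory stand-in rather than the real server: the client sends only the delta, and the server applies it to whatever value it holds. `IncrementServer` and `countIncrementRoundtrips` are hypothetical names used for illustration.

```javascript
// Hypothetical in-memory stand-in for a server that supports increments.
class IncrementServer {
  constructor() {
    this.value = null;
  }

  // The server applies the delta itself, so the client sends no guess
  // and the operation cannot be rejected for being out of date.
  increment(delta) {
    this.value = (this.value || 0) + delta;
    return { ack: true, value: this.value };
  }
}

// Each client needs exactly one roundtrip, however many clients there are.
function countIncrementRoundtrips(clientCount) {
  const server = new IncrementServer();
  let roundtrips = 0;
  for (let i = 0; i < clientCount; i++) {
    server.increment(1);
    roundtrips++;
  }
  return { serverValue: server.value, roundtrips };
}
```

Under this sketch the number of roundtrips equals the number of increments, regardless of how many clients write to the node at the same time.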

    Our flow chart with multiple clients looks like this when using increment:

                Client            Server                Client
    
                   +                   +                   +
      increment(1) |                   |                   |
                   |                   |                   | increment(1)
                   |  (increment, 1)   |                   |
                   +--------->---------+   (increment, 1)  |
                   |                   |---------<---------+
                   +---------<---------+                   |
                   |      (ack, 2)     |--------->---------+
                   |                   |     (ack, 3)      |
                   |                   |                   |
                   v                   v                   v
    

    The length of these last two flow charts alone already goes a long way to explain why increment is so much faster in this scenario: the increment operation is made for this, so the wire protocol much more closely represents what we're trying to accomplish. And that simplicity leads to a 3x-4x performance difference in my simple test alone, and probably even more in production scenarios.

    Of course transactions are still useful, as there are many more atomic operations than just increments/decrements.
