Allocating Datastore IDs using a PRNG

Question

The Google Cloud Datastore documentation says that if an entity ID needs to be pre-allocated, one should use the allocateIds method: https://cloud.google.com/datastore/docs/best-practices#keys

That method seems to make a REST or RPC call, which adds latency. I'd like to avoid that latency by using a PRNG in my Kubernetes Engine application. Here's the Scala code:

import java.security.SecureRandom

class RandomFactory {

  protected val r = new SecureRandom

  def randomLong: Long = r.nextLong

  def randomLong(min: Long, max: Long): Long =
    // Unfortunately, Java didn't make Random.internalNextLong public,
    // so we have to get to it in an indirect way.
    r.longs(1, min, max).toArray.head

  // id may be any value in the range [1, MAX_SAFE_INTEGER)
  // (upper bound exclusive), so that it can be represented in JavaScript.
  // TODO: randomId is used in production, and might be susceptible to
  // TODO: blocking if /dev/random does not contain entropy.
  // TODO: Keep an eye on this concern.
  def randomId: Long =
    randomLong(1, RandomFactory.MAX_SAFE_INTEGER)
}

object RandomFactory extends RandomFactory {

  // MAX_SAFE_INTEGER is es6 Number.MAX_SAFE_INTEGER
  val MAX_SAFE_INTEGER = 9007199254740991L
}

I also plan to install haveged in the pod to help with entropy.

I understand that allocateIds ensures an ID is not already in use. But in my particular use case, two mitigating factors make that concern acceptable to overlook:

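  1. Based on the number of entities, the chance of a collision is 1 in 100 million.
  2. This particular entity type is non-essential and can afford a "once in a blue moon" collision.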

I am more concerned about even distribution in the keyspace, because that is the concern in the normal use case.

Will this approach work, particularly with regard to even distribution in the keyspace? Is the allocateIds method essential, or does it just help developers avoid simple mistakes?

Recommended answer

To get rid of collisions, use more bits: for all practical purposes, 128 bits (see the statistics behind UUID v4) will never generate a collision.
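To make the "more bits" point concrete, here is a rough sketch of the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / (2N)), for n random IDs drawn from a space of size N. The object and method names are hypothetical, and the entity count in the usage note is an illustrative assumption rather than a figure from the question. UUID v4 carries 122 random bits; using its string form as a Datastore key name is one possible way to apply the suggestion, not something the answer spells out.

```scala
import java.util.UUID

// Hypothetical helper, not part of the question's code.
object CollisionEstimate {

  // Birthday-bound approximation of the probability that at least two of
  // n uniformly random IDs collide in a space of spaceSize values.
  // expm1 keeps precision when the exponent is very close to zero.
  def collisionProbability(n: Double, spaceSize: Double): Double =
    -math.expm1(-n * (n - 1) / (2 * spaceSize))

  // Roughly the ID range used in the question (up to Number.MAX_SAFE_INTEGER).
  val Space53: Double = math.pow(2, 53)

  // Number of distinct UUID v4 values (122 random bits).
  val Space122: Double = math.pow(2, 122)

  // Assumption: a random UUID string is used as the key name
  // instead of a numeric ID.
  def uuidKeyName(): String = UUID.randomUUID().toString
}
```

For example, `CollisionEstimate.collisionProbability(1e6, CollisionEstimate.Space53)` estimates the chance of any collision among one million 53-bit IDs, and the same call with `Space122` shows how far that chance drops with UUID-sized IDs.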

Another technique is to insert new entities with a shorter random number and, if Cloud Datastore returns an error because the ID already exists, try again with a new ID (until you happen upon one that isn't currently in use).
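A minimal sketch of that retry loop follows. It deliberately avoids the real Datastore client: `tryInsert` is a hypothetical callback that should return true when the entity was created and false when the ID was already taken (in practice it would wrap an insert-style call that fails on an existing key, rather than an upsert). The object name, the attempt limit, and the `nextId` parameter are likewise assumptions made for illustration.

```scala
import java.security.SecureRandom
import scala.annotation.tailrec

// Hypothetical helper, not part of the question's code.
object RetryingIdAllocator {

  private val rng = new SecureRandom

  // ES6 Number.MAX_SAFE_INTEGER, as in the question.
  val MaxSafeInteger = 9007199254740991L

  // Random ID in [1, MaxSafeInteger); the upper bound is exclusive.
  def randomId(): Long =
    rng.longs(1, 1L, MaxSafeInteger).findFirst().getAsLong

  // Tries up to attemptsLeft fresh IDs. Returns the ID that was inserted,
  // or None if every attempt hit an ID that was already in use.
  @tailrec
  def insertWithRandomId(
      tryInsert: Long => Boolean,
      attemptsLeft: Int = 5,
      nextId: () => Long = () => randomId()
  ): Option[Long] =
    if (attemptsLeft <= 0) None
    else {
      val id = nextId()
      if (tryInsert(id)) Some(id)
      else insertWithRandomId(tryInsert, attemptsLeft - 1, nextId)
    }
}
```

With the collision odds described in the question, the loop would almost always succeed on the first attempt, so the retry path costs nothing in the common case.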

As far as key distribution goes: the keys will be randomly distributed within the key space, which will keep Cloud Datastore happy.
