How should I move from PRNG based generation to hash-based procedural generation?
Problem description
I want to replace an existing random number based data generator (in Python) with a hash based one so that it no longer needs to generate everything in sequence, as inspired by this article.
I can create a float from 0 to 1 by taking the integer version of the hash and dividing it by the maximum value of a hash.
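A minimal sketch of that idea, assuming Python 3 and the standard library's hashlib.blake2b as a stand-in hash (the helper name hash_to_unit and the comma separator are made up for illustration). One detail worth noting: dividing a full 64-bit integer by 2**64 can round up to exactly 1.0 in floating point, so this version keeps only the top 53 bits.

```python
import hashlib

def hash_to_unit(key, seed=0):
    """Map a key (a sequence of values) to a float in [0, 1)."""
    # Join seed and key parts into one byte string; the separator is arbitrary.
    data = ",".join(str(part) for part in (seed, *key)).encode("utf-8")
    # An 8-byte digest gives a 64-bit integer.
    value = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    # Keep the top 53 bits: they fit a float exactly, so the result is never 1.0.
    return (value >> 11) / 2**53

u = hash_to_unit([1, 2, 3])
print(u)
```

The same key always gives the same value, regardless of what was generated before it.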
I can create a flat integer range by taking the float and multiplying by the flat range. I could probably use modulo and live with the bias, as the hash range is large and my flat ranges are small.
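For the integer case, a sketch using modulo directly on the 64-bit hash value might look like the following (again with hashlib.blake2b as an assumed stand-in hash and a hypothetical helper name). With a range of 100 values the relative bias from modulo is on the order of 100 / 2**64, which is far too small to observe in practice.

```python
import hashlib

def hash_to_int(key, low, high):
    """Map a key to an integer in [low, high], inclusive on both ends."""
    data = ",".join(str(part) for part in key).encode("utf-8")
    value = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    span = high - low + 1
    # Modulo bias is roughly span / 2**64 in relative terms: negligible here.
    return low + value % span

num_people = hash_to_int(["group", 1, "peoplecount"], 1, 100)
print(num_people)
```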
How could I use the hash to create a gaussian or normal distributed floating point value?
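One possible approach is the inverse CDF method: turn the hash into a uniform value strictly between 0 and 1, then pass it through the inverse of the normal CDF. The sketch below assumes Python 3.8+ for statistics.NormalDist and uses hashlib.blake2b as a stand-in hash; hash_to_gauss is a hypothetical name. The Box-Muller transform is another option if two uniform values are available.

```python
import hashlib
from statistics import NormalDist, mean, stdev

def hash_to_gauss(key, mu=0.0, sigma=1.0):
    """Map a key to a normally distributed float via the inverse CDF."""
    data = ",".join(str(part) for part in key).encode("utf-8")
    value = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    # Use the midpoint of one of 2**52 buckets so u is strictly inside (0, 1);
    # inv_cdf rejects exactly 0 and exactly 1.
    u = ((value >> 12) + 0.5) / 2**52
    return NormalDist(mu, sigma).inv_cdf(u)

samples = [hash_to_gauss(["height", i], mu=170.0, sigma=10.0) for i in range(10_000)]
print(mean(samples), stdev(samples))
```

The sample mean and standard deviation should land close to the requested mu and sigma.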
For all of these cases, would I be better off just using my hash as a seed for a new random.Random object and using the functions in that class to generate my numbers and rely on them to get the distribution characteristics right?
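For comparison, the seeding approach could be sketched like this (rng_for is a hypothetical helper, and hashlib.blake2b is an assumed stand-in for whatever hash is used). Python's built-in hash() is avoided because string hashes are randomized per process by default.

```python
import hashlib
import random

def rng_for(*key):
    """Return a random.Random seeded from a hash of the key path."""
    data = ",".join(str(part) for part in key).encode("utf-8")
    seed = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    return random.Random(seed)

group_seed = 1
# Each value gets its own generator, so values are independent of call order.
height = rng_for(group_seed, "person", 7, "height").gauss(170.0, 10.0)
age = rng_for(group_seed, "person", 7, "age").randint(18, 90)
print(height, age)
```

The trade-off is that a full Mersenne Twister state is set up for every value, which costs more than a single hash. Also, the random module documentation only guarantees that random() stays reproducible across Python versions for a given seed; methods such as gauss() or randint() may produce different values after an upgrade.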
At the moment, my code is structured like this:
num_people = randint(1,100)
people = [dict() for x in range(num_people)]
for person in people:
    person['surname'] = choice(surname_list)
    person['forename'] = choice(forename_list)
The problem is that for a given seed to be consistent, I have to generate all the people in the same order, and I have to generate the surname then the forename. If I add a middle name in between the two then the generated forenames will change, as will all the names of all the subsequent people.
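A small demonstration of that order dependence, with made-up name lists and a hypothetical make_people helper: both runs use the same seed, but the second one draws a middle name between the surname and the forename.

```python
import random

# Placeholder data, purely for illustration.
surname_list = ["Smith", "Jones", "Patel", "Khan"]
forename_list = ["Alex", "Sam", "Priya", "Omar", "Lena"]
middle_list = ["Lee", "Ray", "Mae"]

def make_people(seed, with_middle_name):
    rng = random.Random(seed)
    people = []
    for _ in range(5):
        person = {"surname": rng.choice(surname_list)}
        if with_middle_name:
            # The extra draw shifts every later value in the stream.
            person["middle"] = rng.choice(middle_list)
        person["forename"] = rng.choice(forename_list)
        people.append(person)
    return people

before = make_people(42, with_middle_name=False)
after = make_people(42, with_middle_name=True)
for old, new in zip(before, after):
    print(old["surname"], old["forename"], "->", new["surname"], new["forename"])
```

Only the very first surname is guaranteed to match between the two runs; everything drawn after the inserted call comes from a shifted position in the sequence.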
I would like to structure the code like this:
h1_groupseed=1
h2_peoplecount=1
h2_people=2
h4_surname=1
h4_forename=2
num_people = pghash([h1_groupseed,h2_peoplecount]).hashint(1,100)
people = [dict() for x in range(num_people)]
for h3_index, person in enumerate(people,1):
    person['surname'] = surname_list[pghash([h1_groupseed,h2_people,h3_index,h4_surname]).hashint(0, num_of_surnames - 1)]
    person['forename'] = forename_list[pghash([h1_groupseed,h2_people,h3_index,h4_forename]).hashint(0, num_of_forenames - 1)]
This would use the values passed to pghash to generate a hash, and use that hash to somehow create the pseudorandom result.
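To illustrate why this structure helps, here is a rough sketch with hashlib.blake2b standing in for the hash and a hypothetical pick helper (the name lists are placeholders). Each field is addressed by its own key path, so any person can be generated directly, and adding a new field leaves the existing ones unchanged.

```python
import hashlib

def pick(options, *key):
    """Choose one item from options using only a hash of the key path."""
    # Note: joining with "," assumes key parts never contain a comma themselves.
    data = ",".join(str(part) for part in key).encode("utf-8")
    value = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    return options[value % len(options)]

# Placeholder data, purely for illustration.
surname_list = ["Smith", "Jones", "Patel", "Khan"]
forename_list = ["Alex", "Sam", "Priya", "Omar", "Lena"]
middle_list = ["Lee", "Ray", "Mae"]

def make_person(group_seed, index, with_middle_name=False):
    person = {
        "surname": pick(surname_list, group_seed, "people", index, "surname"),
        "forename": pick(forename_list, group_seed, "people", index, "forename"),
    }
    if with_middle_name:
        person["middle"] = pick(middle_list, group_seed, "people", index, "middle")
    return person

# Person 37 is generated directly, without generating persons 1 to 36 first.
p = make_person(1, 37)
q = make_person(1, 37, with_middle_name=True)
print(p, q)
```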
Recommended answer
I have gone ahead and created a simple hash-based replacement for some of the functions in the random.Random class:
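from __future__ import division
import xxhash
from numpy import sqrt, log, sin, cos, pi

def gaussian(u1, u2):
    # Box-Muller transform: two uniform values -> two standard normal values.
    # u1 must be greater than 0, otherwise log(u1) is undefined.
    z1 = sqrt(-2*log(u1))*cos(2*pi*u2)
    z2 = sqrt(-2*log(u1))*sin(2*pi*u2)
    return z1,z2

class pghash:
    def __init__(self, tuple, seed=0, sep=','):
        # str() each element so that integer keys can be joined as well
        data = sep.join(str(x) for x in tuple).encode('utf-8')
        self.hex = xxhash.xxh64(data, seed=seed).hexdigest()

    def pgvalue(self):
        return int(self.hex, 16)

    def pghalves(self):
        return self.hex[:8], self.hex[8:]

    def pgvalues(self):
        return int(self.hex[:8], 16), int(self.hex[8:], 16)

    def random(self):
        # Keep the top 53 bits so the result stays in [0, 1) and never rounds to 1.0
        return (self.pgvalue() >> 11) / 2**53

    def randint(self, min, max):
        # Inclusive on both ends, like random.Random.randint
        return min + int(self.random() * (max - min + 1))

    def gauss(self, mu, sigma):
        xx = self.pgvalues()
        # Shift the first value into (0, 1] so that log() never sees 0
        uu = [(xx[0] + 1)/2**32, xx[1]/2**32]
        return mu + sigma * gaussian(uu[0],uu[1])[0]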
Next step is to go through my code and replace all the calls to random.Random methods with pghash objects.
I have made this into a module, which I hope to upload to pypi at some point: https://github.com/UKHomeOffice/python-pghash