Smarter way of doing this?


Problem description



I have written this function, "choose_by_probability()"

It takes a list of probabilities in the range 0.00-1.00, and it must
then randomly select one of the probabilities and return its index.

The idea is to have two lists, one with a value, and another with a
probability. The value then gets randomly selected from its probability.

values = ['item 1','item 2','item 3',]
probabilities = [0.66, 0.33, 0.16]

print values[choose_by_probability(probabilities)]

'item 1'

etc...

I just wondered if anybody has a better way of doing it, as this seems
nastily squared to me?

regards Max M
###################

from random import random

def choose_by_probability(probabilities):
    "Randomly selects an index from a list of probabilities"
    rnd = random() * sum(probabilities)
    range_start = 0.0
    for i in range(len(probabilities)):
        probability = probabilities[i]
        range_end = range_start + probability
        if range_start < rnd < range_end:
            return i
        range_start = range_end

probabilities = []
for f in range(0,16):
    probabilities.append(1.0 / (f+1.0))

print probabilities

# writing it out sorted to do a visual test...
result = []
for i in range(100):
    result.append(choose_by_probability(probabilities))
result.sort()
print result

Solution

In article <40*********************@dread12.news.tele.dk>,
Max M <ma**@mxm.dk> wrote:

> I have written this function, "choose_by_probability()"
>
> It takes a list of probabilities in the range 0.00-1.00, and it must
> then randomly select one of the probabilities and return its index.
>
> The idea is to have two lists, one with a value, and another with a
> probability. The value then gets randomly selected from its probability.
>
> values = ['item 1','item 2','item 3',]
> probabilities = [0.66, 0.33, 0.16]
>
> print values[choose_by_probability(probabilities)]
>
> 'item 1'
>
> etc...
>
> I just wondered if anybody has a better way of doing it, as this seems
> nastily squared to me?



My usual way is:

import random

def choose_by_probability (values, probabilities):
    cumul = 0.0
    for prob, item in zip (probabilities, values):
        cumul += prob
        if prob > random.random()*cumul:
            selected = item
    return selected

def run_tests (n):
    D = {}
    for i in xrange (n):
        j = choose_by_probability (['item 1', 'item 2', 'item 3']
                                   , [0.66, 0.33, 0.17])
        D[j] = D.get (j, 0) + 1
    s = D.items()
    s.sort()
    for k, v in s:
        print k, float(v)/n
    print "Total:", sum([v for k, v in D.items()])

run_tests (10000)

Regards. Mel.
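
A note on why the loop above gives the right distribution: each item replaces the current selection with probability prob/cumul, so after the full pass an item survives with probability equal to its weight divided by the total, and the weights do not have to sum to 1.0. The following is a minimal Python 3 sketch of the same idea, not the code as posted; the names `weighted_pick` and `frequencies` and the `rng` and `seed` parameters are assumptions introduced for illustration.

```python
import random
from collections import Counter

def weighted_pick(values, weights, rng=random):
    """Single-pass weighted choice; weights need not sum to 1.0."""
    cumul = 0.0
    selected = None
    for weight, item in zip(weights, values):
        cumul += weight
        # Replace the current pick with probability weight / cumul.
        if weight > rng.random() * cumul:
            selected = item
    return selected

def frequencies(values, weights, n, seed=0):
    """Empirical selection frequencies over n draws (seeded for repeatability)."""
    rng = random.Random(seed)
    counts = Counter(weighted_pick(values, weights, rng) for _ in range(n))
    return {item: counts[item] / n for item in values}
```

The trade-off is that this variant draws one random number per item on every call, so it suits one-off picks better than repeated picks from a large table.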


Max M <ma**@mxm.dk> wrote in message news:<40*********************@dread12.news.tele.dk>...

> It takes a list of probabilities in the range 0.00-1.00, and it must
> then randomly select one of the probabilities and return its index.



You may wish to look at the GRADES example in the bisect documentation.

-jJ Take only memories. Leave not even footprints.
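
For reference, the grades example in the bisect documentation maps a score to a letter by bisecting a sorted list of breakpoints. The sketch below is modelled on that example as best recalled (tuples replace the lists used as defaults), followed by the same pattern applied to a cumulative probability table. The function `pick`, its sample table, and its sample values are hypothetical and only illustrate the analogy.

```python
from bisect import bisect

def grade(score, breakpoints=(60, 70, 80, 90), grades='FDCBA'):
    # bisect returns how many breakpoints are <= score; that count indexes grades.
    return grades[bisect(breakpoints, score)]

def pick(r, cumulative=(0.5, 0.75, 0.9, 1.0), values=('w', 'x', 'y', 'z')):
    # Same pattern: r is a random number in [0.0, 1.0), and the cumulative
    # table plays the role of the breakpoints.
    return values[bisect(cumulative, r)]
```

Calling `pick(random.random())` would then return 'w' about half the time, 'x' about a quarter of the time, and so on.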


Max M <ma**@mxm.dk>

> I have written this function, "choose_by_probability()"
>
> It takes a list of probabilities in the range 0.00-1.00, and it must
> then randomly select one of the probabilities and return its index.
>
> The idea is to have two lists, one with a value, and another with a
> probability. The value then gets randomly selected from its probability.
>
> values = ['item 1','item 2','item 3',]
> probabilities = [0.66, 0.33, 0.16]



Here is a simplified version of the approach you took:

import random

def select(tab, random=random.random):
    cp = 0
    r = random()
    for i, p in enumerate(tab):
        cp += p
        if cp > r:
            return i
    raise Exception('probabilities do not add up to 1.0')

If you need to call this many times and for large tables, there are
ways to speed this up.

* re-arrange the tables to make sure the largest probabilities come
  first. Intentionally or not, your sample data is already arranged this
  way (though the probabilities add up to more than 1.0 and should be
  scaled back to 1.0).

* pre-summarize the summations with a cumulative probability table:

  [.50, .25, .15, .10] --> [.50, .75, .90, 1.00]

  def cselect(ctab, random=random.random):
      r = random()
      for i, cp in enumerate(ctab):
          if cp > r:
              return i
      raise Exception

* if the table is cumulative and large, bisection offers a faster
  search:

  import bisect

  def cselect(ctab, random=random.random, bisect=bisect.bisect):
      return bisect(ctab, random())

* if the probabilities come in integer ratios, you could reduce the
  problem to an O(1) lookup:

  [.5, .3, .1, .1] --> [0,0,0,0,0,1,1,1,2,3]

  def lselect(ltab, choice=random.choice):
      return choice(ltab)
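
The tables used in the list above can be built once, up front. Here is a minimal Python 3 sketch, assuming the weights are positive; the helpers `make_ctab` and `make_ltab`, the `scale` parameter, and the `rng` argument are hypothetical names introduced for illustration and do not come from the post.

```python
import bisect
import random
from itertools import accumulate

def make_ctab(weights):
    """Scale weights so they sum to 1.0 and return the cumulative table."""
    total = sum(weights)
    ctab = [c / total for c in accumulate(weights)]
    ctab[-1] = 1.0  # guard against rounding leaving the last entry below 1.0
    return ctab

def cselect(ctab, rng=random.random):
    """O(log n) pick: index of the first cumulative entry greater than r."""
    return bisect.bisect(ctab, rng())

def make_ltab(probabilities, scale):
    """Expand probabilities that are whole multiples of 1/scale into a lookup list."""
    ltab = []
    for index, p in enumerate(probabilities):
        ltab.extend([index] * round(p * scale))
    return ltab
```

With the sample data, `make_ctab([0.66, 0.33, 0.16])` rescales the weights before accumulating them, and `random.choice(make_ltab([.5, .3, .1, .1], 10))` performs the constant-time lookup.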
IOW, if speed is what is important, then be sure to exploit any known
structure in the data (ordering, cumulativeness, integer ratios, etc)
and don't compute anything twice (pre-compute the summation).

Raymond Hettinger
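
For readers on Python 3.6 or later, the standard library's `random.choices` accepts relative `weights` (or a precomputed `cum_weights` table), so a hand-written selector may not be needed. The sketch below is not from the thread; the seed and the number of draws are arbitrary choices made only to keep the run repeatable.

```python
import random
from collections import Counter

values = ['item 1', 'item 2', 'item 3']
weights = [0.66, 0.33, 0.16]   # relative weights; they need not sum to 1.0

rng = random.Random(42)        # seeded only so the run is repeatable
draws = rng.choices(values, weights=weights, k=10000)
counts = Counter(draws)        # tally how often each value was drawn
```

Passing `cum_weights` instead of `weights` skips the per-call accumulation, which matches the advice above about not computing anything twice.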

