Multikey Multivalue Non Deterministic python dictionary


Question

Python already has a multi-key dict and also a multi-valued dict. I need a Python dictionary that is both:

Example:

# probabilistically fetch any one of baloon, toy or car
d['red','blue','green']== "baloon" or "car" or "toy"  

The probability that d['red'] == d['green'] is high, and the probability that d['red'] != d['red'] is low but possible.

The single output value should be determined probabilistically (fuzzily) by a rule over the keys. For example, in the case above the rule could be: if the keys include both "red" and "blue", return "baloon" 80% of the time; if only "blue", return "toy" 15% of the time; otherwise return "car" 5% of the time.

The __setitem__ method should be designed so that the following is possible:

d["red", "blue"] =[
    ("baloon",haseither('red','green'),0.8),
    ("toy",.....)
    ,....
]

The above assigns multiple values to the dictionary, each with a predicate function and a corresponding probability. Instead of the list assignment above, a dictionary assignment would be even better:

d["red", "blue"] ={ 
    "baloon": haseither('red','green',0.8),
    "toy": hasonly("blue",0.15),
    "car": default(0.05)
}

In the above, "baloon" is returned 80% of the time if "red" or "green" is present, "toy" is returned 15% of the time if "blue" is present, and "car" is returned 5% of the time without any condition.

Are there any existing data structures in Python that already satisfy these requirements? If not, how can the multikeydict code be modified to meet them?

If a dictionary is used, a configuration file or appropriate nested decorators could configure the probabilistic predicate logic above without hard-coding if/else statements.

Note: this is a useful automaton for a rule-based auto-responder application, so please let me know if any similar rule-based framework is available in Python, even if it does not use a dictionary structure.

Solution

Simulated MultiKey Dictionary

multi_key_dict does not allow __getitem__() with multiple keys at once

(e.g. d["red", "green"]).

A multi key can be simulated with tuple or set keys. If order does not matter, a set seems best (actually the hashable frozenset, so that ["red", "blue"] is the same as ["blue", "red"]).
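As a minimal sketch of the frozenset idea (normalize_key is a hypothetical helper name, not part of the class in this answer), a subscript such as d["red", "blue"] reaches __getitem__ as a tuple and can be turned into an order-insensitive key like this:

```python
def normalize_key(key):
    """Hypothetical helper: turn a subscript into an order-insensitive key."""
    # d["red"] passes a str; d["red", "blue"] passes the tuple ("red", "blue").
    return frozenset([key]) if isinstance(key, str) else frozenset(key)


# A plain dict keyed by frozensets behaves like a simple multi-key table.
table = {normalize_key(("red", "blue")): "stored value"}

# The order of the colours no longer matters.
print(table[normalize_key(("blue", "red"))])
print(normalize_key("red") == frozenset(["red"]))
```

A plain set would not work here because it is mutable and therefore unhashable; frozenset is the hashable variant.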

Simulated MultiVal Dictionary

Multiple values come for free with certain datatypes: any storage element that can be conveniently indexed will do. A standard dict should provide that.

Non-determinism

Using a probability distribution defined by the rules and assumptions [1], non-deterministic selection is performed using the weighted random choice recipe from the Python docs (see weighted_random_choice() below).
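A standalone sketch of that kind of weighted selection, assuming Python 3 (where itertools.accumulate is available); the function name weighted_choice, the rng parameter, and the sample weights are illustrative only:

```python
import bisect
import itertools
import random


def weighted_choice(pairs, rng=random):
    """Pick one result from (result, weight) pairs, proportionally to weight."""
    results, weights = zip(*pairs)
    cumulative = list(itertools.accumulate(weights))
    # Draw a point in [0, total) and find the first bucket that contains it.
    point = rng.random() * cumulative[-1]
    return results[bisect.bisect(cumulative, point)]


pairs = [("car", 0.05), ("toy", 0.15), ("ballon", 0.80)]
rng = random.Random(0)  # seeded only to make the sketch repeatable
counts = {name: 0 for name, _ in pairs}
for _ in range(10000):
    counts[weighted_choice(pairs, rng)] += 1
print(counts)
```

On Python 3.6 and later, random.choices(results, weights=weights) performs the same kind of draw without the manual cumulative sum.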

MultiKeyMultiValNonDeterministicDict Class

What a name.   \o/-nice!

This class takes multiple keys that define a probabilistic rule set over multiple values. During item creation (__setitem__()), the value probabilities are precomputed for all combinations of keys [1]. During item access (__getitem__()), the precomputed probability distribution for the requested keys is selected and the result is chosen by random weighted selection.
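To illustrate what "all combinations of keys" means, here is a small sketch that mirrors the key_combinations() method defined below as a free function:

```python
import itertools


def key_combinations(keys):
    """All subsets of keys, from the empty set up to the full set."""
    return [frozenset(subset)
            for size in range(len(keys) + 1)
            for subset in itertools.combinations(keys, size)]


combos = key_combinations(("red", "blue", "green"))
print(len(combos))  # 2 ** 3 subsets
for combo in combos:
    print(sorted(combo))
```

The number of subsets doubles with every extra key, so precomputing a distribution for each one is only practical for small key sets.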

Definition

import random
import operator
import bisect
import itertools

# or use itertools.accumulate in python 3
def accumulate(iterable, func=operator.add):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return
    yield total
    for element in it:
        total = func(total, element)
        yield total

class MultiKeyMultiValNonDeterministicDict(dict):

    def key_combinations(self, keys):
        """get all combinations of keys"""
        return [frozenset(subset) for L in range(0, len(keys)+1) for subset in itertools.combinations(keys, L)]

    def multi_val_rule_prob(self, rules, rule):
        """
        assign probabilities for each value, 
        spreading undefined result probabilities
        uniformly over the leftover results not defined by rule.
        """
        all_results = set([result for result_probs in rules.values() for result in result_probs])
        prob = dict(rules[rule])  # copy so the caller's rule set is not mutated
        leftover_prob = 1.0 - sum([x for x in prob.values()])
        leftover_results = len(all_results) - len(prob)
        for result in all_results:
            if result not in prob:
                # spread undefined prob uniformly over leftover results
                prob[result] = leftover_prob/leftover_results
        return prob

    def multi_key_rule_prob(self, key, val):
        """
        assign probability distributions for every combination of keys,
        using the default for combinations not defined in rule set
        """ 
        combo_probs = {}
        for combo in self.key_combinations(key):
            if combo in val:
                result_probs = self.multi_val_rule_prob(val, combo).items()
            else:
                result_probs = self.multi_val_rule_prob(val, frozenset([])).items()
            combo_probs[combo] = result_probs
        return combo_probs

    def weighted_random_choice(self, weighted_choices):
        """make choice from weighted distribution"""
        choices, weights = zip(*weighted_choices)
        cumdist = list(accumulate(weights))
        return choices[bisect.bisect(cumdist, random.random() * cumdist[-1])]

    def __setitem__(self, key, val):
        """
        set item in dictionary, 
        assigns values to keys with precomputed probability distributions
        """

        precompute_val_probs = self.multi_key_rule_prob(key, val)        
        # use to show ALL precomputed probabilities for key's rule set
        # print precompute_val_probs        

        dict.__setitem__(self, frozenset(key), precompute_val_probs)

    def __getitem__(self, key):
        """
        get item from dictionary, 
        randomly select value based on rule probability
        """
        key = frozenset([key]) if isinstance(key, str) else frozenset(key)             
        val = None
        weighted_val = None        
        if key in self.keys():
            val = dict.__getitem__(self, key)
            weighted_val = val[key]
        else:
            for k in self.keys():
                if key.issubset(k):
                    val = dict.__getitem__(self, k)
                    weighted_val = val[key]

        # used to show probability for key
        # print weighted_val

        if weighted_val:
            prob_results = self.weighted_random_choice(weighted_val)
        else:
            prob_results = None
        return prob_results

Usage

d = MultiKeyMultiValNonDeterministicDict()

d["red","blue","green"] = {
    # {rule_set} : {result: probability}
    frozenset(["red", "green"]): {"ballon": 0.8},
    frozenset(["blue"]): {"toy": 0.15},
    frozenset([]): {"car": 0.05}
}

Testing

Check the probabilities

N = 10000
red_green_test = {'car':0.0, 'toy':0.0, 'ballon':0.0}
red_blue_test = {'car':0.0, 'toy':0.0, 'ballon':0.0}
blue_test = {'car':0.0, 'toy':0.0, 'ballon':0.0}
red_blue_green_test = {'car':0.0, 'toy':0.0, 'ballon':0.0}
default_test = {'car':0.0, 'toy':0.0, 'ballon':0.0}

for _ in range(N):  # Python 3; use xrange on Python 2
    red_green_test[d["red","green"]] += 1.0
    red_blue_test[d["red","blue"]] += 1.0
    blue_test[d["blue"]] += 1.0
    default_test[d["green"]] += 1.0
    red_blue_green_test[d["red","blue","green"]] += 1.0

# print each test's observed frequencies as percentages
print('red,green test      =', ' '.join('{0}: {1:05.2f}%'.format(key, 100.0*val/N) for key, val in red_green_test.items()))
print('red,blue test       =', ' '.join('{0}: {1:05.2f}%'.format(key, 100.0*val/N) for key, val in red_blue_test.items()))
print('blue test           =', ' '.join('{0}: {1:05.2f}%'.format(key, 100.0*val/N) for key, val in blue_test.items()))
print('default test        =', ' '.join('{0}: {1:05.2f}%'.format(key, 100.0*val/N) for key, val in default_test.items()))
print('red,blue,green test =', ' '.join('{0}: {1:05.2f}%'.format(key, 100.0*val/N) for key, val in red_blue_green_test.items()))


red,green test      = car: 09.89% toy: 10.06% ballon: 80.05%
red,blue test       = car: 05.30% toy: 47.71% ballon: 46.99%
blue test           = car: 41.69% toy: 15.02% ballon: 43.29%
default test        = car: 05.03% toy: 47.16% ballon: 47.81%
red,blue,green test = car: 04.85% toy: 49.20% ballon: 45.95%

Probabilities match rules!


Footnotes

  1. Distribution Assumption

    Since the rule set is not fully defined, assumptions are made about the probability distributions; most of this happens in multi_val_rule_prob(). Basically, any undefined probability is spread uniformly over the remaining values. This is done for all combinations of keys and creates a generalized key interface for the random weighted selection.

    Given the example rule set

    d["red","blue","green"] = {
        # {rule_set} : {result: probability}
        frozenset(["red", "green"]): {"ballon": 0.8},
        frozenset(["blue"]): {"toy": 0.15},
        frozenset([]): {"car": 0.05}
    }
    

    this will create the following distributions

    'red'           = [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
    'green'         = [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
    'blue'          = [('car', 0.425), ('toy', 0.150), ('ballon', 0.425)]
    'blue,red'      = [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
    'green,red'     = [('car', 0.100), ('toy', 0.100), ('ballon', 0.800)]
    'blue,green'    = [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
    'blue,green,red'= [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
     default        = [('car', 0.050), ('toy', 0.475), ('ballon', 0.475)]
    

    If this is incorrect, please advise.
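The uniform-spread assumption from the footnote can be sketched on its own as follows. The helper name spread_leftover is hypothetical; the logic follows multi_val_rule_prob(), except that it works on a copy of the rule:

```python
def spread_leftover(defined, all_results):
    """Return a full distribution, sharing unassigned probability equally."""
    dist = dict(defined)  # copy so the input rule is left untouched
    missing = [r for r in all_results if r not in dist]
    leftover = 1.0 - sum(dist.values())
    for result in missing:
        dist[result] = leftover / len(missing)
    return dist


results = ["car", "toy", "ballon"]
blue_rule = spread_leftover({"toy": 0.15}, results)
red_green_rule = spread_leftover({"ballon": 0.8}, results)
print(blue_rule)
print(red_green_rule)
```

For the "blue" rule, the remaining 0.85 is split between "car" and "ballon"; for the "red, green" rule, the remaining 0.2 is split between "car" and "toy".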
