Tree to find all Cartesian products of lists whose product is greater than a threshold


Problem description

Let's take an example list of lists like this:

li=[[0.99, 0.002],
 [0.98, 0.0008, 0.0007],
 [0.97, 0.009, 0.001],
 [0.86, 0.001]]

Note that elements inside each sublist are sorted in descending order and their sum is always less than or equal to 1. Also, the sublists themselves are sorted in descending order of their first elements.

I am interested in finding combinations, taking one element from each sublist, such that the product of the elements of the combination is above a certain threshold, say 1e-5. One way I found of doing this is with itertools.product.

import itertools
import numpy as np

a = list(itertools.product(*li))
[item for item in a if np.prod(item) > 1e-5]

However, this procedure is not feasible for me: my actual list has too many sublists, so the number of possible combinations to check is too big.
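To give a rough sense of why the brute-force approach stops scaling, the number of tuples itertools.product generates is the product of the sublist lengths. The sketch below counts them for the example list; the 40-sublist case is a made-up size for illustration, not the actual data.

```python
from math import prod

li = [[0.99, 0.002],
      [0.98, 0.0008, 0.0007],
      [0.97, 0.009, 0.001],
      [0.86, 0.001]]

# itertools.product yields one tuple per choice of element in every sublist,
# so the total count is the product of the sublist lengths.
n_combinations = prod(len(sub) for sub in li)
print(n_combinations)  # 2 * 3 * 3 * 2 = 36

# Hypothetical larger input: 40 sublists with 3 elements each.
n_large = 3 ** 40
print(n_large)
```

The count grows exponentially with the number of sublists, which is why filtering after the fact becomes impractical.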

Instead of first finding all combinations and then checking the threshold condition, I want to do the opposite, i.e. only generate combinations that satisfy the condition. For example, since 0.002*0.0008*0.009 is already less than 1e-5, I can ignore every combination that starts with (0.002, 0.0008, 0.009, ...).
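As a small check of that arithmetic, the running product of the prefix can be tracked step by step. This is only an illustrative sketch; the names `running` and `first_below` are made up here.

```python
from itertools import accumulate
from operator import mul

threshold = 1e-5
prefix = (0.002, 0.0008, 0.009)

# Running products: 0.002, 0.002*0.0008, 0.002*0.0008*0.009
running = list(accumulate(prefix, mul))

# Index of the first position where the running product is below the threshold.
first_below = next(i for i, p in enumerate(running) if p < threshold)
print(running, first_below)
```

On this data the running product is already below 1e-5 after the second factor, so the whole (0.002, 0.0008, ...) branch could be dropped one step earlier than in the example.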

I could not find an easy way to implement this. What I have in mind is a tree data structure in which each node keeps track of the product so far. As soon as a node's value falls below 1e-5, I stop extending the tree at that node and also at the nodes to its right (since the nodes on the right will be smaller than the current node).

A simple tree skeleton to get started:

class Tree(object):
    def __init__(self, node=None):
        self.node = node
        self.children = []

    def add_child(self, child):
        self.children.append(child)

Once the tree is built, I would then extract the combinations that reached depth = len(li).
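One possible way to flesh out that skeleton is sketched below. It assumes every value lies in (0, 1] and every sublist is sorted in descending order; the `product` attribute and the helper functions `build` and `leaves` are hypothetical additions, not part of the original skeleton.

```python
class Tree:
    def __init__(self, node=None, product=1.0):
        self.node = node          # value chosen at this level (None for the root)
        self.product = product    # product of all values on the path from the root
        self.children = []

    def add_child(self, child):
        self.children.append(child)


def build(parent, levels, threshold, depth=0):
    """Grow the tree below `parent`, pruning branches at or below the threshold."""
    if depth == len(levels):
        return
    for value in levels[depth]:
        product = parent.product * value
        if product <= threshold:
            break  # sublist is sorted descending, so later siblings are smaller too
        child = Tree(value, product)
        parent.add_child(child)
        build(child, levels, threshold, depth + 1)


def leaves(tree, depth, path=()):
    """Yield the value paths that reach exactly `depth` levels below `tree`."""
    if depth == 0:
        yield path
        return
    for child in tree.children:
        yield from leaves(child, depth - 1, path + (child.node,))


li = [[0.99, 0.002],
      [0.98, 0.0008, 0.0007],
      [0.97, 0.009, 0.001],
      [0.86, 0.001]]

root = Tree()
build(root, li, 1e-5)
combos = list(leaves(root, len(li)))
print(len(combos))
```

Branches that die out before reaching depth len(li) simply have no children, so `leaves` never reports them. Materializing the whole tree costs memory, so a generator that yields combinations directly may be preferable for large inputs.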

Any help to build such a tree or any other ideas towards solving the problem would be highly appreciated. Thanks!

Solution

Because your items and their subitems are all sorted and lie between 0 and 1, extending a prefix can never increase its product, and for a fixed prefix the products are nonincreasing as you move along the next sublist. No surprise there, as you pointed that out yourself, but how do you take advantage of it?
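A quick check on the example data suggests the ordering holds per level (for a fixed prefix) rather than across the entire output of itertools.product. The variable names below are illustrative only.

```python
from itertools import product
from math import prod

li = [[0.99, 0.002],
      [0.98, 0.0008, 0.0007],
      [0.97, 0.009, 0.001],
      [0.86, 0.001]]

# Products of the full tuples, in the order itertools.product emits them.
products = [prod(c) for c in product(*li)]
globally_sorted = all(a >= b for a, b in zip(products, products[1:]))

# Products for one fixed prefix, extended by each element of the next sublist.
prefix = (0.99, 0.98)
within = [prod(prefix + (x,)) for x in li[2]]
within_sorted = all(a >= b for a, b in zip(within, within[1:]))

print(globally_sorted, within_sorted)
```

This is why a `break` is safe inside the innermost loop over one sublist, while stopping the whole iteration at the first failing tuple would miss valid combinations.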

I think what you want is a duplication of itertools.product with a shortcut to prune the branch as soon as the product goes under the threshold. This will allow you to efficiently iterate through all possible matches without wasting time re-checking products that you already know can't meet the threshold.
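The pruning idea can also be sketched as a small recursive generator that carries the running product along, so nothing is multiplied twice. This is an illustration under the same assumptions (values in (0, 1], sublists sorted descending), not the implementation referred to in this answer; `pruned_product` is a made-up name.

```python
def pruned_product(levels, threshold, prefix=(), running=1.0):
    """Yield tuples with one element per level whose product exceeds threshold."""
    if len(prefix) == len(levels):
        yield prefix
        return
    for value in levels[len(prefix)]:
        product = running * value
        if product <= threshold:
            break  # later values in this level are smaller, so skip them as well
        yield from pruned_product(levels, threshold, prefix + (value,), product)


li = [[0.99, 0.002],
      [0.98, 0.0008, 0.0007],
      [0.97, 0.009, 0.001],
      [0.86, 0.001]]

matches = list(pruned_product(li, 1e-5))
for combo in matches:
    print(combo)
```

The recursion depth equals the number of sublists, so for inputs with many hundreds of sublists Python's default recursion limit may become a concern.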

I found an iterator implementation of itertools.product here: how code a function similar to itertools.product in python 2.5 (I'm using Python 3, and it seems to work okay.)

So I just copied it and inserted a threshold check inside the loops:

# cutoff function
from functools import reduce
from operator import mul

threshold = 1e-5

def cutoff(args):
    if args:
        return reduce(mul, args) < threshold
    return False

# alternative implementation of itertools.product with cutoff
def product(*args, **kwds):
    def cycle(values, uplevel):
        # Every prefix coming from the upper level has already passed the
        # cutoff check, so it does not need to be tested again here.
        for prefix in uplevel:       # cycle through all upper levels
            for current in values:   # restart iteration of current level
                result = prefix + (current,)
                if cutoff(result):
                    break            # values are sorted, so the rest are smaller too
                yield result

    stack = iter(((),))
    for level in tuple(map(tuple, args)) * kwds.get('repeat', 1):
        stack = cycle(level, stack)  # build stack of iterators
    return stack

# your code here
li=[[0.99, 0.002],
    [0.98, 0.0008, 0.0007],
    [0.97, 0.009, 0.001],
    [0.86, 0.001]]

for a in product(*li):
    p = reduce(mul, a)
    print(a, p)

I get the same results if I leave out the cutoff, and just check p > threshold later.

(0.99, 0.98, 0.97, 0.86) 0.8093408399999998
(0.99, 0.98, 0.97, 0.001) 0.0009410939999999998
(0.99, 0.98, 0.009, 0.86) 0.007509348
(0.99, 0.98, 0.001, 0.86) 0.0008343719999999999
(0.99, 0.0008, 0.97, 0.86) 0.0006606864
(0.99, 0.0007, 0.97, 0.86) 0.0005781006
(0.002, 0.98, 0.97, 0.86) 0.0016350319999999998
(0.002, 0.98, 0.009, 0.86) 1.5170399999999998e-05
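As a sanity check of that claim, the brute-force filter can be run on the example list and compared with the eight tuples listed above. This sketch uses only the standard library; `expected` is simply the output above typed in by hand.

```python
from itertools import product
from math import prod

li = [[0.99, 0.002],
      [0.98, 0.0008, 0.0007],
      [0.97, 0.009, 0.001],
      [0.86, 0.001]]

# Generate every combination first, then filter on the threshold.
brute = [c for c in product(*li) if prod(c) > 1e-5]

# The tuples printed by the pruned version, copied from the output above.
expected = [
    (0.99, 0.98, 0.97, 0.86),
    (0.99, 0.98, 0.97, 0.001),
    (0.99, 0.98, 0.009, 0.86),
    (0.99, 0.98, 0.001, 0.86),
    (0.99, 0.0008, 0.97, 0.86),
    (0.99, 0.0007, 0.97, 0.86),
    (0.002, 0.98, 0.97, 0.86),
    (0.002, 0.98, 0.009, 0.86),
]

print(brute == expected)
```

Both approaches visit the combinations in the same lexicographic order, so the lists can be compared directly.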
