高效算法,可针对特定目标组成有效的表达式 [英] Efficient Algorithm to compose valid expressions with specific target

查看:54
本文介绍了高效算法,可针对特定目标组成有效的表达式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

该问题表示为:给定一个仅包含数字0-9和目标值的字符串,请返回通过在数字之间添加一些二进制运算符(+,-或*)而创建的所有表达式,以使它们求和为目标值。在某些情况下,可能没有任何二进制运算符将创建有效的表达式,在这种情况下,该函数应返回一个空数组。新表达式中的数字不应包含前导零。

The problem is stated as: Given a string that contains only digits 0-9 and a target value, return all expressions that are created by adding some binary operators (+, -, or *) between the digits so they evaluate to the target value. In some cases there may not be any binary operators that will create valid expressions, in which case the function should return an empty array. The numbers in the new expressions should not contain leading zeros.

该函数应返回所有求值的有效表达式,按字典顺序排序。

The function should return all valid expressions that evaluate to target, sorted lexicographically.

例如:

位数= 123 并且目标= 6 ,应返回: [ 1 * 2 * 3, 1 + 2 + 3]

digits = "123" and target = 6, should return: ["1*2*3", "1+2+3"]

我当前的算法如下。它有点慢,所以我正在寻找一种更有效的方法来解决该问题。我目前的算法产生操作数和运算符的所有组合。对于上面的示例,它产生

My current algorithm is below. Its a bit slow so I am looking for a more efficient way to approach the problem. My current algo produces all combinations of the operands and operators. For the above example, it produces

操作数:

[['1', '2', '3'], ['1', '23'], ['12', '3'], ['123']]

操作员:

{0: [()], 1: [('+',), ('-',), ('*',)], 2: [('+', '+'), ('+', '-'), ('+', '*'), ('-', '+'), ('-', '-'), ('-', '*'), ('*', '+'), ('*', '-'), ('*', '*')]}

然后,它将所有可能的操作数和运算符组合组合在一起,并对每个值进行求值。

It then combines all possible combinations of operands and operators and evaluate each.

数字的约束为 2≤digits.length≤10。 / code>因此,它还算不错,但是使用这种算法,长度为10的数字大约需要4.3秒,而最长只需要4秒。

The digits has a constraint of 2 ≤ digits.length ≤ 10. So its not that bad, but with this algo it takes around 4.3 seconds for a digit with length of 10, where it should just take 4 secs (maximum).

我还尝试使用以下替代方法来加速eval()函数:

I also tried speeding up the eval() function using with the following alternatives:

if eval(temp) == target:

exp_as_func = eval('lambda: ' + temp)
if exp_as_func() == target:

compiled = compile(temp, '<string>', 'eval')
if compiled == target:

使用Python 3仍需要花费几乎相同的时间。

All of them still takes about the same amount of time using Python 3.

代码:

import itertools
import time

def getValidExp(digits, target):    
    def getSign_combination(length):
        signCombo = {}
        for i in range(0, length):
            signCombo[i] = [c for c in itertools.product(('+', '-', '*'), repeat=i)]
        return signCombo

    def generate_combination(source, comb):
        res = []
        for x, action in zip(source, comb + (0,)):      
            res.append(x)
            if action == 0:
                #####IF ITS A 0, YIELD STRING.  IF NOT COMBINE NEXT ONE
                yield "".join(res)
                res = []


    #####PRODUCT GENERATES (0,0,1).  ALL COMBINATIONS.  0 MEANS BY ITSELF, 1 APPEND NEXT ITEM.

    elementCombo = [list(generate_combination(digits, c)) for c in itertools.product((0, 1), repeat=len(digits) - 1)]

    signCombo = getSign_combination(len(digits))

    result = []
    for e in elementCombo:
        signs = signCombo[len(e)-1]

        for i,sign in enumerate(signs):

            temp = [ item for tple in zip(e, sign) for item in tple ]
            temp.append(e[-1])
            temp = "".join(temp)

            try:
                if eval(temp) == target:
                    result.append(temp)
            except:
                pass

    return sorted(result)

digits = "3456237490"
target = 9180
print("Answer:", getValidExp(digits, target))

使用计算器函数的代码(不带eval( )),几乎具有相同的速度:

Code using a calculator function (no eval()), almost has the same speed:

from itertools import combinations, permutations
import itertools
import time

def getValidExp(digits, target):

    def calculate(s):
        operands, operators = [], []
        operand = ""
        for i in reversed(range(len(s))):
            if s[i].isdigit():
                operand += s[i]
                if i == 0 or not s[i - 1].isdigit():
                    operands.append(int(operand[::-1]))
                    operand = ""
            elif s[i] == '*':
                operators.append(s[i])
            elif s[i] == '+' or s[i] == '-':
                while operators and operators[-1] == '*':
                    compute(operands, operators)
                operators.append(s[i])

        while operators:
            compute(operands, operators)

        return operands[-1]

    def compute(operands, operators):
        left, right = operands.pop(), operands.pop()
        op = operators.pop()
        if op == '+':
            operands.append(left + right)
        elif op == '-':
            operands.append(left - right)
        elif op == '*':
            operands.append(left * right)

    def getSign_combination(length):
        signCombo = {}
        for i in range(0, length):
            signCombo[i] = [c for c in itertools.product(('+', '-', '*'), repeat=i)]
        return signCombo

    def generate_combination(source, comb):
        res = []
        for x, action in zip(source, comb + (0,)):
            res.append(x)
            if action == 0:
                yield "".join(res)
                res = []

    start = time.clock()

    #####PRODUCT GENERATES (0,0,1).  ALL COMBINATIONS.  0 MEANS BY ITSELF, 1 APPEND NEXT ITEM.
    elementCombo = [list(generate_combination(digits, c)) for c in itertools.product((0, 1), repeat=len(digits) - 1)]

    signCombo = getSign_combination(len(digits))

    result = []
    for e in elementCombo:
        signs = signCombo[len(e)-1]
        for i,sign in enumerate(signs):
            temp = ""
            valid = True

            for num in e:
                if num[0] == '0' and len(num) > 1:
                    valid = False
                    break

            if valid:
                for num,operator in zip(e,sign):
                    temp += num
                    temp += operator

                temp += e[-1]

                ####USING CALCULATOR CODE
                if calculate(temp) == target:
                    result.append(temp)

    print(time.clock() - start)
    return sorted(result)


digits = "3456237490"
target = 9180
print("Answer:", getValidExp(digits, target))


推荐答案

面对这种编程挑战,我首先尝试回答以下问题:

With this kind of programming challenge, I start by trying to answer the questions:


  • 应如何表示表达式?

  • 我们可以减少可能的表达式数量吗?

  • 我们可以为它减少工作量吗?每个表达式?

看起来像小型编程语言的问题往往使我觉得Lisp。问题是要求我们生成序列:

Problems that look like small programming languages tend to make me think Lisp. The problem is asking us to generate the series:

123
(* 12 3)
(+ 12 3)
...
(- (- 1 2) 3)

基本上是一个三元组的二元表达式,(运算符,左,右),其中左和右也可以是表达式。组件的顺序实际上并不重要。 Python有元组,在 operator 模块中,它具有用于各种二进制操作的函数。因此,我打算以以下形式构建表达式:

A binary expression in basically a 3-tuple of (operator, left, right) where left and right can also be expressions. The order of the components doesn't actually matter. Python has tuples, and in the operator module it has functions for the various binary ops. So, I'd plan to build expressions in the following form:

(operator.sub, (operator.sub, 1, 2), 3)

然后可以使用(主要是)简单的递归函数求值:

Which can then be evaluated with a (mostly) simple recursive function:

def compute(expr):
    if isinstance(expr, tuple):
            op, left, right = expr
            return op(compute(left), compute(right))
    return expr



减少可能性



从问题描述中看,给定的数字似乎可能以指数形式出现。我们可以通过创建所有排列来消除其中的某些部分吗?

Reducing possibilities

From the problem description, it seems there will be an exponential number of possible expressions per digit given. Can we eliminate some of these part way through creating all the permutations?

例如,输入一个六位数的数字,而目标结果 5 。在创建排列的过程中,假设从前四位数字创建了以下表达式,剩下两个要处理:

For example, take a six digit input and the target result 5. During the process of creating the permutations, imagine the following expression has been created from the first four digits, and there are two left to be handled:

(* 42 81) '??'

3696 是一个很大的数字,这时的任何表达式是否甚至都能够得到 5 的结果?我们可以完全跳过创建它们吗?

3696 is a big number, are any of the expressions from this point even capable of getting a result of just 5? Can we skip creating them altogether?

不幸的是,末尾的数字仍然可以做出重大更改:

Unfortunately, digits near the end can still make major changes:

(+ (* (* 42 81) 0) 5)

可能会有一些分支我们可以避免,但是我们将不得不考虑大多数表达式。

There may be some branches we could avoid, but we're going to have to consider most expressions.

好吧,考虑到我们实际上必须得到大量表达式的结果,还有其他节省精力的方法吗?

Okay, given we'll have to actually get the result of a very large number of expressions, is there some other way to save effort?

让我们想象一下我们是在生成序列的过程中的一部分,这三个最终表达式是一个接一个地生成的:

Lets imagine we're part way through generating a sequence, with these three final expressions generated one after the other:

...
(* (- 8 (* 3 6)) 1)
(+ (- 8 (* 3 6)) 1)
(- (- 8 (* 3 6)) 1)
...

它们都给出不同的结果, [12, 13,11] ,但内部部分(-8(* 3 6))相同,并且始终为 12 。我们的解决方案应该利用这一优势。

They all give different results, [12, 13, 11], but that inner part (- 8 (* 3 6)) is the same, and will always be 12. Our solution should look to take advantage of this.

对于任何需要扰流板的人,我已为分支 git.launchpad.net/~gz/+git/soq43837842/tree/binop_parts.py?h=initial rel = nofollow noreferrer>初始实现,它从顶部开始计算每个表达式,一个记录计算的次要更改和a最后一个预先计算结果在生成表达式时加上一些细微调整

For anyone in need of spoilers, I've put up branches for an initial implementation that calculates every expression from the top, a minor change that memoises the calculation, and a final one that precomputes results as the expressions are being generated plus some minor tweaks.


  • 17.40s过去了6180k max mem 来自问题的原始内容

  • 20.60 s超过了6284k max mem 没有问题得到评估

  • 4.65s过去了5356k max mem 我的初始

  • 2.71s过去了5316k max mem 我的记忆

  • 1.50s经历了5356k max mem 我的预先计算

  • 17.40s elapsed 6180k max mem original from question
  • 20.60s elapsed 6284k max mem without eval from question
  • 4.65s elapsed 5356k max mem my initial
  • 2.71s elapsed 5316k max mem my memoised
  • 1.50s elapsed 5356k max mem my precomputed

关于我的实现的一些说明。 generate()函数通过考虑字符串中的每个点并创建可能的下一个状态来创建候选表达式。例如,在开始时,都移动标记,并分割第一个数字:

Some notes on my implementation. The generate() function creates the candidate expressions by considering each point in the string and creating the possible next states. For example, at the start, both move the marker along, and split off the first number:

'3|456237490' ->
    '34|56237490' -> ...
    3 '4|56237490' ->

每个待处理状态都被推送到列表中,并且每次通过时弹出当前要考虑的状态循环。从状态的结尾继续,接下来的可能性是再次移动标记,并拆分数字以制成三个表达式之一。

Each pending state is pushed to a list, and the current one to consider is popped off each time through the loop. Continuing from the state at the end, the next possibilities are moving the marker along again, and splitting a number to make one of the three expressions.

        3 '45|6237490' -> ...
        (* 3 4) '5|6237490' -> ...
        (+ 3 4) '5|6237490' -> ...
        (- 3 4) '5|6237490' -> ...

到目前为止,我已经克服了皱纹,并获得了操作员的优先权。处理乘法时,我们可能需要重写现有的表达式。考虑:

I have glossed over one wrinkle with operator precedence so far. When handling multiplication, we may need to rewrite an existing expression. Consider:

(+ 1 2) '3|' ->
    (* (+ 1 2) 3) '' # ???
    (+ (+ 1 2) 3) ''
    (- (+ 1 2) 3) ''

对于加法和减法没关系,顺序无关紧要。但是, 2 * 3 必须在 1 + ... 之前发生。简而言之,我们需要在内部进行乘法运算:

For addition and subtraction this is fine, order won't matter. However, 2 * 3 has to happen before 1 + .... In short, we need to push the multiplication inside:

(+ 1 2) 3 -> (+ 1 (* 2 3))

有一些整齐的方法可以存储更多内容有关您的操作的信息,而不仅仅是执行它们的功能。对于这个问题,这并不是真正需要的,也不是其他可能的转换,例如结合多个表达式或排除不相关的部分。

There are neat ways to handle this by storing a bit more information about your operations beyond just the function to execute them. For this problem that's not really required, nor are other possible transformations like combining multiple expressions or factoring out irrelevant parts.

最后的实现说明,我很难做到迭代的方向和(最初)表达式的布局向后。

Final implementation note, just to be difficult I made both the direction of iteration and (initially) the layout of the expressions backwards.

这篇关于高效算法,可针对特定目标组成有效的表达式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆