用于检查字符串是否由子字符串列表构建的算法 [英] Algorithm for checking if a string was built from a list of substrings

查看:134
本文介绍了用于检查字符串是否由子字符串列表构建的算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

将为您提供一个字符串和一个字符串数组。

You are given a string and an array of strings. How to quickly check, if this string can be built by concatenating some of the strings in the array?

这是一个理论上的问题,实用上我不需要它,如何快速检查是否可以通过将数组中的某些字符串串联来构建此字符串?原因。但是我想知道,是否有一些好的算法。

This is a theoretical question, I don't need it for practical reasons. But I would like to know, if there is some good algorithm for this.

EDIT
阅读我注意到的一些答案,这可能是NP完全问题。甚至找到一个字符串子集,它们的长度也一样,因为给定的字符串是一个经典的子集和问题。

EDIT Reading some answer I have noticed, that this is probably NP-Complete problem. Even finding a subset of strings, that will together have same length, as a given string is a classic subset sum problem.

所以我想对此没有简单的答案。

So I guess there is no easy answer to this.

编辑

现在看来,它不是NP-毕竟是完整的问题。这样比较酷:-)

Now it seems, that it is not a NP-Complete problem after all. That's way cooler :-)

编辑

我想出了一个通过一些测试的解决方案:

I have came up with a solution that passes some tests:

def can_build_from_substrings(string, substrings):
    prefixes = [True] + [False] * (len(string) - 1)
    while True:
        old = list(prefixes)
        for s in substrings:
            for index, is_set in enumerate(prefixes):
                if is_set and string[index:].startswith(s):
                    if string[index:] == s:
                        return True
                    prefixes[index + len(s)] = True
        if old == prefixes: # nothing has changed in this iteration
            return False

我相信时间是 O(n * m ^ 3),其中 n 子字符串 m 的长度为 string 。您如何看待?

I believe the time is O(n * m^3), where n is length of substrings and m is length of string. What do you think?

推荐答案

注意:这里我假定您可以多次使用每个子字符串。您可以通过更改定义子问题的方式来概括解决方案以包括此限制。这将对空间以及预期的运行时间产生负面影响,但是问题仍然是多项式。

这是一个动态编程问题。 (还有一个很好的问题!)

This is a dynamic programming problem. (And a great question!)

让我们将 composable(S,W)定义为true,如果字符串 S 可以使用子字符串列表 W 来写。

Let's define composable(S, W) to be true if the string S can be written using the list of substrings W.

S 是可组合的,并且仅当以下条件:

S is composable if and only if:


  1. S W 中的子字符串 w 开头。

  2. w 之后的 S 其余部分也是可组合的。

  1. S starts with a substring w in W.
  2. The remainder of S after w is also composable.

让我们写一些伪代码:

COMPOSABLE(S, W):
  return TRUE if S = "" # Base case
  return memo[S] if memo[S]

  memo[S] = false

  for w in W:
    length <- LENGTH(w)
    start  <- S[1..length]
    rest   <- S[length+1..-1]
    if start = w AND COMPOSABLE(rest, W) :
      memo[S] = true # Memoize

  return memo[S]

该算法具有O(m * n)运行时间,假设子字符串的长度不相对于字符串本身的线性w / r / t,在这种情况下,运行时间为O(m * n ^ 2)(其中m是子字符串列表的大小,n是所讨论的字符串的长度)。它使用O(n)空间进行记忆。

This algorithm has O(m*n) runtime, assuming the length of the substrings is not linear w/r/t to the string itself, in which case runtime would be O(m*n^2) (where m is the size of the substring list and n is the length of the string in question). It uses O(n) space for memoization.

(注意,伪代码使用O(n ^ 2)空间,但是对记忆键进行哈希处理可以缓解这种情况。)

(N.B. as written the pseudocode uses O(n^2) space, but hashing the memoization keys would alleviate this.)

编辑

这是一个有效的Ruby实现:

Here is a working Ruby implementation:

def composable(str, words)
  composable_aux(str, words, {})
end

def composable_aux(str, words, memo)
  return true if str == ""                # The base case
  return memo[str] unless memo[str].nil?  # Return the answer if we already know it

  memo[str] = false              # Assume the answer is `false`

  words.each do |word|           # For each word in the list:
    length = word.length
    start  = str[0..length-1]
    rest   = str[length..-1]

    # If the test string starts with this word,
    # and the remaining part of the test string
    # is also composable, the answer is true.
    if start == word and composable_aux(rest, words, memo)
      memo[str] = true           # Mark the answer as true
    end
  end

  memo[str]                      # Return the answer
end

这篇关于用于检查字符串是否由子字符串列表构建的算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆