分割字符串的所有方法 [英] All ways to partition a string
问题描述
我正在尝试找到一种有效的算法来获得对字符串进行分区的所有方法
I'm trying to find a efficient algorithm to get all ways to partition a string
例如对于给定的字符串 'abcd' =>
'a' 'bcd'
'a' 'b' 'cd'
'a' 'b' 'c' 'd'
'ab' 'cd'
'ab' 'c' 'd'
'abc' 'd'
'a', 'bc', 'd
eg for a given string 'abcd' =>
'a' 'bcd'
'a' 'b' 'cd'
'a' 'b' 'c' 'd'
'ab' 'cd'
'ab' 'c' 'd'
'abc' 'd'
'a', 'bc', 'd
任何语言都将不胜感激
提前致谢!
推荐答案
问题分析
在每对相邻的字符之间,您可以决定是否剪切.对于长度为 n 的字符串,有 n-1 个位置可以切割或不切割,即有两种可能性.因此,长度为 n 的字符串可以按 2n-1 种方式进行分区.
Between each pair of adjacent characters, you can decide whether to cut. For a string of size n, there are n-1 positions where you can cut or not, i.e. there are two possibilities. Therefore a string of size n can be partitioned in 2n-1 ways.
输出由 2n-1 个分区组成,每个分区有 n 个字符和分隔符.所以我们可以将输出大小描述为 f(n) = 2n-1 * n + s(n) 其中 s(n) ≥ 0 占分区分隔符和行分隔符.
The output consists of 2n-1 partitions, each having n characters plus separators. So we can describe the output size as f(n) = 2n-1 * n + s(n) where s(n) ≥ 0 accounts for the partition separators and line separators.
因此仅由于输出大小,解决此问题的算法必须具有指数运行时间或更糟:Ω(2n).
So due to the output size alone, an algorithm solving this problem must have exponential runtime or worse: Ω(2n).
(0 ≤ c * 2n = ½ * 2n = 2n-1 ≤ 2n-1 * n ≤ f(n) 对于所有具有正常数的 n≥k c=½, k=1)
(0 ≤ c * 2n = ½ * 2n = 2n-1 ≤ 2n-1 * n ≤ f(n) for all n≥k with positive constants c=½, k=1)
解决方案
我选择将分区表示为整数.cutpoints
中的每一位决定是否在字符i
和i+1
之间进行剪切.要遍历所有可能的分区,我们只需要遍历 0
和 2^(n-1) - 1
之间的所有整数.
I chose to represent a partition as integer. Each bit in cutpoints
determines whether to cut between characters i
and i+1
. To iterate through all possible partitions, we just need to go trough all integers between 0
and 2^(n-1) - 1
.
示例:对于长度为 4 的字符串,我们遍历 0
和 2^3 - 1
或 0
和 7
或二进制:000
和 111
.
Example: For a string of length 4, we go through all integers between 0
and 2^3 - 1
or 0
and 7
or in binary: 000
and 111
.
# (python 2 or 3)
def all_partitions(string):
for cutpoints in range(1 << (len(string)-1)):
result = []
lastcut = 0
for i in range(len(string)-1):
if (1<<i) & cutpoints != 0:
result.append(string[lastcut:(i+1)])
lastcut = i+1
result.append(string[lastcut:])
yield result
for partition in all_partitions("abcd"):
print(partition)
内存使用:
我认为我的解决方案在 Python 3 中使用 O(n)
内存.一次只生成一个分区,它被打印出来并且不再被引用.当然,如果您保留所有结果,例如通过将它们存储在列表中.
I think my solution uses O(n)
memory with Python 3. Only one partition is generated at a time, it's printed and not referenced anymore. This changes of course, if you keep all results, e.g. by storing them in a list.
在 Python 2 中将 range
替换为 xrange
,否则所有可能的 cutpoints
将存储在一个列表中,因此需要指数级的内存.
In Python 2 replace range
with xrange
, otherwise all possible cutpoints
will be stored in a list, therefore needing an exponential amount of memory.
JavaScript 解决方案
// ES6 generator
function* all_partitions(string) {
for (var cutpoints = 0; cutpoints < (1 << (string.length - 1)); cutpoints++) {
var result = [];
var lastcut = 0;
for (var i = 0; i < string.length - 1; i++) {
if (((1 << i) & cutpoints) !== 0) {
result.push(string.slice(lastcut, i + 1));
lastcut = i + 1;
}
}
result.push(string.slice(lastcut));
yield result;
}
}
for (var partition of all_partitions("abcd")) {
console.log(partition);
}
使用 NodeJS v4.4.3 测试(免责声明:我之前没有使用过 NodeJS).
Tested with NodeJS v4.4.3 (disclaimer: I have not used NodeJS before).
这篇关于分割字符串的所有方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!