查找与最大长度的字符串的所有有序序列 [英] Find all ordered sequences with maximum length in strings

查看:196
本文介绍了查找与最大长度的字符串的所有有序序列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下问题需要解决: 有任意长度的两个字符串任意内容。我需要找到与最大长度,它出现在两个字符串中所有的有序序列。

I have following problem to solve: There are two strings of arbitrary length with arbitrary content. I need to find all ordered sequences with maximum length, which appears in both strings.

实施例1: 输入:A1B2C31a2b3c 输出:123,12C,1B31BC,A23A2CAB3ABC

Example 1: input: "a1b2c3" "1a2b3c" output: "123" "12c" "1b3" "1bc" "a23" "a2c" "ab3" "abc"

实施例2: 输入:农发行ABCD 输出:AB,广告,CD

Example 2: input: "cadb" "abcd" output: "ab" "ad" "cd"

我直的方式写了两个循环,递归,然后删除重复和成果,这是更大的结果的一部分(如ABC序列中包含AB,交流和BC序列,所以我过滤那些)

I wrote it in straight way with two loops, recursion, then removing duplicates and results which are part of larger result (for instance "abc" sequence contains "ab" "ac" and "bc" sequences, so I am filtering those)

// "match" argument here used as temporary buffer
void match_recursive(set<string> &matches, string &match, const string &a_str1, const string &a_str2, size_t a_pos1, size_t a_pos2)
{
    bool added = false;

    for(size_t i = a_pos1; i < a_str1.length(); ++i)
    {
        for(size_t j = a_pos2; j < a_str2.length(); ++j)
        {
            if(a_str1[i] == a_str2[j])
            {
                match.push_back(a_str1[i]);

                if(i < a_str1.length() - 1 && j < a_str2.length() - 1)
                    match_recursive(matches, match, a_str1, a_str2, i + 1, j + 1);
                else
                    matches.emplace(match);
                added = true;

                match.pop_back();
            }
        }
    }

    if(!added)
        matches.emplace(match);
}

此功能解决了问题,但复杂性是不可接受的。为0q0e0t0c0a0d0a0d0i0e0o0p0z0实例解决方案0w0r0y0d0s0a0b0w0k0f0.0k0x0我的机器上等于28秒(调试目标,但无论如何,这是非常慢)。我觉得应该有一些简单的算法,对于这个问题,但不知何故,我找不到任何在网络上。

This function solves problem, but complexity is unacceptable. For instance solution for "0q0e0t0c0a0d0a0d0i0e0o0p0z0" "0w0r0y0d0s0a0b0w0k0f0.0k0x0" takes 28 seconds on my machine (debug target, but anyway this is extremely slow). I think there should be some simple algorithm for this problem, but somehow I can't find any on the net.

你们可以点我朝着正确的方向?

Can you guys point me to right direction?

推荐答案

下面是$ C $下的动态编程解决方案。我用你给的例子进行测试。我已经解决了这个问题的LCS,但是这是第一次全部打印出来。

Here is the code for the dynamic programming solution. I test it with the examples you give. I have solved the LCS problem, but this is the first time to print them all.

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <set>

using namespace std;

#define MAX_LENGTH 100

int lcs(const char* a, const char* b)
{
    int row = strlen(a)+ 1;
    int column = strlen(b) + 1;

    //Memoization lower the function's time cost in exchange for space cost.
    int **matrix = (int**)malloc(sizeof(int*) * row);
    int i, j;
    for(i = 0; i < row; ++i)
        matrix[i] = (int*)calloc(sizeof(int), column);
    typedef set<string> lcs_set;

    lcs_set s_matrix[MAX_LENGTH][MAX_LENGTH];

    //initiate
    for(i = 0; i < MAX_LENGTH ; ++i)
        s_matrix[0][i].insert("");
    for(i = 0; i < MAX_LENGTH ; ++i)
        s_matrix[i][0].insert("");

    //Bottom up calculation
    for(i = 1; i < row; ++i)
    {
        for(j = 1; j < column; ++j)
        {
            if(a[i - 1] == b[j - 1])
            {
                matrix[i][j] = matrix[i -1][j - 1] + 1;
                // if your compiler support c++ 11, you can simplify this code.
                for(lcs_set::iterator it = s_matrix[i - 1][j - 1].begin(); it != s_matrix[i - 1][j - 1].end(); ++it)
                    s_matrix[i][j].insert(*it + a[i - 1]);
            }
            else
            {
                if(matrix[i][j - 1] > matrix[i - 1][j])
                {
                    matrix[i][j] = matrix[i][j - 1];
                    for(lcs_set::iterator it = s_matrix[i][j - 1].begin(); it != s_matrix[i][j - 1].end(); ++it)
                        s_matrix[i][j].insert(*it);
                }
                else if(matrix[i][j - 1] == matrix[i - 1][j])
                {
                    matrix[i][j] = matrix[i][j - 1];
                    for(lcs_set::iterator it = s_matrix[i][j - 1].begin(); it != s_matrix[i][j - 1].end(); ++it)
                        s_matrix[i][j].insert(*it);
                    for(lcs_set::iterator it = s_matrix[i - 1][j].begin(); it != s_matrix[i - 1][j].end(); ++it)
                        s_matrix[i][j].insert(*it);
                }
                else
                {
                    matrix[i][j] = matrix[i - 1][j];
                    for(lcs_set::iterator it = s_matrix[i - 1][j].begin(); it != s_matrix[i - 1][j].end(); ++it)
                        s_matrix[i][j].insert(*it);
                }

            }
        }
    }
    int lcs_length = matrix[row - 1][column -1];
    // all ordered sequences with maximum length are here.
    lcs_set result_set;

    int m, n;
    for(m = 1; m < row; ++m)
    {
        for(n = 1; n < column; ++n)
        {
            if(matrix[m][n] == lcs_length)
            {
                for(lcs_set::iterator it = s_matrix[m][n].begin(); it != s_matrix[m][n].end(); ++it)
                    result_set.insert(*it);
            }
        }
    }

    //comment it
    for(lcs_set::iterator it = result_set.begin(); it != result_set.end(); ++it)
        printf("%s\t", it->c_str());
    printf("\n");

    for(i = 0; i < row; ++i)
        free(matrix[i]);
    free(matrix);

    return lcs_length;
}

int main()
{
    char buf1[MAX_LENGTH], buf2[MAX_LENGTH];
    while(scanf("%s %s", buf1, buf2) != EOF)
    {
        printf("length is: %d\n", lcs(buf1, buf2) );
    }
    return 0;
} 

这篇关于查找与最大长度的字符串的所有有序序列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆