如何通过反复从字符串去除一些特定单词的所有情况的发生,最大限度地减少字符串的长度 [英] How to minimize a string's length by iteratively removing all occurrences of some specified words from the string

查看:127
本文介绍了如何通过反复从字符串去除一些特定单词的所有情况的发生,最大限度地减少字符串的长度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这个问题出现在一个编程比赛,我们仍然不知道如何解决这个问题。

This question appeared in a programming contest and we still have no idea how to solve it.

问: 给定一个字符串S和串L的名单,我们希望保持删除子,可能是L的所有出现,我们必须最大限度地减少形成最终的字符串的长度。还要注意的是去除字符串可能会引发更多的清除。

Question: Given a string S and a list of strings L, we want to keep removing all occurences of substrings that may be in L. And we have to minimize the length of the final string formed. Also note that removal of a string may initiate more removals.

例如,

  1. S = ccdedefcde,L = {} CDE 然后回答= 1。因为我们可以通过ccdedefcde减少的S - > cdefcde - > fcde - > F
  2. S = aabaab,L = {AA,BB}然后回答= 0还原反应可以通过aabaab - > AABB - > AA - >空字符串
  3. S = acmmcacamapapc,L = {MCA,PA}然后回答= 6的还原反应可以通过acmmcacamapapc-> acmcamapapc - > acmapapc - > acmapc
  1. S=ccdedefcde, L={cde} then answer = 1. Because we can reduce S by ccdedefcde -> cdefcde -> fcde -> f.
  2. S=aabaab, L={aa, bb} then answer = 0 as reduction can be carried out by aabaab -> aabb -> aa -> ‘Empty String’
  3. S=acmmcacamapapc, L={mca, pa} then answer=6 as reduction can be carried out by acmmcacamapapc-> acmcamapapc -> acmapapc -> acmapc.

S的最大长度可以是50和列表L的最大长度可以是50。

The maximum length of S can be 50 and the maximum length of list L can be 50.

我的做法是一个基本的递归遍历中,我回到了最小长度,我可以通过去除不同的子串得到。不幸的是这种递归方法将超时在最坏的情况下输入,因为我们有50选择在每个步骤和递归深度是50

My approach is a basic recursive traversal in which I return the minimum length that I can get by removing different sub-strings. Unfortunately this recursive approach will time out in the worst case input as we have 50 options at each step and the recursion depth is 50.

请提出一个高效的算法,可以解决这个问题。

Please suggest an efficient algorithm that may solve this problem.

推荐答案

下面是一个多项式时间的算法,产生最佳的效果。由于它的方便对我来说,我将用多项式时间 CYK算法作为一个子程序,具体地,根据与加权制作一个上下文无关文法计算的字符串的最小重量解析扩展

Here's a polynomial-time algorithm that yields optimal results. Since it's convenient for me, I'm going to use the polynomial-time CYK algorithm as a subroutine, specifically the extension that computes a minimum-weight parse of a string according to a context-free grammar with weighted productions.

现在,我们只需要一个上下文无关文法形式化这个问题。开始符号是 A (通常取值,但是我们在拍摄的话),用下面的制作。

Now we just have to formalize this problem with a context-free grammar. The start symbol is A (usually S, but that's taken already), with the following productions.

A -> N      (weight 0)
A -> A C N  (weight 0)

我会解释 N 不久。如果 N C 是终端,然后 A 将接受正规语言 N(CN)* 。非终结符 C 匹配一个终端(字符)。

I'll explain N shortly. If N and C were terminals, then A would accept the regular language N (C N)*. The nonterminal C matches a single terminal (character).

C -> a  (weight 1)
C -> b  (weight 1)
C -> c  (weight 1)
...

非终结符 N 匹配字符串是的可空的,就是字符串可以通过删除字符串被减少为空字符串。基例是显而易见的。

The nonterminal N matches strings that are nullable, that is, strings that can be reduced to the empty string by deleting strings in L. The base case is obvious.

N ->                (weight 0)

我们也有生产的每个元素。当 L = {MCA,PA} ,例如,我们有以下的生产。

We also have a production for each element of L. When L = {mca, pa}, for example, we have the following productions.

N -> N m N c N a N  (weight 0)
N -> N p N a N      (weight 0)

我希望它的明确如何构造迭代清除和解析,在那里解析重量等于剩余字符串的长度之间的一对一的对应关系

I hope that it's clear how to construct the one-to-one correspondence between iterative removals and parses, where the parse weight is equal to the length of the residual string.

这篇关于如何通过反复从字符串去除一些特定单词的所有情况的发生,最大限度地减少字符串的长度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆