字符串中回文子序列的总数 [英] Total number of palindromic subsequences in a string

查看:264
本文介绍了字符串中回文子序列的总数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

问题是这样的-

对于作为输入的每个字符串,您需要告诉它作为回文的子序列的数量(不一定是不同)。请注意,空字符串不是回文。
例如, aab的回文序列是:

For every string given as input, you need to tell the number of subsequences of it that are palindromes (need not necessarily be distinct). Note that the empty string is not a palindrome. For example, the palindromic subsequences of "aab" are:

a, a, b, aa和方法返回4。

"a", "a", "b", "aa", and the method returns 4.

我想到了动态编程解决方案,以寻找最长回文序列,因此尝试从中汲取灵感。无法真正获得解决方案。可能甚至不需要动态编程。请提出建议。

I had the Dynamic Programming solution to finding Longest Palindromic Subsequence in mind and therefore tried to take ideas from it. Couldn't really get the solution. May be dynamic programming is not even required. Suggestions please.

还有一个问题。当条件不必一定是不同的被消除时,我们仍然可以算数而不实际生成所有回文子序列吗?

And there is one more catch. When the condition "need not necessarily be distinct" is removed, can we still count without actually generating all the palindromic subsequences?

推荐答案

我现在看到如何将求解时间缩短到 O(n ^ 2)。如果将其作为踏脚石的有趣之处,我将保留其他答案。注意:(这也是)仅解决问题的第一部分。我看不到仅有效地计算 distinct 回文子序列(PS)的方法。

I now see how to drop the solution time down to O(n^2). I'll leave my other answer up in case it's interesting as a stepping-stone to this one. Note: This is (also) only a solution to the first part of the problem; I see no way to efficiently count only distinct palindromic subsequences (PS).

而不是精确计算开始和结束的PS数量i和j的位置,让我们计算从i的之后到j的之前的位置。称它为g(i,j)。

Instead of counting the number of PS that begin and end at exactly the positions i and j, let's count how many begin at or after i and end at or before j. Call this g(i, j).

我们可以尝试写g(i,j)= g(i,j-1)+ g(i + 1)当j> i时,j)+(x [i] == x [j])* g(i + 1,j-1)。但这并不是很有效,因为前两个项将对在之后 i和在之前 j结束的所有PS进行重复计数。

We can try to write g(i, j) = g(i, j-1) + g(i+1, j) + (x[i] == x[j])*g(i+1, j-1) for the case when j > i. But this doesn't quite work, because the first two terms will double-count any PS that begin after i and end before j.

关键见解是注意到我们可以通过减去g()的其他值,并可能加上来轻松计算在某个确切位置处开始或结束的PS数量。再返回g()的更多值以补偿重复计数。例如,开始于完全 i且结束于精确 j的PS的数量为g(i,j)-g(i + 1,j)-g( i,j-1)+ g(i + 1,j-1):最后一项修正了以下事实:第二项和第三项都对所有以开头的g(i + 1,j-1)PS进行计数在之后并在之前 j。

The key insight is to notice that we can easily calculate the number of PS that begin or end at some exact position by subtracting off other values of g(), and perhaps adding yet more values of g() back on to compensate for double-counting. For example, the number of PS that begin at exactly i and end at exactly j is g(i, j) - g(i+1, j) - g(i, j-1) + g(i+1, j-1): the last term corrects for the fact that both the second and third terms count all g(i+1, j-1) PS that begin after i and end before j.

每个在i或之后开始并在j之前或之前结束的PS 4个类别中的1个:

Every PS that begins at or after i and ends at or before j is in exactly 1 of 4 categories:


  1. 它在i之后开始,在j之前结束。

  2. 它开始

  3. 它从i开始,到j结束。

  4. 它从i开始,到j结束。

  1. It begins after i, and ends before j.
  2. It begins at i, and ends before j.
  3. It begins after i, and ends at j.
  4. It begins at i, and ends at j.

g(i + 1,j)计数类别1或3中的所有PS,而g(i,j-1)计数类别1或2中的所有PS,因此它们的总和g(i + 1,j)+ g(i,j-1)对类别2或3中的所有PS进行一次计数,对类别1中的所有PS进行两次计数。由于g(i + 1,j-1)仅计算类别1中的所有PS,因此将其减去即可得到g(i + 1,j)+ g(i,j-1)-g(i + 1,j- 1)给出类别1、2和3中的PS总数。其余PS属于类别4中的PS。如果x [i]!= x [j],则该类别中不存在PS;否则,与在i + 1或i + 1之后开始并在j-1或j-1之前结束的PS一样多,即g(i + 1,j-1),再加上2个字符的序列x [i] x [j]。

g(i+1, j) counts all PS in category 1 or 3, and g(i, j-1) counts all PS in category 1 or 2, so their sum g(i+1, j) + g(i, j-1) counts all PS in category 2 or 3 once each, and all PS in category 1 twice. Since g(i+1, j-1) counts all PS in category 1 only, subtracting this off to get g(i+1, j) + g(i, j-1) - g(i+1, j-1) gives the total number of PS in category 1, 2 and 3. The remaining PS are those in category 4. If x[i] != x[j] then there are no PS in this category; otherwise, there are exactly as many as there are PS that begin at or after i+1 and end at or before j-1, namely g(i+1, j-1), plus one extra for the 2-character sequence x[i]x[j].

有了这个,我们可以用以下方式表达g()将二次情况从f()更改为恒定时间:

With this in hand, we can express g() in a way that changes the quadratic case from f() to constant time:

g(i, i) = 1 (i.e. when j = i)
g(i, i+1) = 2 + (x[i] == x[i+1]) (i.e. 3 iff adjacent chars are identical, otherwise 2)
g(i, j) = 0 when j < i (this new boundary case is needed)
g(i, j) = g(i+1, j) + g(i, j-1) - g(i+1, j-1) + (x[i] == x[j])*(g(i+1, j-1)+1) when j >= i+2

现在的最终答案就是g(1,n)。

The final answer is now simply g(1, n).

这篇关于字符串中回文子序列的总数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆