字符串排列等级+数据结构 [英] String permutations rank + data structure

查看:67
本文介绍了字符串排列等级+数据结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

眼前的问题是:

给出一个字符串.在排序的所有排列中告诉其排名 从字典上看.

Given a string. Tell its rank among all its permutations sorted lexicographically.

可以通过数学方式尝试这个问题,但是我想知道是否还有其他算法可以计算出来?

The question can be attempted mathematically, but I was wondering if there was some other algorithmic method to calculate it ?

此外,如果我们必须按顺序存储所有字符串排列,我们如何有效地生成(以及复杂度如何).用于存储排列的良好数据结构是什么,并且对于检索也是有效的?

Also if we have to store all the string permutations rankwise , how can we generate them efficiently (and what would be the complexity) . What would be a good data structure for storing the permutations and which is also efficient for retrieval?

编辑

感谢排列生成部分的详细答案,有人可以建议一个好的数据结构吗?我只能想起特里树.

Thanks for the detailed answers on the permutations generation part, could someone also suggest a good data structure? I have only been able to think of trie tree.

推荐答案

有一种O(n |Σ|)算法可在其排列列表中找到长度为n的字符串的行列.此处,Σ是字母.

There is an O(n|Σ|) algorithm to find the rank of a string of length n in the list of its permutations. Here, Σ is the alphabet.

排名低于 s 的每个排列都可以 pcx 形式唯一地写入;其中:

Every permutation which is ranked below s can be written uniquely in the form pcx; where:

  • p s
  • 的适当前缀
  • c 是在 s 中紧随 p 出现的字符之后的字符.并且 c 也是 s 中未包含在 p 中的字符.
  • x s 中出现的其余字符的任意排列;即不包含在 p c 中.
  • p is a proper prefix of s
  • c is a character ranked below the character appearing just after p in s. And c is also a character occurring in the part of s not included in p.
  • x is any permutation of the remaining characters occurring in s; i.e. not included in p or c.

我们可以通过以递增的长度顺序遍历 s 的每个前缀,同时保持出现在其余部分中的字符的频率,来计数每个类中包含的排列s ,以及 x 表示的排列数量.详细信息留给读者.

We can count the permutations included in each of these classes by iterating through each prefix of s in increasing order of length, while maintaining the frequency of the characters appearing in the remaining part of s, as well as the number of permutations x represents. The details are left to the reader.

这是假设所涉及的算术运算需要恒定的时间;它不会;因为涉及的数字可以为nlog |Σ|数字.考虑到这一点,该算法将以O(n 2 log |Σ| log(nlog |Σ|))运行.由于我们可以在O(dlogd)中对两个d位数字进行加,减,乘和除运算.

This is assuming the arithmetic operations involved take constant time; which it wont; since the numbers involved can have nlog|Σ| digits. With this consideration, the algorithm will run in O(n2 log|Σ| log(nlog|Σ|)). Since we can add, subtract, multiply and divide two d-digit numbers in O(dlogd).

typedef long long int lli;

lli rank(string s){
    int n = s.length();

    vector<lli> factorial(n+1,1);
    for(int i = 1; i <= n; i++)
        factorial[i] = i * factorial[i-1];

    vector<int> freq(26);
    lli den = 1;
    lli ret = 0;
    for(int i = n-1; i >= 0; i--){
        int si = s[i]-'a';
        freq[si]++;
        den *= freq[si];
        for(int c = 0; c < si; c++) 
            if(freq[c] > 0) 
                ret += factorial[n-i-1] / (den / freq[c]);
    }
    return ret + 1;
}

这篇关于字符串排列等级+数据结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆