O(N)排列的标识 [英] O(N) Identification of Permutations

查看:88
本文介绍了O(N)排列的标识的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此答案通过比较两个字符串的内容来确定两个字符串是否是置换.如果每个字符包含相同数量的字符,则显然是排列.这是在 O(N)时间完成的.

This answer determines if two strings are permutations by comparing their contents. If they contain the same number of each character, they are obviously permutations. This is accomplished in O(N) time.

我不喜欢这个答案,因为它重新发明了 is_permutation 被设计用来做.也就是说,is_permutation的复杂度为:

I don't like the answer though because it reinvents what is_permutation is designed to do. That said, is_permutation has a complexity of:

谓词的

至多 O(N 2 )个应用,如果序列已经相等,则精确为 N ,其中N=std::distance(first1, last1)

At most O(N2) applications of the predicate, or exactly N if the sequences are already equal, where N=std::distance(first1, last1)

因此,我不能提倡使用is_permutation,它比手动旋转算法慢几个数量级.但是可以肯定的是,该标准的实施者不会错过如此明显的改进吗?那么为什么is_permutation O(N 2 )呢?

So I cannot advocate the use of is_permutation where it is orders of magnitude slower than a hand-spun algorithm. But surely the implementer of the standard would not miss such an obvious improvement? So why is is_permutation O(N2)?

推荐答案

是我写了这个答案.

当字符串的value_typechar时,查找表中所需的元素数为256.对于两字节编码,为65536.对于四字节编码,查找表将刚好超过4十亿个条目,可能的大小为16 GB!而且其中大部分都没有使用.

When the string's value_type is char, the number of elements required in a lookup table is 256. For a two-byte encoding, 65536. For a four-byte encoding, the lookup table would have just over 4 billion entries, at a likely size of 16 GB! And most of it would be unused.

因此,第一件事是要认识到,即使我们将类型限制为charwchar_t,它们仍然可能站不住脚.同样,如果我们要对类型为int的序列执行is_permutation.

So the first thing is to recognize that even if we restrict the types to char and wchar_t, it may still be untenable. Likewise if we want to do is_permutation on sequences of type int.

对于大小为1或2个字节的整数类型,我们可以使用std::is_permutation<>的特殊化.但这有点让人联想到std::vector<bool>,并不是每个人都回想起来是个好主意.

We could have a specialization of std::is_permutation<> for integral types of size 1 or 2 bytes. But this is somewhat reminiscent of std::vector<bool> which not everyone thinks was a good idea in retrospect.

我们还可以使用基于std::map<T, size_t>的查找表,但这很可能会占用大量内存,因此它可能不是性能上的胜利(或者至少并非总是如此).不过,可能需要实施一个进行详细的比较.

We could also use a lookup table based on std::map<T, size_t>, but this is likely to be allocation-heavy so it might not be a performance win (or at least, not always). It might be worth implementing one for a detailed comparison though.

总而言之,我对C ++标准没有错,因为它不为char包括高性能版本的is_permutation.首先是因为我不确定在现实世界中模板的最普遍用途,其次是因为STL并不是算法的全部和全部,尤其是在领域知识可用于加速特殊算法的情况下案例.

In summary, I don't fault the C++ standard for not including a high-performance version of is_permutation for char. First because in the real world I'm not sure it's the most common use of the template, and second because the STL is not the be-all and end-all of algorithms, especially where domain knowledge can be used to accelerate computation for special cases.

如果事实证明charis_permutation在野外很普遍,那么C ++库实现者将有权为其提供专门化.

If it turns out that is_permutation for char is quite common in the wild, C++ library implementors would be within their rights to provide a specialization for it.

这篇关于O(N)排列的标识的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆