Sort objects of dynamic size


Problem description


Problem

Suppose I have a large array of bytes (think up to 4GB) containing some data. These bytes correspond to distinct objects in such a way that every s bytes (think s up to 32) will constitute a single object. One important fact is that this size s is the same for all objects, not stored within the objects themselves, and not known at compile time.

At the moment, these objects are logical entities only, not objects in the programming language. I have a comparison on these objects which consists of a lexicographical comparison of most of the object data, with a bit of different functionality to break ties using the remaining data. Now I want to sort these objects efficiently (this is really going to be a bottleneck of the application).

Ideas so far

I've thought of several possible ways to achieve this, but each of them appears to have some rather unfortunate consequences. You don't necessarily have to read all of these. I tried to print the central question of each approach in bold. If you are going to suggest one of these approaches, then your answer should respond to the related questions as well.

1. C quicksort

Of course the C quicksort algorithm is available in C++ applications as well. Its signature matches my requirements almost perfectly. But the fact that using that function will prohibit inlining of the comparison function will mean that every comparison carries a function invocation overhead. I had hoped for a way to avoid that. Any experience about how C qsort_r compares to STL in terms of performance would be very welcome.
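As a baseline for this approach, a minimal sketch might look like the following. It uses the portable `std::qsort` rather than `qsort_r`, whose signature differs between platforms, and passes the record size through a thread-local variable. The names `g_record_size`, `compare_records` and `sort_records_qsort` are hypothetical, and plain `std::memcmp` is only a stand-in for the real comparison with its tie-breaking rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// std::qsort passes no user context to the comparator, so the record size
// is kept in a thread-local variable for the duration of the sort.
// (Hypothetical workaround; qsort_r would take the size as an argument.)
thread_local std::size_t g_record_size = 0;

// Assumption: a plain byte-wise comparison stands in for the real one.
int compare_records(const void* a, const void* b) {
    return std::memcmp(a, b, g_record_size);
}

// Sorts `count` records of `record_size` bytes each, in place.
void sort_records_qsort(unsigned char* data, std::size_t count,
                        std::size_t record_size) {
    assert(record_size > 0);
    g_record_size = record_size;
    std::qsort(data, count, record_size, compare_records);
}
```

The comparator is reached through a function pointer, so whether it gets inlined depends on the compiler and the C library; timing this against the other variants on real data would be the only reliable way to judge the overhead.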

2. Indirection using Objects pointing at data

It would be easy to write a bunch of objects holding pointers to their respective data. Then one could sort those. There are two aspects to consider here. On the one hand, just moving around pointers instead of all the data would mean fewer memory operations. On the other hand, not moving the objects would probably break memory locality and thus cache performance. The chance that the deeper levels of quicksort recursion could actually access all their data from a few cache pages would vanish almost completely. Instead, each cached memory page would yield only very few usable data items before being replaced. If anyone could provide some experience about the tradeoff between copying and memory locality I'd be very glad.
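The indirection idea could be sketched roughly as follows: sort a vector of pointers with `std::sort`, then gather the records into a second buffer in sorted order. The function name `sort_records_indirect` is hypothetical, and `std::memcmp` is again only a placeholder for the actual comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns a new buffer holding the records of `data` in sorted order.
// Assumption: byte-wise memcmp stands in for the real comparison.
std::vector<unsigned char> sort_records_indirect(const unsigned char* data,
                                                 std::size_t count,
                                                 std::size_t record_size) {
    assert(record_size > 0);

    // One pointer per record; only these pointers move during the sort.
    std::vector<const unsigned char*> ptrs(count);
    for (std::size_t i = 0; i < count; ++i) {
        ptrs[i] = data + i * record_size;
    }

    std::sort(ptrs.begin(), ptrs.end(),
              [record_size](const unsigned char* a, const unsigned char* b) {
                  return std::memcmp(a, b, record_size) < 0;
              });

    // Gather pass: copy each record once into its final position.
    std::vector<unsigned char> out(count * record_size);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(out.data() + i * record_size, ptrs[i], record_size);
    }
    return out;
}
```

Note the cost on a 64-bit platform: each pointer takes 8 bytes, so for small records the pointer array can be as large as the data itself, and the gather pass needs a second full-size buffer.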

3. Custom iterator, reference and value objects

I wrote a class which serves as an iterator over the memory range. Dereferencing this iterator yields not a reference but a newly constructed object to hold the pointer to the data and the size s which is given at construction of the iterator. So these objects can be compared, and I even have an implementation of std::swap for these. Unfortunately, it appears that std::swap isn't enough for std::sort. In some parts of the process, my gcc implementation uses insertion sort (as implemented in __insertion_sort in file stl_algo.h) which moves a value out of the sequence, moves a number of items by one step, and then moves the first value back into the sequence at the appropriate position:

          typename iterator_traits<_RandomAccessIterator>::value_type
            __val = _GLIBCXX_MOVE(*__i);
          _GLIBCXX_MOVE_BACKWARD3(__first, __i, __i + 1);
          *__first = _GLIBCXX_MOVE(__val);

Do you know of a standard sorting implementation which doesn't require a value type but can operate with swaps alone?

So I'd not only need my class which serves as a reference, but I would also need a class to hold a temporary value. And as the size of my objects is dynamic, I'd have to allocate that on the heap, which means memory allocations at the very leaves of the recursion tree. Perhaps one alternative would be a value type with a static size that should be large enough to hold objects of the sizes I currently intend to support. But that would mean that there would be even more hackery in the relation between the reference_type and the value_type of the iterator class. And it would mean I would have to update that size for my application to one day support larger objects. Ugly.

If you can think of a clean way to get the above code to manipulate my data without having to allocate memory dynamically, that would be a great solution. I'm using C++11 features already, so using move semantics or similar won't be a problem.

4. Custom sorting

I even considered reimplementing all of quicksort. Perhaps I could make use of the fact that my comparison is mostly a lexicographical compare, i.e. I could sort sequences by first byte and only switch to the next byte when the first byte is the same for all elements. I haven't worked out the details on this yet, but if anyone can suggest a reference, an implementation or even a canonical name to be used as a keyword for such a byte-wise lexicographical sorting, I'd be very happy. I'm still not convinced that with reasonable effort on my part I could beat the performance of the STL template implementation.
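The scheme described here is usually known as multikey quicksort (or three-way radix quicksort), and it is closely related to MSD radix sort. A minimal in-place sketch for fixed-size records could look like this. The helper names are hypothetical, and the sketch assumes a pure byte-wise order over all bytes; a tie-breaking rule on the trailing bytes would have to be applied once `depth` reaches the end of the lexicographic prefix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Swaps two records of `record_size` bytes each.
static void swap_records(unsigned char* a, unsigned char* b,
                         std::size_t record_size) {
    if (a != b) std::swap_ranges(a, a + record_size, b);
}

// Sorts records with indices [lo, hi), which already agree on bytes [0, depth).
static void multikey_quicksort(unsigned char* data, std::size_t lo,
                               std::size_t hi, std::size_t depth,
                               std::size_t record_size) {
    while (hi - lo > 1 && depth < record_size) {
        // Pivot byte taken from the middle record (simple, not robust).
        const unsigned char pivot =
            data[(lo + (hi - lo) / 2) * record_size + depth];

        // Three-way partition on the byte at `depth`:
        // [lo, lt) < pivot, [lt, i) == pivot, [gt, hi) > pivot.
        std::size_t lt = lo, i = lo, gt = hi;
        while (i < gt) {
            const unsigned char c = data[i * record_size + depth];
            if (c < pivot) {
                swap_records(data + lt * record_size,
                             data + i * record_size, record_size);
                ++lt;
                ++i;
            } else if (c > pivot) {
                --gt;
                swap_records(data + i * record_size,
                             data + gt * record_size, record_size);
            } else {
                ++i;
            }
        }

        // Smaller and larger parts are sorted on the same byte.
        multikey_quicksort(data, lo, lt, depth, record_size);
        multikey_quicksort(data, gt, hi, depth, record_size);

        // The equal part moves on to the next byte.
        lo = lt;
        hi = gt;
        ++depth;
    }
}

// Sorts `count` records of `record_size` bytes each, in place.
void sort_records_multikey(unsigned char* data, std::size_t count,
                           std::size_t record_size) {
    assert(record_size > 0);
    multikey_quicksort(data, 0, count, 0, record_size);
}
```

This version needs only swaps and no temporary value type. A production version would probably want a better pivot choice and a fallback to insertion sort for small ranges.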

5. Completely different algorithm

I know there are many many kinds of sorting algorithms out there. Some of them might be better suited to my problem. Radix sort comes to my mind first, but I haven't really thought this through yet. If you can suggest a sorting algorithm more suited to my problem, please do so. Preferably with implementation, but even without.
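For reference, a least-significant-byte-first radix sort over fixed-size records might be sketched like this. The function name `sort_records_lsd_radix` is hypothetical. It assumes the desired order is plain byte-wise lexicographic order, so a tie-breaking rule is not covered, and it needs a scratch buffer as large as the data, which may rule it out for inputs near 4GB.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sorts the records stored back to back in `data`, each `record_size` bytes.
// One stable counting-sort pass per byte, from the last byte to the first.
void sort_records_lsd_radix(std::vector<unsigned char>& data,
                            std::size_t record_size) {
    assert(record_size > 0 && data.size() % record_size == 0);
    const std::size_t count = data.size() / record_size;
    std::vector<unsigned char> scratch(data.size());

    for (std::size_t byte = record_size; byte-- > 0;) {
        // offsets[v + 1] first counts the records whose key byte equals v.
        std::array<std::size_t, 257> offsets{};
        for (std::size_t i = 0; i < count; ++i) {
            ++offsets[data[i * record_size + byte] + 1];
        }
        // Prefix sums turn the counts into start positions per byte value.
        for (std::size_t v = 0; v < 256; ++v) {
            offsets[v + 1] += offsets[v];
        }
        // Stable scatter into the scratch buffer.
        for (std::size_t i = 0; i < count; ++i) {
            const unsigned char v = data[i * record_size + byte];
            std::memcpy(scratch.data() + offsets[v]++ * record_size,
                        data.data() + i * record_size, record_size);
        }
        data.swap(scratch);  // the result of this pass is now in `data`
    }
}
```

The running time is proportional to the number of records times the record size, independent of the data, but every pass streams the whole array, so for 32-byte records this means 32 full passes.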

Question

So basically my question is this:
"How would you efficiently sort objects of dynamic size in heap memory?"

Any answer to this question which is applicable to my situation is good, no matter whether it is related to my own ideas or not. Answers to the individual questions marked in bold, or any other insight which might help me decide between my alternatives, would be useful as well, particularly if no definite answer to a single approach turns up.

Solution

Since there are only 32 different object sizes (1 to 32 bytes), you could easily create an object type for each and select a call to std::sort based on a switch statement. Each call will get inlined and highly optimized.

Some object sizes might require a custom iterator, as the compiler will insist on padding native objects to align to address boundaries. Pointers can be used as iterators in the other cases since a pointer has all the properties of an iterator.
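A minimal sketch of this dispatch, under a few assumptions: `std::array<unsigned char, N>` is used as the record type because it has alignment 1 and, on common implementations, no padding (the `static_assert` checks this), so plain pointers work as iterators for every size. The names `sort_fixed` and `sort_records_dispatch` are hypothetical, `std::memcmp` stands in for the real comparison, and the `reinterpret_cast` treats the raw byte buffer as an array of records, which works with mainstream compilers but is not strictly blessed by the standard's object model.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Sorts `count` records of exactly N bytes each; N is a compile-time
// constant here, so std::sort and the comparator can be inlined.
template <std::size_t N>
void sort_fixed(unsigned char* data, std::size_t count) {
    using Record = std::array<unsigned char, N>;
    static_assert(sizeof(Record) == N, "record type must not be padded");
    Record* first = reinterpret_cast<Record*>(data);
    std::sort(first, first + count, [](const Record& a, const Record& b) {
        return std::memcmp(a.data(), b.data(), N) < 0;  // placeholder compare
    });
}

#define SORT_CASE(N) case N: sort_fixed<N>(data, count); return true;

// Picks the instantiation matching the runtime record size.
// Returns false if the size is outside the supported range 1..32.
bool sort_records_dispatch(unsigned char* data, std::size_t count,
                           std::size_t record_size) {
    switch (record_size) {
        SORT_CASE(1)  SORT_CASE(2)  SORT_CASE(3)  SORT_CASE(4)
        SORT_CASE(5)  SORT_CASE(6)  SORT_CASE(7)  SORT_CASE(8)
        SORT_CASE(9)  SORT_CASE(10) SORT_CASE(11) SORT_CASE(12)
        SORT_CASE(13) SORT_CASE(14) SORT_CASE(15) SORT_CASE(16)
        SORT_CASE(17) SORT_CASE(18) SORT_CASE(19) SORT_CASE(20)
        SORT_CASE(21) SORT_CASE(22) SORT_CASE(23) SORT_CASE(24)
        SORT_CASE(25) SORT_CASE(26) SORT_CASE(27) SORT_CASE(28)
        SORT_CASE(29) SORT_CASE(30) SORT_CASE(31) SORT_CASE(32)
        default: return false;
    }
}

#undef SORT_CASE
```

The temporary value that `std::sort` needs is then an ordinary N-byte object on the stack, so no heap allocation is involved. Supporting larger objects later means adding more cases to the switch.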
