Random-access container that does not fit in memory?


Problem description


I have an array of objects (say, images), which is too large to fit into memory (e.g. 40GB). But my code needs to be able to randomly access these objects at runtime.

What is the best way to do this?

From my code's point of view, it shouldn't matter, of course, if some of the data is on disk or temporarily stored in memory; it should have transparent access:

container.getObject(1242)->process();
container.getObject(479431)->process();

But how should I implement this container? Should it just send the requests to a database? If so, which one would be the best option? (If a database, then it should be free and not too much administration hassle, maybe Berkeley DB or sqlite?)

Should I just implement it myself, memoizing objects after access and purging the memory when it's full? Or are there good C++ libraries for this out there?

The requirements for the container would be that it minimizes disk access (some elements might be accessed more frequently by my code, so they should be kept in memory) and allows fast access.
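The "memoize after access, purge when full" idea described above is essentially a least-recently-used (LRU) cache in front of a slower store. Below is a minimal sketch of that technique, not a finished container: the class name `LruCache`, its `get` method, and the `loader` callback (which stands in for whatever reads an object from disk or a database) are all hypothetical names introduced for illustration. It is not thread-safe and it does not write modified objects back to the store on eviction.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache: keeps at most `capacity` objects in memory and
// calls `loader` (e.g. a disk or database read) when an object is missing.
template <typename Key, typename Value>
class LruCache {
public:
    using Loader = std::function<Value(const Key&)>;

    LruCache(std::size_t capacity, Loader loader)
        : capacity_(capacity), loader_(std::move(loader)) {
        assert(capacity_ > 0);
    }

    // Returns the object for `key`, loading it on a miss. The shared_ptr
    // keeps the object alive for the caller even if the cache evicts it.
    std::shared_ptr<Value> get(const Key& key) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            order_.splice(order_.begin(), order_, it->second);
            return it->second->second;
        }
        // Miss: load first, so a throwing loader leaves the cache unchanged.
        auto value = std::make_shared<Value>(loader_(key));
        if (order_.size() >= capacity_) {
            // Evict the least recently used entry (back of the list).
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(key, value);
        index_[key] = order_.begin();
        return value;
    }

    // Number of objects currently held in memory.
    std::size_t size() const { return order_.size(); }

private:
    using Entry = std::pair<Key, std::shared_ptr<Value>>;

    std::size_t capacity_;
    Loader loader_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<Key, typename std::list<Entry>::iterator> index_;
};
```

With such a wrapper, a call like `cache.get(1242)->process();` would look the same to the caller whether the object was already in memory or had to be loaded. Frequently accessed objects stay near the front of the list and are therefore the last to be evicted.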

UPDATE: It turns out that STXXL does not work for my problem because the objects I store in the container have dynamic size, i.e. my code may update them (increasing or decreasing the size of some objects) at runtime. But STXXL cannot handle that:

STXXL containers assume that the data types they store are plain old data types (POD). http://algo2.iti.kit.edu/dementiev/stxxl/report/node8.html
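To illustrate why dynamically sized objects clash with a POD requirement, here is a small compile-time sketch using only the standard `<type_traits>` header. The structs `FixedImage` and `DynamicImage` and the helper `is_pod_like` are hypothetical names made up for this example; the check is the general C++ notion of POD (trivial plus standard layout), not something taken from STXXL itself.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size record: every instance has the same byte layout,
// so it can be copied to and from disk blocks as raw memory.
struct FixedImage {
    std::uint32_t width;
    std::uint32_t height;
    std::uint8_t pixels[64 * 64];
};

// Hypothetical variable-size record: the pixel data lives on the heap behind
// a pointer owned by std::vector, so a raw byte copy of the struct would not
// capture it.
struct DynamicImage {
    std::uint32_t width;
    std::uint32_t height;
    std::vector<std::uint8_t> pixels;
};

// POD in the C++11 sense: a type that is both trivial and standard-layout.
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(is_pod_like<FixedImage>(), "fixed-size record is POD");
static_assert(!is_pod_like<DynamicImage>(), "record with std::vector is not POD");
```

An object that grows or shrinks at runtime typically owns heap memory, as `DynamicImage` does, which is exactly what rules it out as a POD type.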

Could you please comment on other solutions? What about using a database? And which one?

Solution

Consider using STXXL. From its documentation:

The core of STXXL is an implementation of the C++ standard template library STL for external memory (out-of-core) computations, i.e., STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks. While the compatibility to the STL supports ease of use and compatibility with existing applications, another design priority is high performance.
