Caching huge data in process memory
Problem description
I am working in the finance industry. We want to avoid database hits during data processing, because they are very costly, so we are planning on-demand cache logic. [runtime insert & runtime lookup]
Has anyone implemented caching logic for more than 10 million records? Each record is about 160-200 bytes.
I ran into the following drawbacks with different approaches:
- Cannot use STL std::map to implement a key-based cache registry: insert and lookup become very slow after 200,000 records.
- Shared memory or memory-mapped files add overhead for caching this data, because the data is not shared across processes.
- An sqlite3 in-memory or flat-file application database could be worth trying, but it too has slow lookups after 2-3 million records.
- Process memory may have its own limits on consumption; my assumption is 2 GB on a 32-bit machine and 4 GB on a 64-bit machine. (10 million records at 160-200 bytes each is already roughly 1.6-2 GB of raw data.)
Please suggest something if you have come across this problem and solved it by any means.
Thanks
Recommended answer

If your cache is a simple key-value store, you should not be using std::map, which has O(log n) lookup, but std::unordered_map, which has average O(1) lookup. You should only use std::map if you require sorting.
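For example, here is a minimal sketch of such a key-value cache, assuming a 64-bit integer key and a fixed-size record; the Record layout and the 10-million-entry reserve are illustrative figures taken from the question, not code from the original answer:

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical fixed-size record, sized to match the 160-200 bytes
// per record mentioned in the question.
struct Record {
    char payload[192];
};

int main() {
    std::unordered_map<std::uint64_t, Record> cache;

    // Reserving buckets for the expected ~10 million entries up front
    // avoids repeated rehashing during bulk inserts.
    cache.reserve(10'000'000);

    cache.emplace(42, Record{});   // runtime insert
    auto it = cache.find(42);      // average O(1) lookup
    return it != cache.end() ? 0 : 1;
}
```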
It sounds like performance is what you're after, so you might want to look at Boost Intrusive. You can easily combine an unordered_map and a list to create a high-efficiency LRU cache.
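As a rough illustration of that combination, here is a minimal LRU sketch built on the standard library's std::unordered_map and std::list rather than the Boost Intrusive containers recommended above; the LruCache name and its API are hypothetical:

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

// Minimal LRU cache: a list keeps keys in recency order, and a hash
// map points each key at its value and its position in that list.
template <typename Key, typename Value>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    void put(const Key& key, Value value) {
        auto it = map_.find(key);
        if (it != map_.end()) {
            it->second.value = std::move(value);
            touch(it);
            return;
        }
        if (map_.size() >= capacity_) {
            // Evict the least recently used entry (back of the list).
            map_.erase(order_.back());
            order_.pop_back();
        }
        order_.push_front(key);
        map_.emplace(key, Entry{std::move(value), order_.begin()});
    }

    // Returns nullptr on a miss; refreshes recency on a hit.
    Value* get(const Key& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return nullptr;
        touch(it);
        return &it->second.value;
    }

private:
    struct Entry {
        Value value;
        typename std::list<Key>::iterator pos;
    };

    void touch(typename std::unordered_map<Key, Entry>::iterator it) {
        // splice moves the key to the front of the recency list
        // without invalidating the stored iterator.
        order_.splice(order_.begin(), order_, it->second.pos);
    }

    std::size_t capacity_;
    std::list<Key> order_;
    std::unordered_map<Key, Entry> map_;
};
```

An intrusive version, as suggested above, embeds the list hooks in the cached records themselves, avoiding the per-node allocation and key copy that std::list incurs; that difference adds up at the 10-million-record scale in the question.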