Redis: best way to store a large map (dictionary)
Problem description
What I need to do is store a one-to-one mapping. The dataset consists of a large number (10M+) of key-value pairs of the same kind. In Java, for example, such data could be stored in a single HashMap instance.
The first way to do this is to store lots of key-value pairs, like this:
SET map:key1 value1
...
SET map:key900000 value900000
GET map:key1
The second option is to use a single "Hash":
HSET map key1 value1
...
HSET map key900000 value900000
HGET map key1
Redis Hashes have some convenient commands (HMSET, HMGET, HGETALL, etc.), and they don't pollute the keyspace, so this looks like the better option. However, are there any performance or memory considerations when using this approach?
Recommended answer
Yes, as Itamar Haber says, you should look at this Redis memory optimization guide. But you should also keep a few more things in mind:
- Prefer a hash (HSET) over many top-level keys. Redis consumes a lot of memory just on keyspace management. In simple (and rough) terms, one hash with 1,000,000 fields consumes up to 10x less memory than 1,000,000 keys with one value each.
- Keep each hash smaller than hash-max-zipmap-entries, with every value within hash-max-zipmap-value, if memory is the main target. Be sure to understand what hash-max-zipmap-entries and hash-max-zipmap-value mean (Redis 2.6 renamed these settings to hash-max-ziplist-entries and hash-max-ziplist-value, and Redis 7.0 to hash-max-listpack-entries and hash-max-listpack-value). Also, take some time to read about ziplist.
- You do not actually want to set hash-max-zipmap-entries to 10M+; instead, break the one hash into multiple slots. For example, set hash-max-zipmap-entries to 10,000. Then, to store 10M+ keys, you need 1000+ hashes with about 10,000 fields each. As a rough rule of thumb, pick the hash with crc32(key) % maxHsets.
- Read about strings in Redis and choose the field name length (inside the hash) based on how memory is actually managed for this structure. In simple terms, if you keep the key length under 7 bytes you spend 16 bytes per key, but an 8-byte key costs 48 bytes each. Why? Read about simple dynamic strings.
It may be useful to read about:
- Redis Memory Optimization (from sripathikrishnan)
- Comments about internal ziplist structure.
- Storing hundreds of millions of simple key-value pairs in Redis (Instagram)