Memory-efficient way to store 32-bit signed integers in Redis


Question

Since Redis tries to parse strings as 64-bit signed integers, is it a good idea to store the binary representation of a 32-bit signed integer instead of a radix-10 integer string?

In our system we have lists of many 32-bit signed integer IDs.

I can store them like this:
lpush mykey 102450  --> Redis converts 102450 to an 8-byte long

or like this:
lpush mykey \x00\x01\x90\x32  --> this is just 4 bytes
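
For reference, here is a minimal Python sketch of the two payloads being compared. It assumes big-endian byte order for the binary form (the question does not say which order is intended), and the variable names are illustrative only. Note that 102450 is 0x00019032 in hexadecimal.

```python
import struct

id_value = 102450

# Option 1: radix-10 string, which Redis can recognize as an integer.
decimal_payload = str(id_value).encode("ascii")

# Option 2: raw 4-byte two's-complement form (big-endian is an assumption).
binary_payload = struct.pack(">i", id_value)

print(decimal_payload)        # the 6-character decimal string
print(binary_payload.hex())   # the 4 raw bytes, shown as hex

# The binary form has to be unpacked by the client after reading it back.
(restored,) = struct.unpack(">i", binary_payload)
```

The binary form is shorter on the wire, but Redis sees it as an opaque string, which is what the answer below turns on.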

Answer

Internally, Redis stores strings in the most efficient manner it can. Forcing integers into 4-byte binary strings will actually use more memory, because Redis can no longer recognize them as integers.

Here is how Redis stores strings:

  1. Non-negative integers less than 10000 are stored in a shared memory pool and have no memory overhead. If you wish, you can raise this limit by changing the constant REDIS_SHARED_INTEGERS in redis.h and recompiling Redis.
  2. Integers of 10000 or more that fit in a long consume 8 bytes.
  3. Regular strings take len(string) + 4 bytes for the length + 4 bytes for marking free space + 1 byte for the null terminator + 8 bytes for malloc overhead.

In the example you quoted, it's a question of 8 bytes for a long versus 21 bytes for the string (4 bytes of data plus 17 bytes of overhead).
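
The three rules and the 8-versus-21 comparison can be sketched as a small estimator. This is only a model of the rules as stated in this answer, not a measurement of any particular Redis version. The function name and the constant SHARED_INTEGERS are hypothetical, and the check for "looks like an integer" is a simplifying assumption (a canonical decimal string that fits in 64 bits).

```python
import struct

SHARED_INTEGERS = 10000  # assumption: mirrors REDIS_SHARED_INTEGERS described above


def estimated_value_bytes(value: bytes) -> int:
    """Rough per-value cost following the three rules in this answer."""
    try:
        text = value.decode("ascii")
        number = int(text)
        # Only canonical decimal strings in the signed 64-bit range count as integers.
        is_integer = str(number) == text and -2**63 <= number < 2**63
    except (UnicodeDecodeError, ValueError):
        is_integer = False

    if is_integer:
        # Rule 1: small non-negative integers come from the shared pool.
        # Rule 2: any other integer is held as an 8-byte long.
        return 0 if 0 <= number < SHARED_INTEGERS else 8

    # Rule 3: data + length field + free-space field + terminator + malloc overhead.
    return len(value) + 4 + 4 + 1 + 8


print(estimated_value_bytes(b"102450"))                  # decimal string
print(estimated_value_bytes(struct.pack(">i", 102450)))  # 4-byte binary string
```

Under these assumptions the decimal string is the cheaper of the two, even though it is longer as text.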

EDIT:

So if I have a set of numbers, all less than 10,000, how does Redis store my set?

It depends on how many elements you have.

If your set has no more than 512 elements (the default; see set-max-intset-entries), it will be stored as an IntSet. An IntSet is a glorified name for a sorted integer array. Since your numbers are less than 10000, it would use 16 bits per element. It is (almost) as memory efficient as a C array.
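
As a rough illustration of the IntSet sizing, the sketch below picks the narrowest element width that fits every member and multiplies by the member count. The 16-bit case comes from the paragraph above. The 32-bit and 64-bit widths are my assumption about how an IntSet widens for larger values, and the function names are hypothetical. The small fixed header of the structure is ignored.

```python
def intset_element_width(values) -> int:
    """Bytes per element: the narrowest of 2, 4 or 8 that fits every value."""
    width = 2
    for v in values:
        if -2**15 <= v < 2**15:
            needed = 2
        elif -2**31 <= v < 2**31:
            needed = 4  # assumption: next width up after 16 bits
        elif -2**63 <= v < 2**63:
            needed = 8  # assumption: widest supported element
        else:
            raise ValueError("value does not fit in a signed 64-bit integer")
        width = max(width, needed)
    return width


def intset_payload_bytes(values) -> int:
    """Approximate array size for a set of integers, excluding the header."""
    unique = set(values)
    return len(unique) * intset_element_width(unique)


print(intset_payload_bytes(range(500)))     # 500 IDs, all below 10000
print(intset_payload_bytes([1, 2, 102450])) # one large member widens every slot
```

The second call shows the trade-off in this model: a single value outside the 16-bit range makes every element in the set wider.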

If you have more than 512 elements, the set becomes a hash table. Each element in the set is wrapped in a structure called robj, which has an overhead of 16 bytes. The robj structure holds a pointer into the shared pool of integers, so you don't pay anything extra for the integer itself. Finally, the robj instances are stored in the hash table, which has an overhead proportional to the size of the set.

If you are interested in exactly how much memory an element consumes, run redis-rdb-tools on your dataset (disclaimer: I am the author of this tool). Or you can read the source code of the MemoryCallback class; its comments explain how the memory is laid out.
