What are the underlying data structures used for Redis?


Question

I'm trying to answer two questions in a definitive list:

  1. What are the underlying data structures used for Redis?
  2. And what are the main advantages/disadvantages/use cases for each type?

So, I've read that Redis lists are actually implemented with linked lists. But for the other types, I'm not able to dig up any information. Also, if someone stumbles upon this question without a high-level summary of the pros and cons of modifying or accessing the different data structures, they'd have a complete list of when to best use specific types to reference as well.

Specifically, I'm looking to outline all types: string, list, set, zset and hash.

Oh, I've looked at these articles, among others, so far:

Answer

I'll try to answer your question, but I'll start with something that may look strange at first: if you are not interested in Redis internals, you should not care about how data types are implemented internally. This is for a simple reason: for every Redis operation you'll find the time complexity in the documentation and, if you have the set of operations and the time complexity, the only other thing you need is some clue about memory usage (and because we do many optimizations that may vary depending on the data, the best way to get these latter figures is to run a few trivial real-world tests).

But since you asked, here is the underlying implementation of every Redis data type.

  • Strings are implemented using a C dynamic string library so that we don't pay (asymptotically speaking) for allocations in append operations. This way we have O(N) appends, for instance, instead of having quadratic behavior.
  • Lists are implemented with linked lists.
  • Sets and Hashes are implemented with hash tables.
  • Sorted sets are implemented with skip lists (a peculiar type of balanced tree).
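To make the point about appends concrete, here is a minimal plain-Python sketch of a buffer that grows geometrically. It is only a model of the idea, not Redis's actual C string library; the class name DynamicString and the doubling factor are assumptions made for illustration.

```python
class DynamicString:
    """Toy growable byte buffer (a model of the idea, not Redis's C code)."""

    def __init__(self) -> None:
        self.buf = bytearray(1)    # allocated storage
        self.length = 0            # bytes actually in use
        self.reallocations = 0     # how many times the storage had to grow

    def append(self, data: bytes) -> None:
        needed = self.length + len(data)
        if needed > len(self.buf):
            # Grow geometrically (assumed factor: 2) so reallocations stay rare.
            new_capacity = max(needed, 2 * len(self.buf))
            new_buf = bytearray(new_capacity)
            new_buf[:self.length] = self.buf[:self.length]
            self.buf = new_buf
            self.reallocations += 1
        # Copy the new bytes into the spare room at the end.
        self.buf[self.length:needed] = data
        self.length = needed

    def value(self) -> bytes:
        return bytes(self.buf[:self.length])


s = DynamicString()
for _ in range(1000):
    s.append(b"x")
```

With doubling, a thousand one-byte appends trigger only a handful of reallocations, so the total copying work stays linear in the final length instead of quadratic.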

But when lists, sets, and sorted sets are small in number of items and size of the largest values, a different, much more compact encoding is used. This encoding differs for different types, but has the feature that it is a compact blob of data that often forces an O(N) scan for every operation. Since we use this format only for small objects this is not an issue; scanning a small O(N) blob is cache oblivious so practically speaking it is very fast, and when there are too many elements the encoding is automatically switched to the native encoding (linked list, hash, and so forth).
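The switch between encodings can be pictured with a small plain-Python model. This is a sketch under stated assumptions, not the real Redis encodings: the class name SmallSet and the threshold of 8 are hypothetical, and a Python list stands in for the compact blob.

```python
class SmallSet:
    """Toy set that starts as a flat list and converts to a hash table."""

    COMPACT_LIMIT = 8  # hypothetical threshold, chosen only for this demo

    def __init__(self) -> None:
        self._compact = []   # flat storage, scanned linearly
        self._table = None   # hash-based storage, used after conversion

    @property
    def encoding(self) -> str:
        return "compact" if self._table is None else "hashtable"

    def add(self, item) -> None:
        if self._table is not None:
            self._table.add(item)
            return
        if item in self._compact:   # O(N) scan, cheap while N is small
            return
        self._compact.append(item)
        if len(self._compact) > self.COMPACT_LIMIT:
            # Too many elements: switch to the hash-based encoding for good.
            self._table = set(self._compact)
            self._compact = []

    def __contains__(self, item) -> bool:
        if self._table is not None:
            return item in self._table
        return item in self._compact

    def __len__(self) -> int:
        if self._table is not None:
            return len(self._table)
        return len(self._compact)


tags = SmallSet()
for i in range(SmallSet.COMPACT_LIMIT):
    tags.add(f"tag:{i}")
encoding_when_small = tags.encoding
tags.add("tag:one-more")
encoding_when_large = tags.encoding
```

The caller never sees the conversion; only the memory layout and the cost of each operation change.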

But your question was not really just about internals; your point was: what type should I use to accomplish what?

Strings

This is the base type of all the types. It's one of the five types covered here, but it is also the base type of the complex types, because a List is a list of strings, a Set is a set of strings, and so forth.

A Redis string is a good idea in all the obvious scenarios where you want to store an HTML page, but also when you want to avoid converting your already encoded data. So for instance, if you have JSON or MessagePack you may just store objects as strings. In Redis 2.6 you can even manipulate this kind of object server side using Lua scripts.

Another interesting usage of strings is bitmaps, and in general random access arrays of bytes, since Redis exports commands to access random ranges of bytes, or even single bits. For instance check this good blog post: Fast Easy real time metrics using Redis.
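As a rough illustration of the bitmap idea, the sketch below models a string value as a Python bytearray and mimics the behavior of the SETBIT and GETBIT commands, where bit 0 is the most significant bit of the first byte. The helper names and the "one bit per user ID" scenario are assumptions for the example; against a real server you would issue the commands instead.

```python
def setbit(buf: bytearray, offset: int, value: int) -> int:
    """Set one bit, growing the buffer with zero bytes; return the old bit."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    mask = 1 << (7 - bit_index)   # bit 0 is the most significant bit
    previous = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return previous


def getbit(buf: bytearray, offset: int) -> int:
    """Read one bit; offsets past the end read as 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0
    return (buf[byte_index] >> (7 - bit_index)) & 1


def bitcount(buf: bytearray) -> int:
    """Count the bits that are set."""
    return sum(bin(byte).count("1") for byte in buf)


# Hypothetical use: one bit per user ID, set when the user was seen today.
seen_today = bytearray()
for user_id in (3, 10, 3):
    setbit(seen_today, user_id, 1)
unique_users = bitcount(seen_today)
```

Marking a user twice costs nothing extra, and counting unique users is a single pass over a very small value.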

Lists

Lists are good when you are likely to touch only the extremes of the list: near the tail or near the head. Lists are not very good for paginating, because random access is slow, O(N). So good uses of lists are plain queues and stacks, or processing items in a loop using RPOPLPUSH with the same source and destination to "rotate" a ring of items.
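The "ring" rotation can be modeled in a few lines of plain Python, with a deque standing in for the Redis list. The function name and the feed keys are made up for the example; the only behavior assumed is what RPOPLPUSH does when source and destination are the same list: take the tail element and put it back at the head.

```python
from collections import deque


def rpoplpush_same_list(ring: deque):
    """Model of RPOPLPUSH with identical source and destination."""
    if not ring:
        return None
    item = ring.pop()        # take the element at the tail (RPOP)
    ring.appendleft(item)    # put it back at the head (LPUSH)
    return item


# Hypothetical ring of feeds to poll in turn, forever.
feeds = deque(["feed:a", "feed:b", "feed:c"])
visited = [rpoplpush_same_list(feeds) for _ in range(4)]
```

Every call hands back the next item to process while keeping it in the ring, so a worker can loop indefinitely without ever touching the middle of the list.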

Lists are also good when we just want to create a capped collection of N items where we usually access only the top or bottom items, or when N is small.
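A capped collection is usually built by pushing at the head and trimming the tail right after (the LPUSH plus LTRIM pattern). The sketch below models that with a plain Python list; the function name, the event names, and the cap of 5 are assumptions for illustration. Note that inserting at index 0 of a Python list is itself O(N), which is fine for a small demo but is not how a linked list behaves.

```python
def push_capped(items: list, item, cap: int) -> None:
    """Keep only the newest `cap` items, newest first."""
    items.insert(0, item)   # like LPUSH: new item goes to the head
    del items[cap:]         # like LTRIM 0 cap-1: drop everything past the cap


latest_events = []
for i in range(1, 8):
    push_capped(latest_events, f"event:{i}", cap=5)
```

After seven pushes only the five most recent events remain, and reading the newest ones never requires walking the whole history.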

Sets

Sets are an unordered data collection, so they are good every time you have a collection of items and it is very important to check for existence or size of the collection in a very fast way. Another cool thing about sets is support for peeking or popping random elements (SRANDMEMBER and SPOP commands).

Sets are also good to represent relations, e.g., "What are friends of user X?" and so forth. But other good data structures for this kind of stuff are sorted sets as we'll see.

Sets support complex operations like intersections, unions, and so forth, so this is a good data structure for using Redis in a "computational" manner, when you have data and you want to perform transformations on that data to obtain some output.
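Python's built-in sets behave closely enough to illustrate this "computational" style. In the sketch below the user names are invented, and the comments name the Redis commands (SINTER, SUNION, SDIFF, SISMEMBER, SCARD) that play the same role on the server.

```python
# Hypothetical friend lists, one set per user.
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol", "erin"},
}

mutual = friends["alice"] & friends["bob"]       # intersection, like SINTER
either = friends["alice"] | friends["bob"]       # union, like SUNION
only_alice = friends["alice"] - friends["bob"]   # difference, like SDIFF

is_friend = "carol" in friends["alice"]          # membership, like SISMEMBER
how_many = len(friends["alice"])                 # cardinality, like SCARD
```

The inputs stay untouched and each operation produces a new result, which is what makes sets convenient for deriving one dataset from others.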

Small sets are encoded in a very efficient way.

Hashes

Hashes are the perfect data structure to represent objects, composed of fields and values. Fields of hashes can also be atomically incremented using HINCRBY. When you have objects such as users, blog posts, or some other kind of item, hashes are likely the way to go if you don't want to use your own encoding like JSON or similar.

However, keep in mind that small hashes are encoded very efficiently by Redis, and you can ask Redis to atomically GET, SET or increment individual fields in a very fast fashion.
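As a small model of an object stored as a hash, the sketch below uses a Python dict for the fields and a helper that mimics what HINCRBY does: a missing field counts as 0, and the new value is returned. The key name, field names, and helper name are assumptions for the example, and the atomicity that the server provides is not modeled here.

```python
def hincrby(obj: dict, field: str, amount: int) -> int:
    """Model of HINCRBY: add to an integer field, treating a missing field as 0."""
    obj[field] = int(obj.get(field, 0)) + amount
    return obj[field]


# Hypothetical blog post stored as one hash: field -> value.
post = {"title": "Hello Redis", "author": "user:42"}

views_after_first = hincrby(post, "views", 1)    # field created on first use
views_after_batch = hincrby(post, "views", 10)   # later increments add up
```

Only the counter field is touched; the rest of the object does not need to be read, decoded, and written back as it would with a single JSON string.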

Hashes can also be used to represent linked data structures, using references. For instance check the lamernews.com implementation of comments.
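One way to picture "linked data structures using references" is a set of hashes where a field holds the key of another hash. The sketch below is a hypothetical layout invented for illustration and is not taken from the Lamer News source; the key names and the "parent" field are assumptions.

```python
# Hypothetical comment hashes; "parent" holds the key of the parent comment,
# or an empty string for a top-level comment.
comments = {
    "comment:1": {"body": "First!", "parent": ""},
    "comment:2": {"body": "A reply", "parent": "comment:1"},
    "comment:3": {"body": "A nested reply", "parent": "comment:2"},
}


def thread_path(store: dict, key: str) -> list:
    """Follow parent references up to the root and return the path root-first."""
    path = []
    while key:
        path.append(key)
        key = store[key]["parent"]
    return list(reversed(path))


path_to_nested = thread_path(comments, "comment:3")
```

Each hop is one lookup by key, so walking a reply chain costs one field read per level.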

Sorted Sets

Sorted sets are the only other data structure, besides lists, that maintains ordered elements. You can do a number of cool things with sorted sets. For instance, you can have all kinds of Top Something lists in your web application: top users by score, top posts by pageviews, top whatever. A single Redis instance will support tons of insertion and get-top-elements operations per second.
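A "Top Something" list can be modeled as follows. This is a plain-Python sketch, not a skip list: it sorts on every read, which is O(N log N), whereas the point of a sorted set is that the order is maintained on insertion. The class and method names are hypothetical; the comments name the commands (ZINCRBY, ZREVRANGE) that play the same role in Redis.

```python
class Leaderboard:
    """Toy model of a sorted set used as a score board."""

    def __init__(self) -> None:
        self._scores = {}   # member -> score

    def incr(self, member: str, amount: int) -> int:
        # Like ZINCRBY: a missing member starts from 0.
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def top(self, n: int) -> list:
        # Like ZREVRANGE 0 n-1 WITHSCORES: highest score first.
        ranked = sorted(
            self._scores.items(),
            key=lambda pair: (pair[1], pair[0]),
            reverse=True,
        )
        return ranked[:n]


board = Leaderboard()
board.incr("alice", 50)
board.incr("bob", 70)
board.incr("alice", 30)
board.incr("carol", 10)
top_two = board.top(2)
```

Scores can be bumped at any time and the ranking is always available without a separate re-sorting job.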

Sorted sets, like regular sets, can be used to describe relations, but they also allow you to paginate the list of items and to remember the ordering. For instance, if I remember friends of user X with a sorted set I can easily remember them in order of accepted friendship.
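The friendship example can be sketched by using the acceptance time as the score, so that rank order is chronological order and a page is just a range of ranks. The code below is a plain-Python model with invented names and timestamps; a sorted Python list kept in order with bisect stands in for the sorted set.

```python
import bisect

# (accepted_at, friend) pairs, always kept sorted by acceptance time.
friends_of_x = []


def add_friend(accepted_at: int, name: str) -> None:
    """Insert in score order, as adding to a sorted set with a timestamp score would."""
    bisect.insort(friends_of_x, (accepted_at, name))


def page(number: int, size: int) -> list:
    """Return one page of friends by rank (pages are numbered from 0)."""
    start = number * size
    return [name for _, name in friends_of_x[start:start + size]]


# Insertion order does not matter; the score decides the position.
add_friend(300, "carol")
add_friend(100, "alice")
add_friend(500, "erin")
add_friend(200, "bob")
add_friend(400, "dave")

first_page = page(0, 2)
second_page = page(1, 2)
last_page = page(2, 2)
```

Unlike a list, a page from the middle does not require walking from one end, which is why sorted sets suit pagination.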

Sorted sets are good for priority queues.
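For the priority queue case the score is the priority, and the consumer always takes the element with the lowest score. The sketch below models that behavior with Python's heapq module, which is a different structure from a skip list but gives the same "lowest first" order; the job names and function names are made up for the example.

```python
import heapq

# Heap of (priority, job) pairs; the lowest priority value is served first.
jobs = []


def enqueue(priority: int, job: str) -> None:
    """Add a job with a numeric priority (the sorted-set score)."""
    heapq.heappush(jobs, (priority, job))


def dequeue():
    """Remove and return the job with the lowest priority value, or None."""
    if not jobs:
        return None
    _, job = heapq.heappop(jobs)
    return job


enqueue(5, "send-email")
enqueue(1, "charge-card")
enqueue(3, "resize-image")

order = [dequeue() for _ in range(4)]
```

Jobs come out by priority regardless of the order in which they were added, and an empty queue simply yields nothing.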

Sorted sets are like more powerful lists where inserting, removing, or getting ranges from the middle of the list is always fast. But they use more memory, and are O(log(N)) data structures.

Conclusion

I hope that I provided some info in this post, but it is far better to download the source code of lamernews from http://github.com/antirez/lamernews and understand how it works. Many data structures from Redis are used inside Lamer News, and there are many clues about what to use to solve a given task.

Sorry for grammar typos, it's midnight here and too tired to review the post ;)
