Benchmarking retrieval from redis vs memory in python (using timeit)

Problem description

I have a list of numbers. This list is stored in two ways: either as an in-memory python object, or as a redis list (redis set up in the same server).

I'm comparing the time it takes to retrieve these two lists, using python's timeit. Here's what I do in the python shell:

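import timeit
import redis

# Connect to the local Redis server through a shared connection pool.
POOL = redis.ConnectionPool(host='127.0.0.1', port=6379, db=0)
my_server = redis.Redis(connection_pool=POOL)

# Best of 7 runs; each run fetches the whole list 1000 times.
print(min(timeit.Timer('pylist1 = my_server.lrange("nums:5", 0, -1)', setup='from __main__ import my_server').repeat(7, 1000)))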

This gives me a time of 1.92341279984.

Next, I time the in-memory python object like so:

# Fetch the list once, then time rebinding it to a new name.
pylist = my_server.lrange("nums:5", 0, -1)
print(min(timeit.Timer('pylist2 = pylist', setup='from __main__ import pylist').repeat(7, 1000)))

This gives me a time of 4.29153442383e-05. I.e. it seems to be ~45K times faster than retrieving the same list from redis.

My question is this: is my comparison approach correct? I.e., am I accurately simulating retrieval from redis vs retrieval from memory? This is a huge performance boost for the use case I have in mind, but before I implement this, just want to be sure I didn't fudge the benchmarking.

Recommended answer

In the comparison you've put up here, you're basically just measuring how long Python takes to bind a new name to a value in the second case. So it doesn't surprise me that this is vastly faster than communicating with a different process (Redis). I guess what surprises me is that you would consider getting a value from Redis if the option exists simply to keep it in memory.
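
To make the name-binding point concrete, here is a minimal sketch that needs no Redis server. The list contents are an assumed stand-in for the real data, and the `globals=` argument assumes Python 3.5 or later. It shows that `pylist2 = pylist` copies nothing, and times it next to a shallow copy for contrast.

```python
import timeit

# Assumed stand-in for the values that lrange would return.
pylist = [str(n).encode() for n in range(1000)]

# Name binding: both names refer to the very same list object.
pylist2 = pylist
same_object = pylist2 is pylist

# A shallow copy builds a new list and touches every element.
pylist3 = list(pylist)
is_copy = pylist3 is not pylist and pylist3 == pylist

# Best of 7 runs of 1000 executions each, as in the question.
bind_time = min(timeit.repeat('pylist2 = pylist', globals=globals(),
                              repeat=7, number=1000))
copy_time = min(timeit.repeat('pylist3 = list(pylist)', globals=globals(),
                              repeat=7, number=1000))

print(same_object, is_copy)
print('bind: %.3g s, copy: %.3g s' % (bind_time, copy_time))
```

The exact timings depend on the machine, but the binding statement does no work proportional to the list size, so it is not a measure of "retrieval" at all.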

So, you need to be more clear about why you are using Redis for this in the first place. It will always be slower than in-process memory; no benchmark is needed for that. You need to ask "why am I not just using Python lists and dictionaries?" There are several valid answers: your data is too large to fit into memory, you require cache-specific features like allowing values to disappear after a while, or you want to use it for IPC or persistence. Once you know the answer here, that will inform the benchmarking you want to do. And the question will be more like "How do I obtain the benefits/features I have listed above for the least performance penalty?" Redis may not be the only answer. You may consider shelve for persistence, or perhaps even a full-on relational database or Mongo or whatever.
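
If persistence turns out to be the reason, the standard library's `shelve` module is one option to weigh against Redis. The following is only a rough sketch: the file location and the sample list are made up for illustration, and the key simply reuses the name from the question.

```python
import os
import shelve
import tempfile

# Sample data standing in for the real list of numbers (assumption).
nums = list(range(10))

# Hypothetical location for the shelf file.
path = os.path.join(tempfile.mkdtemp(), 'nums_shelf')

# Write the list to disk under a key.
with shelve.open(path) as db:
    db['nums:5'] = nums

# Reopen the shelf later (for example, after a restart) and read it back.
with shelve.open(path) as db:
    restored = db['nums:5']

print(restored == nums)
```

Whether this beats Redis for a given workload is exactly the kind of thing worth benchmarking once the requirement is clear.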

In short, once you have a good idea of why, the how often solves itself.
