Is Python DBM really fast?


Question

I was thinking that Python's native DBM should be quite a bit faster than NoSQL databases such as Tokyo Cabinet, MongoDB, etc., since Python DBM has fewer features and options (i.e. it is a simpler system). I tested it with a very simple write/read example:

#!/usr/bin/python
import time
t = time.time()
import anydbm

count = 0
while count < 1000:
    # write a single value, close, then re-open the same file to read it back
    db = anydbm.open("dbm2", "c")
    db["1"] = "something"
    db.close()
    db = anydbm.open("dbm2", "r")
    print "dict['Name']: ", db['1']
    print "%.3f" % (time.time() - t)
    db.close()
    count = count + 1

Read/Write: 1.3s, Read: 0.3s, Write: 1.0s

The corresponding figures for MongoDB are at least 5 times faster. Is this really the performance of Python DBM?

Answer

Python doesn't have a built-in DBM implementation. It bases its DBM support on a range of DBM-style third-party libraries, such as Berkeley DB and GNU DBM, accessed through wrapper modules like anydbm.
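Which backend you actually end up with depends on what is installed on your system, and that alone can change the numbers noticeably. As a rough sketch (assuming Python 2, to match the anydbm code above, and an example file name "dbm2"), you can check which library anydbm picked with the standard whichdb module:

import anydbm
import whichdb

# create (or open) a database so there is a file to inspect
db = anydbm.open("dbm2", "c")
db["probe"] = "value"
db.close()

# whichdb returns the backend module name, e.g. "dbhash", "gdbm" or "dumbdbm"
print "backend:", whichdb.whichdb("dbm2")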

Python's dictionary implementation is really fast for key-value storage, but it is not persistent. If you need high-performance runtime key-value lookups, you may find a dictionary better; you can manage persistence with something like cPickle or shelve. If startup times (and, if you're modifying the data, shutdown times) matter to you more than runtime access speed, then something like DBM would be better.
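For example, a minimal sketch of that dict-plus-persistence approach, assuming Python 2 and an example file name "cache.shelf", could look like the following (shelve is used here; cPickle would work similarly for dumping the whole dict at once):

import shelve

# fast runtime lookups happen against an ordinary in-memory dict
data = {"1": "something", "2": "something else"}

# persist the dict at shutdown (or whenever convenient)
shelf = shelve.open("cache.shelf")
for key, value in data.items():
    shelf[key] = value
shelf.close()

# reload it into a plain dict at the next startup
shelf = shelve.open("cache.shelf", "r")
restored = {}
for key in shelf.keys():
    restored[key] = shelf[key]
shelf.close()

print restored["1"]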

In your evaluation, you have included both the dbm open calls and the key lookup inside the main loop. Opening a DBM to store one value, then closing and re-opening it before looking the value up, is a pretty unrealistic use case, and you're seeing the typically slow performance one gets when a persistent data store is managed in such a manner (it's quite inefficient).

Depending on your requirements, if you need fast lookups and don't care too much about startup times, DBM might be a solution - but to benchmark it, only include the writes and reads in the loop! Something like the below might be suitable:

import anydbm
from random import random
import time

# open DBM outside of the timed loops
db = anydbm.open("dbm2", "c")

max_records = 100000

# only time read and write operations
t = time.time()

# create some records
for i in range(max_records):
  db[str(i)] = 'x'

# do some random reads
for i in range(max_records):
  x = db[str(int(random() * max_records))]

time_taken = time.time() - t
print "Took %0.3f seconds, %0.5f microseconds / record" % (time_taken, (time_taken * 1000000) / max_records)

db.close()
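For reference, on Python 3 the anydbm module was folded into the standard-library dbm package, and the same benchmark could be sketched roughly like this (assumptions: Python 3 and an example file name "dbm3"; note that values are returned as bytes):

import dbm
import time
from random import random

# open once, outside the timed section, as above
db = dbm.open("dbm3", "c")

max_records = 100000

t = time.time()

# create some records
for i in range(max_records):
    db[str(i)] = 'x'

# do some random reads
for i in range(max_records):
    x = db[str(int(random() * max_records))]

time_taken = time.time() - t
print("Took %0.3f seconds, %0.5f microseconds / record"
      % (time_taken, (time_taken * 1000000) / max_records))

db.close()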
