Performance of Long IDs

Question

I've been wondering about this for some time. In CouchDB we have some fairly long IDs, e.g.:

"000ab56cb24aef9b817ac98d55695c6a"

Now suppose we're searching for this item and walking down the tree structure created by the view. It seems that a simple integer ID would be much faster. If we used 64-bit integers, each comparison would be a simple CMP followed by a JMP (assuming the Erlang code was JIT-compiled, but you get my point).

For strings, I assume we generate a hash off the ID or something, but at some point we have to do a character compare on all 32 characters. Won't that affect performance?

Recommended answer

The short answer is yes, of course it will affect performance, because the key length directly impacts the time it takes to walk down the tree.
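To make the effect of key length concrete, here is a minimal Python sketch that counts how many bytes get examined during a lookup. It uses a binary search over a sorted list as a stand-in for the tree walk; it is not CouchDB's B-tree code, and the function names and generated keys are assumptions made up for the illustration.

```python
def compare_counting(a: bytes, b: bytes) -> tuple[int, int]:
    """Compare two keys bytewise; return (ordering, bytes_examined).

    ordering is -1, 0 or 1, like a classic three-way compare.
    """
    examined = 0
    for x, y in zip(a, b):
        examined += 1
        if x != y:
            return (-1 if x < y else 1), examined
    # No difference in the shared prefix: the shorter key sorts first.
    return (len(a) > len(b)) - (len(a) < len(b)), examined


def lookup_cost(sorted_keys: list[bytes], target: bytes) -> tuple[bool, int]:
    """Binary-search for target; return (found, total bytes examined)."""
    lo, hi, total = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        order, examined = compare_counting(target, sorted_keys[mid])
        total += examined
        if order == 0:
            return True, total
        if order < 0:
            hi = mid
        else:
            lo = mid + 1
    return False, total


# Hypothetical data: 1024 long IDs sharing a 28-character prefix,
# and the same 1024 items keyed by a 4-character ID instead.
long_keys = [("000ab56cb24aef9b817ac98d5569%04x" % i).encode() for i in range(1024)]
short_keys = [("%04x" % i).encode() for i in range(1024)]

found_long, cost_long = lookup_cost(long_keys, long_keys[700])
found_short, cost_short = lookup_cost(short_keys, short_keys[700])
print(cost_long, cost_short)
```

Both lookups perform the same number of key comparisons; the long keys just make each comparison examine more bytes, especially when IDs share a long common prefix.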

It also affects storage: longer keys take more space, and space takes time.

However, the nuance you are missing is that while Couch CAN (and does) allocate new IDs for you, it is not required to. It will be more than happy to accept your own IDs rather than generate its own. So, if the key length bothers you, you are free to use shorter keys.
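As a rough illustration of supplying your own ID: CouchDB's HTTP API creates a document under a caller-chosen ID when you PUT to /{db}/{docid}. The sketch below only builds such a request with Python's standard library and does not send it. The database name "mydb", the short ID "u42", and the document body are hypothetical, and authentication is left out.

```python
import json
import urllib.request
from urllib.parse import quote


def build_put_request(base_url: str, db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build a PUT /{db}/{docid} request that stores doc under a caller-chosen ID."""
    url = f"{base_url}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local server on CouchDB's default port; names are placeholders.
req = build_put_request("http://localhost:5984", "mydb", "u42", {"name": "example"})
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen would actually send it; that call is
# omitted here so the sketch runs without a server.
```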

However, given the JSON nature of Couch, it's pretty much a text-based database. There isn't a lot of binary data stored in a normal Couch instance (attachments notwithstanding, but even those I think are stored in Base64, though I may be wrong).

So while, yes, a 64-bit integer would be the most efficient, the simple fact is that Couch is designed to work with any key, and "any key" is most readily expressed in text.
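If you do want integer-like IDs while staying with text keys, one possible approach (an assumption of this sketch, not something the answer prescribes) is to render a 64-bit integer as a fixed-width, zero-padded hex string. That gives a 16-character key, half the length of the 32-character ID in the question, and in Python's string ordering the keys sort the same way as the numbers. The helper names are hypothetical.

```python
def int_to_key(n: int) -> str:
    """Render an unsigned 64-bit integer as a 16-character lowercase hex key."""
    if not 0 <= n < 2**64:
        raise ValueError("expected an unsigned 64-bit integer")
    return format(n, "016x")


def key_to_int(key: str) -> int:
    """Recover the integer from a key produced by int_to_key."""
    return int(key, 16)


print(int_to_key(255))  # 00000000000000ff
```

The fixed width is what keeps text order and numeric order in agreement; without the zero padding, "10" would sort before "9".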

Finally, truth be told, the cost of the key compare is dwarfed by disk I/O fetch times and by the JSON marshaling of data (especially on writes). Any real gain achieved by converting to integer keys would likely have no real-world impact on overall performance.

If you really want to speed up the Couch key system, code the key routine to block the key into 64-bit longs and compare those (like you said). 8 bytes of text occupy the same space as a 64-bit long int. That would give you, in theory, up to an 8x performance boost on key compares. Whether Erlang can generate such code, I can't say.
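The blocking idea can be sketched in Python as follows. This only demonstrates that comparing 8-byte blocks as big-endian 64-bit integers yields the same ordering as a byte-by-byte comparison; it says nothing about how CouchDB or Erlang compare keys internally, and Python itself will not show the theoretical speedup. The function names are made up for the illustration.

```python
import struct


def to_words(key: bytes) -> tuple[int, ...]:
    """Split a key into big-endian unsigned 64-bit words, zero-padding the tail."""
    padded = key + b"\x00" * (-len(key) % 8)
    return struct.unpack(f">{len(padded) // 8}Q", padded)


def compare_blocked(a: bytes, b: bytes) -> int:
    """Three-way compare that inspects 8 bytes per step instead of 1."""
    for x, y in zip(to_words(a), to_words(b)):
        if x != y:
            # Big-endian packing makes integer order match byte order.
            return -1 if x < y else 1
    # All shared words are equal, so one key is a prefix of the other
    # (possibly followed by zero bytes): the shorter key sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))


doc_id = b"000ab56cb24aef9b817ac98d55695c6a"
print(len(to_words(doc_id)))  # 4 words instead of 32 single-byte compares
```

Big-endian packing matters here: with little-endian words, the first byte of each block would become the least significant part of the integer and the ordering would no longer match a plain string compare.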
