Which index should I use on a binary datatype column in MySQL?


Question

I am writing a simple tool to find duplicate files (i.e. files with identical contents). The mechanism is to generate a hash for each file using the SHA-512 algorithm and store these hashes in a MySQL database. I store the hashes in a binary(64) unique not null column. Each row has a unique binary hash, which is used to check whether a file is a duplicate.
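The mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual tool: the helper names (`sha512_of_file`, `open_store`, `register_file`) and the table name `file_hashes` are made up for the example, and an in-memory SQLite database stands in for MySQL so that the sketch runs without a server. The idea it shows is the same one the question relies on: a unique constraint on the 64-byte digest column rejects a second file with the same contents.

```python
import hashlib
import sqlite3


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the 64-byte SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()


def open_store():
    """Create an in-memory table (SQLite stand-in for MySQL) with a UNIQUE hash column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_hashes (hash BLOB NOT NULL UNIQUE, path TEXT NOT NULL)"
    )
    return conn


def register_file(conn, path):
    """Insert the file's hash; return False if an identical file is already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO file_hashes (hash, path) VALUES (?, ?)",
                (sha512_of_file(path), path),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired: this digest is already in the table.
        return False
```

Letting the unique constraint report the duplicate means that one insert both performs the check and records the file, instead of a separate lookup followed by an insert.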

My questions are:

  1. Can I use indexes on a binary column? My table uses the default collation, latin1.

  2. Which indexing mechanism should I use for high performance, B-Tree or Hash? I need to update or add roughly 100 rows per second.

  3. What else should I take care of to get the best performance?

Recommended answer

Can I use indexes on a binary column? My table uses the default collation, latin1.

Yes, you can. Collation is only relevant for character datatypes, not binary datatypes: it defines how characters should be ordered. Also, be aware that latin1 is a character encoding, not a collation.

Which indexing mechanism should I use for high performance, B-Tree or Hash? I need to update or add roughly 100 rows per second.

Note that hash indexes are only available with the MEMORY and NDB storage engines, so you may not even have a choice.

In any event, either would typically be able to meet your performance criteria. For this particular application, though, I see no benefit from using B-Tree (which is ordered), whereas Hash would give better performance. Therefore, if you have the choice, you may as well use Hash.
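To make the ordered versus unordered distinction concrete, here is a small analogy in Python. It does not model MySQL's index implementations: a `set` stands in for a hash index and a sorted list searched with `bisect` stands in for an ordered (B-Tree-like) index, and all function names are invented for the illustration. Both answer "is this exact digest present?", which is all a duplicate check needs; only the ordered structure can also return a range of keys without scanning everything.

```python
import bisect
import hashlib

# Sample 64-byte SHA-512 digests to index.
digests = [hashlib.sha512(str(i).encode()).digest() for i in range(1000)]

hash_index = set(digests)        # hash-style: equality lookups only
ordered_index = sorted(digests)  # ordered: equality lookups plus range scans


def in_hash_index(digest):
    """Equality lookup via hashing."""
    return digest in hash_index


def in_ordered_index(digest):
    """Equality lookup via binary search over the sorted keys."""
    pos = bisect.bisect_left(ordered_index, digest)
    return pos < len(ordered_index) and ordered_index[pos] == digest


def range_scan(low, high):
    """All digests d with low <= d < high; relies on the keys being sorted."""
    start = bisect.bisect_left(ordered_index, low)
    stop = bisect.bisect_left(ordered_index, high)
    return ordered_index[start:stop]
```

Since a SHA-512 digest has no meaningful order, a range scan over digests is not something a duplicate checker would ask for, which is the sense in which ordering brings no benefit here.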

See Comparison of B-Tree and Hash Indexes for more information.

What else should I take care of to get the best performance?

That depends on your definition of "best performance" and on your environment. In general, remember Knuth's maxim "premature optimisation is the root of all evil": that is, only optimise when you know that there will be a problem with the simplest approach.
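One way to find out whether the simplest approach is a problem is to measure it before changing anything. The sketch below only shows the shape of such a measurement. It uses an in-memory SQLite table as a stand-in, so the figure it returns says nothing about a MySQL server; the function name `measure_insert_rate` and the table name are hypothetical, and random bytes substitute for real digests.

```python
import os
import sqlite3
import time


def measure_insert_rate(row_count=1000):
    """Insert random 64-byte values into a UNIQUE column.

    Returns (rows stored, rows per second).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE file_hashes (hash BLOB NOT NULL UNIQUE)")
    # Random 64-byte values stand in for SHA-512 digests.
    rows = [(os.urandom(64),) for _ in range(row_count)]

    start = time.perf_counter()
    with conn:  # a single transaction for the whole batch
        conn.executemany("INSERT INTO file_hashes (hash) VALUES (?)", rows)
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM file_hashes").fetchone()[0]
    rate = stored / elapsed if elapsed > 0 else float("inf")
    return stored, rate
```

Running the same kind of measurement against the real database, with the real schema and a realistic batch size, shows whether the required insert rate is already met before any tuning is attempted.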
