Django Model Choices: IntegerField vs CharField


Question

TL;DR: I have a table with millions of rows and I'm wondering how I should index it.

I have a Django project that uses SQL Server as the database backend.

After one of my models grew to around 14 million instances in the production environment, I realized that I was running into performance issues:

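from django.db import models


class UserEvent(models.Model):

    A_EVENT = 'A'
    B_EVENT = 'B'

    types = (
        (A_EVENT, 'Event A'),
        (B_EVENT, 'Event B'),
    )

    # Not indexed: db_index defaults to False for a CharField.
    event_type = models.CharField(max_length=1, choices=types)

    # Contract is another model, defined elsewhere in the project.
    # Django creates an index for ForeignKey columns by default.
    contract = models.ForeignKey(Contract, on_delete=models.CASCADE)

    # field_x = (...)
    # field_y = (...)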

Many of my queries filter on this field, and they are highly inefficient because the field isn't indexed. Filtering the model by this field alone takes almost 7 seconds, while querying by an indexed foreign key causes no performance problems:

UserEvent.objects.filter(event_type=UserEvent.B_EVENT).count()
# elapsed time: 0:00:06.921287

UserEvent.objects.filter(contract_id=62).count()
# elapsed time: 0:00:00.344261
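
The gap between the two timings is consistent with a full table scan versus an index lookup. The following is a minimal sketch of that difference, not a reproduction of the setup above: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table layout, row counts, and the index name idx_event_type are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_event ("
    "id INTEGER PRIMARY KEY, event_type TEXT NOT NULL, contract_id INTEGER NOT NULL)"
)
# Hypothetical data: 1% 'B' rows and 99% 'A' rows.
rows = [("B" if i % 100 == 0 else "A", i % 500) for i in range(10_000)]
conn.executemany(
    "INSERT INTO user_event (event_type, contract_id) VALUES (?, ?)", rows
)

QUERY = "SELECT COUNT(*) FROM user_event WHERE event_type = 'B'"


def query_plan(connection, sql):
    """Return SQLite's query plan for `sql` as a single string."""
    plan_rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in plan_rows)


plan_before = query_plan(conn, QUERY)  # no index on event_type yet
conn.execute("CREATE INDEX idx_event_type ON user_event (event_type)")
plan_after = query_plan(conn, QUERY)   # the planner can now use the index
b_count = conn.execute(QUERY).fetchone()[0]

print(plan_before)
print(plan_after)
print(b_count)
```

In Django, the counterpart of the CREATE INDEX statement would be setting db_index=True on event_type and applying a migration. The plan SQL Server chooses may differ from what SQLite reports here.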

When I realized this, I also asked myself: "Shouldn't this field be a SmallIntegerField? I only have a small set of choices, and queries on integer fields are more efficient than queries on text/varchar fields."
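
To make the SmallIntegerField idea concrete, here is a rough sketch of what converting the column would involve: add an integer column and backfill it from the existing character codes. It again uses sqlite3 rather than SQL Server or a Django migration, and the mapping A to 1 and B to 2, as well as the column name event_type_int, are hypothetical. It does not measure whether integer comparisons are actually faster.

```python
import sqlite3

# Hypothetical mapping from the current one-character codes to small integers.
CODE_TO_INT = {"A": 1, "B": 2}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_event (id INTEGER PRIMARY KEY, event_type TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO user_event (event_type) VALUES (?)",
    [("B" if i % 100 == 0 else "A",) for i in range(1_000)],
)

# Add the new integer column, then backfill it from the character codes.
conn.execute("ALTER TABLE user_event ADD COLUMN event_type_int INTEGER")
conn.executemany(
    "UPDATE user_event SET event_type_int = ? WHERE event_type = ?",
    [(number, code) for code, number in CODE_TO_INT.items()],
)

# Check that the backfill preserved the distribution and left no gaps.
counts_by_char = dict(conn.execute(
    "SELECT event_type, COUNT(*) FROM user_event GROUP BY event_type"
))
counts_by_int = dict(conn.execute(
    "SELECT event_type_int, COUNT(*) FROM user_event GROUP BY event_type_int"
))
unmapped = conn.execute(
    "SELECT COUNT(*) FROM user_event WHERE event_type_int IS NULL"
).fetchone()[0]

print(counts_by_char, counts_by_int, unmapped)
```

On a table with millions of rows, a backfill like this is itself a costly operation, which is part of the trade-off between the two field types.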

So, from what I understand, I have two options*:


*I realize that a third option may exist, since indexing a low-cardinality field may not bring a significant improvement. However, since my values have a [1%-99%] distribution (and I'm looking for the 1% part), indexing this field seems to be a valid option.
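
The reasoning in the footnote comes down to selectivity: the fraction of rows a filter on each value returns. The sketch below computes it for a hypothetical sample with the 1%/99% skew described above. The sample is made up for illustration and is not the production data.

```python
from collections import Counter

# Hypothetical sample of event_type values with a 1% / 99% skew.
sample = ["B" if i % 100 == 0 else "A" for i in range(10_000)]

counts = Counter(sample)
total = sum(counts.values())

# Selectivity: fraction of all rows that a filter on each value would match.
selectivity = {value: count / total for value, count in counts.items()}

for value, share in sorted(selectivity.items()):
    print(f"{value}: {share:.2%}")
```

A filter on the rare value touches only a small share of the table, which is the case where an index lookup tends to pay off. A filter on the common value matches almost every row, so an index is unlikely to help much there.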


