Poor performance when trigram similarity and full-text search are combined with Q in Django using Postgres


Question

I'm creating a web application to search for people by their properties, such as education, experience, etc. I can't use full-text search for all the fields, because some of them have to be fuzzy-matched. (E.g., if we search for biotech, it should pick up bio tech, biotech and also bio-tech.) My database has about 200 entries in the Profile model, which is the model that appears in the search results.

Other models, like education and experience, are connected to Profile through foreign keys.

Therefore, I decided to be selective about which method to use on which field. For shorter fields like degree name (in the Education model), I want to use trigram similarity. For fields like education description, I use full-text search.
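To illustrate why trigram similarity handles the biotech example, here is a minimal pure-Python sketch that approximates how PostgreSQL's pg_trgm extension scores two strings. It assumes the documented behavior (lowercase the text, split on non-alphanumeric characters, pad each word with two leading spaces and one trailing space, then compare the sets of three-character substrings). The function names are made up for this illustration, and the real extension may differ in locale-dependent details.

```python
import re


def trigrams(text):
    """Return the set of trigrams for text, roughly the way pg_trgm builds them."""
    grams = set()
    # Lowercase, then treat every run of non-alphanumerics as a word break.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


for candidate in ("biotech", "bio tech", "bio-tech"):
    print(candidate, round(similarity("biotech", candidate), 3))
```

Under these assumptions, "bio tech" and "bio-tech" both score about 0.545 against "biotech", which is above the default pg_trgm similarity threshold of 0.3 that the `trigram_similar` lookup relies on.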

However, since I have to do this across multiple fields, I used simple lookups instead of search vectors.

from django.db.models import Q

# Trigram lookups for short fields, full-text lookups for long descriptions.
Profile.objects.filter(
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(vision_expertise__search=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(educations__description__search=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(experiences__description__search=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(publications__description__search=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term) |
    Q(bio_description__search=search_term)
)

I get the expected results on every search. However, the query is extremely slow, and I can't figure out how to make it faster.

Recommended answer

Without the model code, it's difficult to find the best way to optimize your query.

You can add a GIN or GiST index to speed up the trigram similarity lookups.
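As a rough sketch of what such an index looks like at the SQL level, the helper below only builds the statements as strings; it does not touch a database. The table name `people_profile` is hypothetical (the question does not show the app label, so check `Profile._meta.db_table`), and the helper name and index naming scheme are invented for this example.

```python
def trigram_index_sql(table, columns):
    """Build SQL that enables pg_trgm and adds one GIN trigram index per column."""
    statements = ["CREATE EXTENSION IF NOT EXISTS pg_trgm;"]
    for column in columns:
        statements.append(
            f'CREATE INDEX IF NOT EXISTS "{table}_{column}_trgm" '
            f'ON "{table}" USING gin ("{column}" gin_trgm_ops);'
        )
    return statements


# Hypothetical table name; the real one depends on your app label.
for statement in trigram_index_sql("people_profile", ["first_name", "last_name"]):
    print(statement)
```

In a Django project, the same result is usually declared on the model instead, with `GinIndex(fields=[...], opclasses=["gin_trgm_ops"], name=...)` from `django.contrib.postgres.indexes` in `Meta.indexes`, plus the `TrigramExtension` migration operation to enable pg_trgm.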

You can build an annotation with SearchVector, as below:

from django.contrib.postgres.aggregates import StringAgg
from django.contrib.postgres.search import SearchQuery, SearchVector
from django.db.models import Q

search_vectors = (
    SearchVector('vision_expertise') +
    SearchVector('bio_description') +
    SearchVector(StringAgg('experiences__description', delimiter=' ')) +
    SearchVector(StringAgg('educations__description', delimiter=' ')) +
    SearchVector(StringAgg('publications__description', delimiter=' '))
)

Profile.objects.annotate(
    search=search_vectors
).filter(
    Q(search=SearchQuery(search_term)) |
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term)
)

To learn more about full-text search and trigrams, you can read the article I wrote on the subject:

"Full-text search in Django with PostgreSQL"
