针对文档的SQL Server全文本搜索(多个相关表格和字段) [英] SQL Server Full-Text Search against Document (multiple related tables and field)

查看:563
本文介绍了针对文档的SQL Server全文本搜索(多个相关表格和字段)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含多个表的文档(在这种情况下是Invoice)结构:
$ b $ ul

  • 发票标题(编号(PK),客户名称,客户地址,...)


  • 发票行(发票号(PK),行号(PK),说明,数量,...)


  • 发票抬头注释(发票号(PK),注释号(PK),注释)



  • 当我运行搜索时,我想针对整个文档执行它(作为一个实体,而不是针对单独的字段)(客户名称+客户地址+描述+评论)。



    示例:所有与Bicycle AND Berlin或Munich or Berlin或Fast delivery有关的文件。 b

    你会推荐什么方法来解决这个问题?

    我应该创建一个单独的索引表来存储来自所有字段的连接值喜欢索引(客户名称,客户地址,Des

    文档索引(文档编号(PK),索引)
    在此case我应该如何保持文件索引表最新?

    我试图创建连接值的索引视图,但得到了限制 - 索引视图不能包含子查询或使用其他视图。



    我会很感激所有想法。

    解决方案

    如果您需要对搜索结果进行排名(评分)或排序,则应该创建一个新表,通过ETL流程将所有全文搜索数据(发票标题,行,注释)组合为您的实体分成1列。这似乎是你用你的Document Index表的想法所建议的。

    为什么要将它们组合成一个表?比如果您将全文索引应用于每个现有表格的排名更好。前一种解决方案产生单一等级,而后者将为每个表格产生不同的等级,并且没有准确的方法将多个等级(其基于完全不同的等级)解析为1等级。为了说明不同:

       - 查询1个表
    SELECT RANK,KEY从CONTAINSTABLE(DocumentIndex。*,@ searchString)

    - 查询多个表(这会导致多个等级值无法解析为单个等级)
    SELECT RANK,KEY FROM CONTAINSTABLE(InvoiceHeader。*,@searchString)

    SELECT RANK,KEY从CONTAINSTABLE(InvoiceLines。*,@searchString)

    SELECT RANK,KEY从CONTAINSTABLE(InvoiceHeaderComments。*,@searchString)

    您如何将它们合并到一个表中?您需要某种类型的ETL过程,一个时间表(可能更容易实现,但会导致全文索引与主表不同步)或按需运行(无论何时修改主表)(通过触发器或通过挂接到事件在你的数据层)。


    I have a document (in this case Invoice) structure which contains multiple tables:

    • Invoice Header (No. (PK), Customer Name, Customer Address, ...)

    • Invoice Lines (Invoice No. (PK), Line No. (PK), Description, Qty., ...)

    • Invoice Header Comments (Invoice No. (PK), Comment No. (PK), Comment)

    When I run a search I would like to execute it against whole document (as one entity, not against separate fields (Customer Name + Customer Address + Description + Comment).

    Example: All documents which have something to do with "Bicycle AND Berlin" or "Munich OR Berlin" or "'Fast delivery'"....

    What approach would you recommend to solve this problem?

    Should I create a separate Index table to store concatenated values from all field which I would like to index (Customer Name, Customer address, Description, Comment) - one row per document:

    Document Index (Document No. (PK), Index) In this case how should I keep "Document Index" table up to date?

    I tried to create indexed views which concatenate values, but got to the limitation - indexed view can't contain subselects or use other views.

    I would appreciate all ideas.

    解决方案

    If you need to rank (score) or sort your search results, you should create a new table which, through an ETL process, combines all of the full-text-searchable data (invoice header, lines, comments) for your entity into 1 column. This seems to be what you're suggesting with your "Document Index" table idea.

    Why combine them into 1 table? This approach results in better ranking than if you were to apply full text indexes to each existing table. The former solution produces a single rank whereas the latter will produce a different rank for each table and there is no accurate way to resolve multiple ranks (which are based on completely different scales) into 1 rank. To illustrate the differences:

    -- Querying 1 table
    SELECT RANK, KEY FROM CONTAINSTABLE(DocumentIndex.*, @searchString)
    
    -- Querying multiple tables (this results in multiple rank values which cannot be resolved into a single rank)
    SELECT RANK, KEY FROM CONTAINSTABLE(InvoiceHeader.*, @searchString)
    
    SELECT RANK, KEY FROM CONTAINSTABLE(InvoiceLines.*, @searchString)
    
    SELECT RANK, KEY FROM CONTAINSTABLE(InvoiceHeaderComments.*, @searchString)
    

    How can you combine them into 1 table? You will need some sort of ETL process which either runs on a schedule (which may be easier to implement but will result in lag time where your full text index is out of sync with the master tables) or gets run on demand whenever your master tables are modified (either via triggers or by hooking into an event in your data layer).

    这篇关于针对文档的SQL Server全文本搜索(多个相关表格和字段)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆