SQLite 表需要时间来获取 LIKE 查询中的记录 [英] SQLite table taking time to fetch the records in LIKE query

查看:49
本文介绍了SQLite 表需要时间来获取 LIKE 查询中的记录的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

场景:数据库是sqlite(需要对数据库中的记录进行加密.因此使用了iOS的SQL cipher API)

Scenario: database is sqlite (need to encrypt records in database. Hence used SQL cipher API for iOS)

数据库中有一个名为partnumber的表,其架构如下:

There is a table in the database named partnumber with schema as follows:

CREATE TABLE partnumber (
  objid varchar PRIMARY KEY,
  description varchar,
  make varchar,
  model varcha,
  partnumber varchar,
  SSOKey varchar,
  PMOKey varchar
)

该表包含大约 80K 条记录.

This table contains approximately 80K records.

UI 视图中有 3 个文本字段,用户可以在其中输入搜索词,只要用户在那里输入字母就会进行搜索.

There are 3 text fields in the UI view, in which user can enter search terms and searching is made as soon as user enters the letters there.

3 个文本字段是:txtFieldDescription、txtFieldMake 和 txtFieldModel.

3 text fields are: txtFieldDescription, txtFieldMake and txtFieldModel.

假设第一个用户在 txtFieldDescription 中输入搜索词作为监视器".因此,将使用每个字母执行的查询是:

Suppose, first user enters the search term as ‘monitor’ in txtFieldDescription. So, the queries that will be executed with each letter are:

1.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%m%’

2.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%mo%’

3.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%mon%’

4.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%moni%’

5.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%monit%’

6.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%monito%’

7.

    SELECT DISTINCT description COLLATE NOCASE
    FROM partnumber where description like ‘%monitor%’

到目前为止一切顺利.现在假设用户想要搜索模型(txtFieldDescription 仍然包含‘monitor’).所以用户点击txtFieldModel.一旦用户点击模型,就会触发一个查询:

So far so good. Suppose now if user wants to search for model (txtFieldDescription still contains ‘monitor’). So user clicks on txtFieldModel. As soon as user clicks on model, a query is fired as:

SELECT DISTINCT model COLLATE NOCASE
FROM partnumber where description like ‘%monitor%’

此查询将返回描述包含监视器(在任何位置)的记录的所有模型.

This query will return all the models for the records where description contains monitor (at any position).

现在,如果用户想要搜索所有包含单词sony"的模型(描述字段仍然包含监视器),那么将使用每个字母执行的查询是:

Now, if user wants to search for all the models containing word ‘sony’ (description field still contains monitor), then the queries that will get executed with each letter are:

1.

    SELECT DISTINCT model COLLATE NOCASE
    FROM partnumber WHERE model like ‘%s%’ AND description like ‘%monitor%’

2.

    SELECT DISTINCT model COLLATE NOCASE
    FROM partnumber WHERE model like ‘%so%’ AND description like ‘%monitor%’

3.

    SELECT DISTINCT model COLLATE NOCASE
    FROM partnumber WHERE model like ‘%son%’ AND description like ‘%monitor%’

4.

    SELECT DISTINCT model COLLATE NOCASE
    FROM partnumber WHERE model like ‘%sony%’ AND description like ‘%monitor%’

现在,如果用户点击 txtFieldMake 并输入搜索词1980",那么被触发的查询是:

Now, if user clicks on txtFieldMake and enters search term as ‘1980’, then the queries that get fired are:

1.

    SELECT DISTINCT make COLLATE NOCASE
    FROM partnumber WHERE make like ‘%1%’
      AND model like ‘%sony%’ AND description like ‘%monitor%’

2.

    SELECT DISTINCT make COLLATE NOCASE
    FROM partnumber WHERE make like ‘%19%’
      AND model like ‘%sony%’ AND description like ‘%monitor%’

3.

    SELECT DISTINCT make COLLATE NOCASE
    FROM partnumber WHERE make like ‘%198%’
      AND model like ‘%sony%’ AND description like ‘%monitor%’

4.

    SELECT DISTINCT make COLLATE NOCASE
    FROM partnumber WHERE make like ‘%1980%’
      AND model like ‘%sony%’ AND description like ‘%monitor%’

这里,从 txtFieldDescription 到 txtFieldModel 或 txtFieldModel 到 txtFieldMake 的转换时间延迟太大,在 txtFieldModel 和 txtFieldMake 中,输入的字母在 5 或 6 秒后显示(在查询处理后),因此光标挂在那里.

Here, the time delay in making a transition from txtFieldDescription to txtFieldModel or txtFieldModel to txtFieldMake is too large and in txtFieldModel and txtFieldMake, the letter entered are shown after 5 or 6 secs (after the query has been processed) and hence the cursor hangs there.

在分析中,我开始知道在 like 关键字(如%monitor%")中的搜索词之前的通配符会减慢执行速度.在这种情况下,可能有多达 3 个这样的关键字,它们之间带有 AND,因此,执行时间肯定会增加.此外,在 like 开头使用通配符会否定索引.

On analyzing, I came to know that wildcard before the searching term in like keyword (as in ‘%monitor%’) slows the execution. And in this case there may be as many as 3 such like keywords with AND in between them and hence, execution time is sure to get increase. Also, use of wildcard at the beginning of like negates the indices.

一些附加信息:

  1. 总记录数~80K

  1. The total number of records ~80K

SELECT 查询每次在表部件号 (~80K) 上运行

SELECT query is run each time on the table partnumber (~80K)

我执行的一些查询的结果:

Result of some queries performed by me:

Sqlite> SELECT count(DISTINCT description COLLATE NOCASE) from partnumber;
    Result is: 2599

Sqlite> SELECT count(DISTINCT make COLLATE NOCASE) from partnumber;
    Result is: 7129

Sqlite> SELECT count(DISTINCT model COLLATE NOCASE) from partnumber;
    Result is: 64644

Sqlite> SELECT count(objid) from partnumber;
    Result is: 82135

  • 索引创建如下:

  • Indices are created as follows:

    CREATE INDEX index_description
    ON partnumber (description collate nocase)
    
    CREATE INDEX index_make
    ON partnumber (make collate nocase)
    
    CREATE INDEX index_model
    ON partnumber (model collate nocase)
    

  • 一些提高性能的替代方法:

    SOME ALTERNATIVES TO INCREASE PERFORMANCE:

    1. 由于distinct description的个数只有2599个,make的个数只有7129个,所以可以把表拆分成不同的表,一个包含DISTINCT description COLLATE NOCASE输出(共2599行),一个包含DISTINCT制作 COLLATE NOCASE(共 7129 行).就模型而言,为它制作不同的表无济于事,因为行数 ~64644 几乎等于总记录数 ~82135.但是这种方法的问题是我不知道如何在这些表中进行搜索,每个表中必须有哪些列以及必须创建多少表.如果用户输入一些描述,然后输入模型,然后再次输入新的描述会怎样.

    1. Since the count of distinct description is only 2599 and that of make is only 7129, so the table can be split into different tables with one containing DISTINCT description COLLATE NOCASE output (total of 2599 rows) and one containing DISTINCT make COLLATE NOCASE (total of 7129 rows). As far as model is concerned, making a different table for it will not help as the number of rows ~64644 is nearly equal to total records ~82135. But the problem with this approach is that I don’t know how I would be making a search in these tables, what columns must be there in each of them and how many tables must be created. What if user enters some description then enters model and then again enters a new description.

    由于此选择查询的结果显示在 UITableView 中,并且用户一次最多看到 5 行.因此,我们可以将返回的行数限制为 500,当用户滚动时,可以提取下一个 500 行,依此类推,直到最后搜索到的记录.

    Since the result of this select query is being displayed in a UITableView and the user sees at max 5 rows at a time. So, we can limit the number of rows that are being returned to 500 and when user scrolls, then next 500 can be fetched and so on till the last searched record.

    但这里的问题是虽然我只需要 500 条记录,但我必须搜索整个表(扫描 ~80K 条记录).所以,我需要一个查询,它首先只搜索表的前 10% 并从中返回前 500 行,然后搜索下 500 到前 10% 的记录,然后搜索下一个 10%,然后下一个 10% 直到 80000 条记录是正在搜索(需要搜索 10-10% 的记录块).

    But the problem here is although I need only 500 records, but I will have to search entire table (SCAN ~80K records). So, I need a query that will first search only top 10% of the table and return top 500 rows from this, then next 500 till top 10% records are all searched, then next 10%, then next 10% till 80000 records are being searched (need to search in chunks of 10-10% records).

    1. 如果80K记录的表可以拆分成4个20K记录的表,然后同时(在不同的后台线程中)对所有4个表进行搜索以获得结果集.但是这里我不知道如何在 4 个不同的线程中运行查询(某种批处理执行),何时合并结果以及如何知道所有线程都已执行完毕.

    1. If the table of 80K records can be split into 4 tables of 20K records each and then searching is performed on all 4 tables simultaneously (in different background threads) to get the result set. But here I don’t know how to run the queries in 4 different threads (sort of batch execution), when to combine the results and how to know that all the threads have finished execution.

    如果我可以用另一个返回相同结果但执行速度更快的函数替换 like %monitor%' 并且该函数的使用不影响索引的使用,(即不通过-通过使用索引),那么执行可能会更快.如果有人可以在sqlite中向我推荐这样的功能,那么我可以继续使用这种方法.

    If I can replace like %monitor%‘ with another function that returns the same result but whose execution is faster and the use of that function does not affects the use of index, (that is, does not by-passes the use of index), then the execution may get faster. If anyone can suggest me such a function in sqlite, then I can go on with this approach.

    如果您能帮助我实现这些替代方案中的任何一种,或者如果您能建议我任何其他解决方案,那么我将能够提高查询的执行速度.并且请不要告诉我在 sqlite 中启用 FTS(全文搜索),因为我已经尝试过这样做,但我不知道确切的步骤.非常感谢你这么耐心的看完这个问题......

    If you can help me to implement any one of these alternatives, or if you can suggest me any other solution, then I would be able to increase the execution speed of my query. And pls dont tell me to enable FTS (Full Text Searching) in sqlite because I have already tried doing this, but I dont know the exact steps. Thanks a lot for reading this question so patiently......

    大家好,我取得了一些成功.我修改了我的选择查询,如下所示:

    Hey All, I got some success. I modified my select query to look like this:

    select distinct description collate nocase as description from partnumber where rowid BETWEEN 1 AND (select max(rowid) from partnumber) AND description like '%a%' order by description;
    

    还有宾果游戏,搜索时间前所未有.但现在的问题是,当我像这样执行 EXPLAIN QUERY PLAN 命令时,它显示我使用 B-Tree 来区分我不想使用的.

    And Bingo, the search time was like never before. But the problem now is when I execute the command EXPLAIN QUERY PLAN like this, it shows me using B-Tree for distinct which I dont want to use.

    explain query plan select distinct description collate nocase as description from partnumber where rowid BETWEEN 1 AND (select max(rowid) from partnumber) AND description like '%a%' order by description;
    

    输出:

    0|0|0|SEARCH TABLE partnumber USING INTEGER PRIMARY KEY (rowid>? AND rowid<?) (~15625 rows)
    0|0|0|EXECUTE SCALAR SUBQUERY 1
    1|0|0|SEARCH TABLE partnumber USING INTEGER PRIMARY KEY (~1 rows)
    0|0|0|USE TEMP B-TREE FOR DISTINCT
    

    对不起各位.上述方法(使用 rowid 进行搜索)在设备上花费的时间比原始方法多.我曾尝试通过关键字删除不同的和顺序,但它没有用.在 iPhone 上仍然需要大约 8-10 秒.请帮帮我.

    Sorry guys. The above approach (using rowid for searching) is taking more time on device than the original one. I have tried removing the distinct and order by keywords, but it was of no use. Still taking ~8-10 secs on iPhone. Pls help me out.

    推荐答案

    Anshul,

    我知道您说过请不要告诉我在 sqlite 中启用 FTS(全文搜索),因为我已经尝试过这样做,但我不知道确切的步骤",但是 FTS 是您将要获得的唯一方法这表现得很好.没有什么魔法可以使全表扫描表现良好.我建议阅读 FTS,花时间学习它,然后使用它:http://sqlite.org/fts3.html.

    I know you said "pls dont tell me to enable FTS (Full Text Searching) in sqlite because I have already tried doing this, but I dont know the exact steps", however FTS is the only way you are going to get this to perform well. There is no magic that will make a full table scan perform well. I's suggest reading up on FTS, taking the time to learn it, and then use it: http://sqlite.org/fts3.html.

    这篇关于SQLite 表需要时间来获取 LIKE 查询中的记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆