MongoDB fulltext search not using index


We use MongoDB fulltext search to find products in our database. Unfortunately it is incredibly slow. The collection contains 89.114.052 documents and I have the suspicion that the full text index is not used. Performing a search with explain(), nscannedObjects returns 133212. Shouldn't this be 0 if an index is used?

My index:

{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "textIndex",
    "ns" : "search.products",
    "weights" : {
        "brand" : 1,
        "desc" : 1,
        "ean" : 1,
        "name" : 3,
        "shop_product_number" : 1
    },
    "default_language" : "german",
    "background" : false,
    "language_override" : "language",
    "textIndexVersion" : 2
}
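As a side note on the "weights" in this index: a match in "name" contributes three times as much to the relevance score as a match in any other field. Here is a minimal sketch of that weighting idea in plain JavaScript; the scoring is a deliberate simplification (no stemming, no MongoDB coverage factor), so treat it as an illustration of the concept rather than MongoDB's actual scorer:

```javascript
// Toy illustration of weighted text scoring: each field's whole-word
// matches are multiplied by that field's weight, mirroring the index
// definition above. This is NOT MongoDB's real scoring algorithm.
const weights = { brand: 1, desc: 1, ean: 1, name: 3, shop_product_number: 1 };

function score(doc, term) {
  let total = 0;
  for (const [field, weight] of Object.entries(weights)) {
    const text = (doc[field] || "").toLowerCase();
    // count whole-word occurrences of the search term in this field
    const hits = text.split(/\W+/).filter(w => w === term).length;
    total += hits * weight;
  }
  return total;
}

// Hypothetical document, for illustration only
const doc = { name: "Sony PlayStation 4", desc: "playstation console", brand: "Sony" };
console.log(score(doc, "playstation")); // name hit (3) + desc hit (1) = 4
```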

The complete test search:

> db.products.find({ $text: { $search: "playstation" } }).limit(100).explain()
{
    "cursor" : "TextCursor",
    "n" : 100,
    "nscannedObjects" : 133212,
    "nscanned" : 133212,
    "nscannedObjectsAllPlans" : 133212,
    "nscannedAllPlans" : 133212,
    "scanAndOrder" : false,
    "nYields" : 1041,
    "nChunkSkips" : 0,
    "millis" : 105,
    "server" : "search2:27017",
    "filterSet" : false
}
Solution

Please have a look at the question you asked:

".... The collection contains 89.114.052 documents and I have the suspicion, that the full text index is not used ...."

Only 133,212 documents were "nscanned". Of course the index is used. If it were not, then all 89,114,052 documents (written here with English thousands separators rather than the German "89.114.052" from your question) would have been reported in "nscanned", which would mean no index was used.

Your query is slow. It seems your hardware is not up to the task of keeping 133,212 documents in memory, or does not have disks fast enough to "page" effectively. But this is not a MongoDB problem; it is a problem with your deployment.

You have over 100,000 documents that match your query, and even if you only want 100 of them, you need to accept that this is how it works: MongoDB does not "give up" and yield control once it has matched 100 documents. The query pattern here finds all of the matches and then applies the "limit" to the cursor in order to return just the requested number.
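The effect described above can be sketched with a toy in-memory model (plain JavaScript, with hypothetical `n`/`nscanned` counters; MongoDB's internals differ, but the counts line up the same way): every matching document is visited before the limit trims the result set.

```javascript
// Toy model: a text query visits every matching document (nscanned),
// and only afterwards is the limit applied to the result set (n).
function textSearch(docs, term, limit) {
  let nscanned = 0;
  const matches = [];
  for (const doc of docs) {
    if (doc.text.includes(term)) {
      nscanned++;            // every match is scanned...
      matches.push(doc);
    }
  }
  // ...and the limit only trims what is returned, not what was scanned
  return { n: Math.min(limit, matches.length), nscanned, results: matches.slice(0, limit) };
}

const docs = Array.from({ length: 1000 }, (_, i) => ({ id: i, text: "playstation game" }));
const res = textSearch(docs, "playstation", 100);
console.log(res.n, res.nscanned); // 100 1000: all 1000 matches scanned, only 100 returned
```

This mirrors the explain() output in the question: "n" is 100 (the limit) while "nscanned" is 133212 (every match).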

Maybe some time in the future the "text" functionality will let you do the sort of thing you can do in the aggregation version of $geoNear, and specify "minimum" and "maximum" values for a "score" in order to trim results. But right now it does not.

So either upgrade your hardware, or use an external text search solution, if your problem is slow results when matching over 100,000 documents out of over 89,000,000.
