MongoDB fulltext search not using index


We use MongoDB full-text search to find products in our database. Unfortunately, it is incredibly slow. The collection contains 89.114.052 documents, and I suspect that the full-text index is not being used. When I run a search with explain(), nscannedObjects returns 133212. Shouldn't this be 0 if an index is used?

My index:

    {
        "v" : 1,
        "key" : {
            "_fts" : "text",
            "_ftsx" : 1
        },
        "name" : "textIndex",
        "ns" : "search.products",
        "weights" : {
            "brand" : 1,
            "desc" : 1,
            "ean" : 1,
            "name" : 3,
            "shop_product_number" : 1
        },
        "default_language" : "german",
        "background" : false,
        "language_override" : "language",
        "textIndexVersion" : 2
    }

The complete test search:

    > db.products.find({ $text: { $search: "playstation" } }).limit(100).explain()
    {
        "cursor" : "TextCursor",
        "n" : 100,
        "nscannedObjects" : 133212,
        "nscanned" : 133212,
        "nscannedObjectsAllPlans" : 133212,
        "nscannedAllPlans" : 133212,
        "scanAndOrder" : false,
        "nYields" : 1041,
        "nChunkSkips" : 0,
        "millis" : 105,
        "server" : "search2:27017",
        "filterSet" : false
    }

Please have a look at the question you asked:

".... The collection contains 89.114.052 documents and I have the suspicion, that the full text index is not used ...."

Your explain() output reports only 133212 documents in "nscanned". Of course the index is used. If it were not, all 89,114,052 documents (written that way because this is an English locale, not a German one) would have been reported in "nscanned", and that is what a query without an index looks like.
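The comparison above can be written down as a small check. This is only a rough heuristic sketch in plain JavaScript, and `indexLikelyUsed` is a hypothetical helper name, not part of MongoDB. It assumes the legacy (pre-3.0) explain() format shown in the question, where a full collection scan is reported as "BasicCursor".

```javascript
// Hypothetical helper: interprets a legacy (pre-3.0) explain() document.
// Rough heuristic only, using the reasoning from the answer above.
function indexLikelyUsed(explainDoc, totalDocs) {
  // Legacy explain() reports "BasicCursor" for a full collection scan.
  if (explainDoc.cursor === "BasicCursor") {
    return false;
  }
  // With an index, far fewer entries are examined than the collection holds.
  return explainDoc.nscanned < totalDocs;
}

// Values taken from the question.
const explainDoc = { cursor: "TextCursor", n: 100, nscanned: 133212, nscannedObjects: 133212 };
const totalDocs = 89114052;

console.log(indexLikelyUsed(explainDoc, totalDocs)); // true
console.log((explainDoc.nscanned / totalDocs * 100).toFixed(2) + "% of the collection examined");
```

In the mongo shell, the total would come from `db.products.count()` and the explain document from the query in the question.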

Your query is slow. It seems your hardware is not up to the task of keeping 133212 documents in memory, and does not have disks fast enough to page effectively either. That is not a MongoDB problem but yours.

You have over 100,000 documents that match your query, and even if you only want 100 of them you need to accept that this is how it works: MongoDB does not "give up" and yield control once it has matched 100 documents. The query pattern here finds all of the matches and only then applies the "limit" to the cursor, so that just the requested number of documents is returned.
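To make that pattern concrete, here is a toy model in plain JavaScript. It is not MongoDB's actual implementation; `textSearchThenLimit` and the sample data are made up for illustration, and the substring test merely stands in for the text index lookup. It shows why the scanned count follows the number of matches rather than the limit.

```javascript
// Toy model only: NOT MongoDB's real text search implementation.
// Collect every match first, then apply the limit at the very end.
function textSearchThenLimit(docs, term, limit) {
  let scanned = 0;
  const matches = [];
  for (const doc of docs) {
    // Stand-in for the index lookup: every matching entry is visited.
    if (doc.name.includes(term)) {
      scanned += 1;
      matches.push(doc);
    }
  }
  // The limit only trims what is returned, not what was examined.
  return { scanned, results: matches.slice(0, limit) };
}

// Made-up sample data: every fourth document mentions "playstation".
const docs = Array.from({ length: 1000 }, (_, i) => ({
  name: i % 4 === 0 ? "playstation " + i : "other " + i,
}));

const out = textSearchThenLimit(docs, "playstation", 10);
console.log(out.results.length); // 10
console.log(out.scanned);        // 250
```

Changing the limit in this sketch changes only `results.length`. `scanned` stays at 250, just as "nscanned" in the question stays at 133212 with limit(100).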

Maybe some time in the future the "text" functionality will let you do things like you can in the aggregate version of $geoNear, specifying "minimum" and "maximum" values for a "score" in order to improve results. But right now it does not.

So either upgrade your hardware or use an external text search solution, if your problem is slow results when matching over 100,000 documents out of more than 89,000,000.