MongoDB fetching documents slow (index used)
Question
The FETCH stage is the limiting factor in my queries. I've been researching, and it seems that MongoDB is reading much more than it needs to, and not utilizing the bandwidth fully.
My MongoDB (mongod) instance seems to be reading too much on a single query. Testing on an AWS EC2 m4.xlarge with one EBS io1 SSD (100 GB, 5,000 PIOPS) attached and 16 GB of RAM.
- The machine contains only the MongoDB instance, for testing purposes.
- The database is about 60 GB in total on disk (several collections).
- The main collection is the one used in the scenario and query below.
Database stats:
db.stats()
{
    "db" : "database",
    "collections" : 4,
    "objects" : 406496932,
    "avgObjSize" : 326.3196544642064,
    "dataSize" : 132647938391,
    "storageSize" : 55475830784,
    "numExtents" : 0,
    "indexes" : 5,
    "indexSize" : 8940408832,
    "ok" : 1
}
Collection summary:
db.collection.stats() ->
{
    "ns" : "database.[collection###]",
    "count" : 367614513,
    "size" : 121155225858,
    "avgObjSize" : 329,
    "storageSize" : 52052197376,
    "capped" : false,
    "wiredTiger" : {"Left empty"},
    "nindexes" : 2,
    "totalIndexSize" : 8131604480,
    "indexSizes" : {
        "_id_" : 4373012480,
        "id_1_ts_-1" : 3758592000
    },
    "ok" : 1
}
Query:
db.[#######].find(
    {
        id : "######",
        ts : {
            "$gte" : ISODate("2016-10-01T00:00:00.000Z"),
            "$lt" : ISODate("2016-10-07T02:00:00.000Z")
        }
    },
    { "_id" : 0, "u1" : 1, "u2" : 1, "u3" : 1, "eq1" : 1, "eq2" : 1, "eq3" : 1, "ts" : 1 }
);
Explain result:
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "database.[collection]",
"d" : false,
"parsedQuery" : {
"$and" : [
{
"id" : {
"$eq" : "#####ID#####"
}
},
{
"ts" : {
"$lt" : ISODate("2016-09-30T22:00:00Z")
}
},
{
"ts" : {
"$gte" : ISODate("2016-09-22T22:00:00Z")
}
}
]
},
"winningPlan" : {
"stage" : "PROJECTION",
"transformBy" : {
"_id" : 0,
"u1" : 1,
"u2" : 1,
"u3" : 1,
"eq1" : 1,
"eq2" : 1,
"eq3" : 1,
"ts" : 1
},
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"id" : 1,
"ts" : -1
},
"indexName" : "id_1_ts_-1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"id" : [
"[\"#####ID#####\", \"#####ID#####\"]"
],
"ts" : [
"(new Date(1475272800000), new Date(1474581600000)]"
]
}
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 676745,
"executionTimeMillis" : 170508,
"totalKeysExamined" : 676745,
"totalDocsExamined" : 676745,
"executionStages" : {
"stage" : "PROJECTION",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 167820,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"transformBy" : {
"_id" : 0,
"u1" : 1,
"u2" : 1,
"u3" : 1,
"eq1" : 1,
"eq2" : 1,
"eq3" : 1,
"ts" : 1
},
"inputStage" : {
"stage" : "FETCH",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 166470,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 676745,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 980,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"id" : 1,
"ts" : -1
},
"indexName" : "id_1_ts_- 1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"id" : [
"[\"#####ID#####\", \"#####ID#####\"]"
],
"ts" : [
"(new Date(1475272800000), new Date(1474581600000)]"
]
},
"keysExamined" : 676745,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"allPlansExecution" : [ ]
},
"serverInfo" : {
"host" : "ip #########",
"port" : 27017,
"version" : "3.2.10",
"gitVersion" : "79d9b3ab5ce20f51c272b4411202710a082d0317"
},
"ok" : 1
}
As we can see above, MongoDB uses the index: the IXSCAN takes 980 ms, while the FETCH takes ~160,000 ms.
If I'm not mistaken, the whole read should be 676,745 (nReturned) × 329 (avgObjSize) bytes ≈ 212 MB of data.
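The same back-of-the-envelope calculation, spelled out in the mongo shell:

// Expected read volume if only the matched documents were touched:
// nReturned (from executionStats) times avgObjSize (from collection stats).
var expectedBytes = 676745 * 329;   // 222,649,105 bytes
expectedBytes / (1024 * 1024);      // ~212 MB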
I've noticed the following in iostat (http://linuxcommand.org/man_pages/iostat1.html) output (/data/db is on xvdf):
avg-cpu: %user %nice %system %iowait %steal %idle
0.27 0.00 0.00 21.35 0.13 78.25
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 0.00 1691.00 0.00 19.83 0.00 24.02 0.95 0.56 0.56 0.00 0.56 94.40
The rMB/s column shows ~20 MB/s, sustained during the whole operation (the FETCH stage). That means MongoDB is reading 160 s × 20 MB/s = 3,200 MB, which is a lot more than the ~212 MB calculated above.
Memory:
free -m
total used free shared buffers cached
Mem: 16048 12629 3418 0 32 4071
-/+ buffers/cache: 8525 7522
Swap: 0 0
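The "used" figure can be cross-checked against WiredTiger's cache, which on MongoDB 3.2 defaults to roughly 60% of RAM minus 1 GB (about 8.6 GB on this 16 GB box, consistent with the ~8.5 GB of application memory shown above). A sketch of how to read it from the shell:

// Compare bytes currently held in the WiredTiger cache with the
// configured maximum (both are standard serverStatus counters).
var cache = db.serverStatus().wiredTiger.cache;
cache["bytes currently in the cache"];
cache["maximum bytes configured"];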
Also, MongoDB is not using the 5,000 provisioned IOPS of the EBS volume, nor the promised bandwidth: it is only doing ~1,700 reads/second, resulting in ~20 MB/s.
I've changed the read-ahead to 16 KB. I've tried placing the journal and log on another HDD.
I cannot figure this out! Help me, please!
Answer
I encountered the same problem when I was fetching around 35,000 documents. To solve it, I used the aggregate function (the sakulstra:aggregate package), and in my case it boosted the request incredibly. The result format is obviously not the same, but it's still easy to use it to compute everything I need.
Before (7,000 ms):
const historicalAssetAttributes = HistoricalAssetAttributes.find({
    date: {'$gte': startDate, '$lte': endDate},
    assetId: {$in: assetIds}
}, {
    fields: {
        "date": 1,
        "assetId": 1,
        "close": 1
    }
}).fetch();
After (300 ms):
const historicalAssetAttributes = HistoricalAssetAttributes.aggregate([
    {
        '$match': {
            date: {'$gte': startDate, '$lte': endDate},
            assetId: {$in: assetIds}
        }
    }, {
        '$group': {
            _id: {assetId: "$assetId"},
            close: {
                '$push': {
                    date: "$date",
                    value: "$close"
                }
            }
        }
    }
]);
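A plausible reason for the speedup is that $group returns one document per assetId instead of one per matching row, so far fewer objects have to be fetched, serialized, and handed to the client.

For a FETCH-bound query like the one in the question, another option worth sketching is a covered query: if an index contains every field the projection returns (and _id is excluded, as it already is), MongoDB can answer from the index alone without loading any documents. Assuming the field names from the question:

// Hypothetical covering index: id and ts for the filter plus all
// projected fields, so the IXSCAN output already holds the answer.
db.[#######].createIndex(
    { id : 1, ts : -1, u1 : 1, u2 : 1, u3 : 1, eq1 : 1, eq2 : 1, eq3 : 1 }
);
// With this index the plan should show no FETCH stage and
// totalDocsExamined : 0, at the cost of a much larger index
// (the existing id_1_ts_-1 index is already ~3.5 GB).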