Efficiency when inserting into mongodb (pymongo)


Question



Updated for clarity: I need advice for performance when inserting/appending to a capped collection. I have two python scripts running:

(1) Tailing the cursor.

import time

while WSHandler.cursor.alive:
    try:
        doc = WSHandler.cursor.next()
        self.render(doc)
    except StopIteration:
        time.sleep(1)    # no new document in the capped collection yet
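Stripped of pymongo specifics, the tailing script above is a poll-and-wait loop. A minimal, testable sketch of that pattern (the names `fetch_next`, `render`, and `is_alive` are illustrative, not part of the original scripts):

```python
import time

def tail(fetch_next, render, is_alive, poll_interval=1.0):
    """Poll-and-wait loop: render each new document, sleep when none is ready.

    `fetch_next` returns the next document or raises StopIteration when the
    cursor is (temporarily) exhausted -- mirroring a tailable cursor's next().
    """
    while is_alive():
        try:
            doc = fetch_next()
        except StopIteration:
            time.sleep(poll_interval)  # no new docs yet; back off before retrying
            continue
        render(doc)
```

With a real tailable cursor, `fetch_next` would be `cursor.next` and `is_alive` would check `cursor.alive`; the sketch keeps both injectable so the loop can be exercised without a database.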

(2) Inserting like so:

def on_data(self, data):                      #Tweepy
    if (len(data) > 5):
        data = json.loads(data)
        coll.insert(data)                     #insert into mongodb
        #print(coll.count())
        #print(data)

and it's running fine for a while (at 50 inserts/second). Then, after 20-60 seconds, it stumbles, hits the CPU ceiling (though it was running at 20% before), and never recovers. My mongostat numbers take a dive (shown below).

Mongostat output (screenshot not preserved):

The CPU is now choked by the processes doing the insertion (at least according to htop).

When I run the Tweepy lines above with print(data) instead of adding it to db (coll.insert(data)), everything's running along fine at 15% cpu use.
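To quantify the difference between the print-only path and the insert path, the handler can be timed directly instead of eyeballing CPU use. A minimal, generic sketch (the `measure_rate` helper is illustrative, not part of the original scripts):

```python
import time

def measure_rate(op, n=1000):
    """Call `op` n times and return the achieved operations per second."""
    start = time.perf_counter()
    for _ in range(n):
        op()
    elapsed = time.perf_counter() - start
    return n / elapsed if elapsed > 0 else float("inf")
```

Passing `lambda: coll.insert(data)` as `op` would time the real database round-trips, and `lambda: print(data)` the print path, making the two directly comparable.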

What I see in mongostats:

  • res keeps climbing (though clogging may set in at 40m, while other runs stay fine at 100m).
  • flushes do not seem to interfere.
  • locked % is stable at 0.1%. Would this lead to clogging eventually?

(I'm running AWS microinstance; pymongo.)

Solution

I would suggest using mongostat while running your tests. There are many things that could be wrong but mongostat will give you a good indication.

http://docs.mongodb.org/manual/reference/mongostat/

The first two things I would look at are the lock percentage and the data throughput. With reasonable throughput on dedicated machines, I typically get into the 1000-2000 updates/inserts per second range before suffering any degradation. This has been the case for several large production deployments I have worked with.
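One further way to raise the sustained insert rate on a small instance is to buffer documents and flush them in groups rather than one network round-trip per tweet (pymongo exposes bulk insertion for exactly this). A minimal, library-agnostic sketch of the buffering logic, with the flush target injected so it can be tested without a database (all names here are illustrative):

```python
class BatchInserter:
    """Buffer documents and flush them to `sink` in groups of `batch_size`."""

    def __init__(self, sink, batch_size=100):
        self.sink = sink            # e.g. a collection's bulk-insert method
        self.batch_size = batch_size
        self.buffer = []

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)  # one round-trip for the whole batch
            self.buffer = []
```

In the Tweepy handler, `on_data` would call `batcher.add(data)` instead of inserting directly, with a final `flush()` on shutdown so no buffered documents are lost.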

