Updating database from a list of dictionaries


Question

In Python, I have a list of dictionaries. The list is called members and each member has a unique id. For example, the list could look like this:

members = [{'id':1, 'val1':10, 'val2':11},
           {'id':2, 'val1':2, 'val2':34},
           {'id':3, 'val1':350, 'val2':9}]

I want to update my collection with the list of members, updating existing entries and inserting new ones as necessary.

Do I need to loop through the members, or is there a faster way?

Here's my attempt, which seems to do what I want but takes a while:

for m in members:
    collection.update_one({'id': m['id']}, {'$set': m}, upsert=True)

Please note that this requires updating each db entry with a different value, namely the one corresponding to its id.

Recommended answer

With modern pymongo you can use .bulk_write() with the ReplaceOne bulk write operation in your particular case, or another appropriate operation (for example UpdateOne with $set, if fields that are not present in m should be kept on existing documents):

from pymongo import MongoClient
from pymongo import ReplaceOne

client = MongoClient()

db = client.test

members = [
  { 'id': 1, 'val1': 10,  'val2': 11 },
  { 'id': 2, 'val1': 2,   'val2': 34 },
  { 'id': 3, 'val1': 350, 'val2': 9  }
]

db.testcol.bulk_write([
  ReplaceOne(
    { "id": m['id'] },
    m,
    upsert=True
  )
  for m in members
])

Ideally you would not be processing from a source "list" but instead reading from some external "stream" to keep memory requirements down. In a similar way, you would build up the operations list argument for, say, 1000 operations at a time and then call .bulk_write() once per 1000 operations.
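The batching idea described above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper names batched and upsert_in_batches, the make_op parameter and the default batch size are assumptions made for the example, and the only thing it relies on from pymongo is that the collection object has a bulk_write() method accepting a list of operations.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from any iterable.

    Works on generators and other streams, so the whole source never has
    to be held in memory at once.
    """
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def upsert_in_batches(collection, docs, make_op, batch_size=1000):
    """Send one bulk_write() call per batch of documents.

    `collection` is assumed to expose bulk_write(list_of_operations),
    as a pymongo collection does. `make_op` is a hypothetical callback
    that turns one document into one bulk operation.
    Returns the number of bulk_write() calls that were made.
    """
    calls = 0
    for chunk in batched(docs, batch_size):
        collection.bulk_write([make_op(doc) for doc in chunk])
        calls += 1
    return calls
```

With pymongo, make_op could be something like lambda m: ReplaceOne({'id': m['id']}, m, upsert=True), mirroring the operation used in the answer, and docs could be any generator that reads members from a file or another external source.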

But the whole point is that with .bulk_write() you send your "batch" all at once and receive only one response, rather than sending separate requests with separate responses, which creates overhead and takes time.

Also, this API method uses the "Bulk API" underneath on servers that support it, and degrades to making the individual calls for you when the server version does not support the "Bulk" methods.
