Is a MongoDB bulk upsert possible? C# Driver


Question

I'd like to do a bulk upsert in Mongo. Basically, I'm getting a list of objects from a vendor, but I don't know which ones I've received before (and need to be updated) and which ones are new. I could upsert them one by one, but UpdateMany doesn't work with upsert options.

So I've resorted to selecting the documents, updating them in C#, and doing a bulk insert.

    public async Task BulkUpsertData(List<MyObject> newUpsertDatas)
    {
        var usernames = newUpsertDatas.Select(p => p.Username);
        var filter = Builders<MyObject>.Filter.In(p => p.Username, usernames);

        //Find all records that are in the list of newUpsertDatas (these need to be updated)
        var collection = Db.GetCollection<MyObject>("MyCollection");
        var existingDatas = await collection.Find(filter).ToListAsync();

        //loop through all of the new data, 
        foreach (var newUpsertData in newUpsertDatas)
        {
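            //and find the matching existing data (matched on Username, the same key the filter uses)
            var existingData = existingDatas.FirstOrDefault(p => p.Username == newUpsertData.Username);
            //If there is no existing data, stamp the creation date; otherwise preserve the Id and date created (there are other fields I preserve)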
            if (existingData == null)
            {
                newUpsertData.DateCreated = DateTime.Now;
            }
            else
            {
                newUpsertData.Id = existingData.Id;
                newUpsertData.DateCreated = existingData.DateCreated;
            }
        }

        await collection.DeleteManyAsync(filter);
        await collection.InsertManyAsync(newUpsertDatas);
    }

Is there a more efficient way to do this?

I did some speed tests.

In preparation, I inserted 100,000 records of a pretty simple object. Then I upserted 200,000 records into the collection.

Method 1 is as outlined in the question: select the matching documents, update them in code, DeleteMany, InsertMany. This took approximately 5 seconds.

Method 2 was to build a list of UpdateOneModel with IsUpsert = true and then do a single BulkWriteAsync. This was super slow. I could see the count in the Mongo collection increasing, so I know it was working. But after about 5 minutes it had only climbed to 107,000, so I canceled it.

I'm still interested if anyone else has a potential solution.

Recommended answer

Given that you've said you could do a one-by-one upsert, you can achieve what you want with BulkWriteAsync. This allows you to create one or more instances of the abstract WriteModel, which in your case would be instances of UpdateOneModel.

In order to achieve this, you could do something like the following:

var listOfUpdateModels = new List<UpdateOneModel<T>>();

// ...

var updateOneModel = new UpdateOneModel<T>(
    Builders<T>.Filter. /* etc. */,
    Builders<T>.Update. /* etc. */)
{
    IsUpsert = true
};

listOfUpdateModels.Add(updateOneModel);

// ...

await mongoCollection.BulkWriteAsync(listOfUpdateModels);

The key to all of this is the IsUpsert property on UpdateOneModel.
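To make the placeholder filter and update above concrete, here is a minimal sketch of how the question's method might look with this approach. It is not the asker's actual code. The `MyObject` shape, the `Payload` field, the `BulkUpsert` class, and the method names are hypothetical, and it assumes that `Username` is what identifies a document. `SetOnInsert` is used so that `DateCreated` is written only when a document is first created, which replaces the manual "preserve the date created" step. The index helper is included on the assumption that each upsert looks documents up by `Username`; whether a missing index explains the slow run reported in the question is not confirmed by the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical document type: field names follow the question, the shape is assumed.
public class MyObject
{
    public ObjectId Id { get; set; }
    public string Username { get; set; }
    public DateTime DateCreated { get; set; }
    public string Payload { get; set; } // assumed stand-in for the vendor's data fields
}

public static class BulkUpsert
{
    // Builds one upsert request per incoming object, matched on Username (assumed key).
    public static List<WriteModel<MyObject>> BuildUpsertModels(
        IEnumerable<MyObject> items, DateTime now)
    {
        return items.Select(item =>
        {
            var filter = Builders<MyObject>.Filter.Eq(p => p.Username, item.Username);
            var update = Builders<MyObject>.Update
                .Set(p => p.Payload, item.Payload)      // always overwritten
                .SetOnInsert(p => p.DateCreated, now);  // written only on insert
            return (WriteModel<MyObject>)new UpdateOneModel<MyObject>(filter, update)
            {
                IsUpsert = true
            };
        }).ToList();
    }

    // Creates an ascending index on Username so each upsert's filter can use it.
    public static Task EnsureUsernameIndexAsync(IMongoCollection<MyObject> collection)
    {
        var keys = Builders<MyObject>.IndexKeys.Ascending(p => p.Username);
        return collection.Indexes.CreateOneAsync(new CreateIndexModel<MyObject>(keys));
    }

    // Sends all upserts in a single unordered bulk write.
    public static async Task BulkUpsertDataAsync(
        IMongoCollection<MyObject> collection, List<MyObject> items)
    {
        if (items.Count == 0) return; // nothing to write, so skip the call

        var models = BuildUpsertModels(items, DateTime.UtcNow);
        await collection.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false });
    }
}
```

Setting `IsOrdered = false` lets the server continue past an individual failure instead of stopping at the first one. That is a design choice for this sketch, not something the answer prescribes.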
