Is a MongoDB bulk upsert possible? C# Driver
Question
I'd like to do a bulk upsert in Mongo. Basically I'm getting a list of objects from a vendor, but I don't know which ones I've gotten before (and need to be updated) vs which ones are new. One by one I could do an upsert, but UpdateMany doesn't work with upsert options.
So I've resorted to selecting the documents, updating in C#, and doing a bulk insert.
public async Task BulkUpsertData(List<MyObject> newUpsertDatas)
{
    var usernames = newUpsertDatas.Select(p => p.Username);
    var filter = Builders<MyObject>.Filter.In(p => p.Username, usernames);

    // Find all records that are in the list of newUpsertDatas (these need to be updated)
    var collection = Db.GetCollection<MyObject>("MyCollection");
    var existingDatas = await collection.Find(filter).ToListAsync();

    // Loop through all of the new data,
    foreach (var newUpsertData in newUpsertDatas)
    {
        // and find the matching existing data. The match is on Username, the same key
        // the filter uses, because the incoming objects do not carry a database Id yet.
        var existingData = existingDatas.FirstOrDefault(p => p.Username == newUpsertData.Username);

        // If there is existing data, preserve the date created (there are other fields I preserve)
        if (existingData == null)
        {
            newUpsertData.DateCreated = DateTime.Now;
        }
        else
        {
            newUpsertData.Id = existingData.Id;
            newUpsertData.DateCreated = existingData.DateCreated;
        }
    }

    // Replace the old documents with the merged ones
    await collection.DeleteManyAsync(filter);
    await collection.InsertManyAsync(newUpsertDatas);
}
Is there a more efficient way to do this?
I did some speed testing.
In preparation I inserted 100,000 records of a pretty simple object. Then I upserted 200,000 records into the collection.
Method 1 is as outlined in the question. SelectMany, update in code, DeleteMany, InsertMany. This took approximately 5 seconds.
Method 2 was making a list of UpdateOneModel with Upsert = true and then doing one BulkWriteAsync. This was super slow. I could see the count in the mongo collection increasing so I know it was working. But after about 5 minutes it had only climbed to 107,000 so I canceled it.
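One possible cause of the slowness in Method 2, offered as an assumption rather than something measured in the test above: if each UpdateOneModel filters on Username and the collection has no index on that field, every one of the 200,000 upserts has to scan the collection to find its match. The sketch below creates such an index. The helper name EnsureUsernameIndexAsync is hypothetical, and the element name "Username" assumes the default class mapping.

```csharp
using System.Threading.Tasks;
using MongoDB.Driver;

public static class UsernameIndex
{
    // Hypothetical helper: creates an ascending index on the "Username" element
    // so that per-document upsert filters can use an index lookup.
    // Assumes the property is stored under its default element name.
    public static Task<string> EnsureUsernameIndexAsync<T>(IMongoCollection<T> collection)
    {
        var keys = Builders<T>.IndexKeys.Ascending("Username");
        return collection.Indexes.CreateOneAsync(new CreateIndexModel<T>(keys));
    }
}
```

Creating an index that already exists with the same definition is a no-op on the server, so calling this once before the bulk write should be safe to repeat.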
I'm still interested if anyone else has a potential solution.
Recommended answer
Given that you've said you could do a one-by-one upsert, you can achieve what you want with BulkWriteAsync. This allows you to create one or more instances of the abstract WriteModel, which in your case would be instances of UpdateOneModel.
In order to achieve this, you could do something like the following:
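var listOfUpdateModels = new List<UpdateOneModel<T>>();

// ...

// One model per document: a filter that identifies it and the update to apply
var updateOneModel = new UpdateOneModel<T>(
    Builders<T>.Filter. /* etc. */,
    Builders<T>.Update. /* etc. */)
{
    // Insert the document when the filter matches nothing
    IsUpsert = true
};

listOfUpdateModels.Add(updateOneModel);

// ...

// Send all of the upserts to the server in a single bulk operation
await mongoCollection.BulkWriteAsync(listOfUpdateModels);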
The key to all of this is the IsUpsert property on UpdateOneModel.
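To make the elided filter and update concrete, here is a minimal sketch applied to the question's MyObject. Several things in it are assumptions and do not come from the original post: the shape of MyObject (only Username, Id and DateCreated appear in the question; Score is an invented stand-in for the vendor data), the helper names BuildUpsertModels and UpsertAsync, and the choice to match on Username. It uses SetOnInsert so that DateCreated is written only when a document is newly inserted, which covers the "preserve the date created" requirement without reading the existing documents first.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Assumed shape of the question's MyObject; Score is a hypothetical payload field.
public class MyObject
{
    public ObjectId Id { get; set; }
    public string Username { get; set; }
    public int Score { get; set; }
    public DateTime DateCreated { get; set; }
}

public static class BulkUpsert
{
    // Builds one upsert model per incoming object, matched on Username.
    public static List<WriteModel<MyObject>> BuildUpsertModels(
        IEnumerable<MyObject> items, DateTime now)
    {
        var models = new List<WriteModel<MyObject>>();
        foreach (var item in items)
        {
            var filter = Builders<MyObject>.Filter.Eq(p => p.Username, item.Username);
            var update = Builders<MyObject>.Update
                .Set(p => p.Score, item.Score)          // always overwritten
                .SetOnInsert(p => p.DateCreated, now);  // written only on insert
            models.Add(new UpdateOneModel<MyObject>(filter, update) { IsUpsert = true });
        }
        return models;
    }

    // Sends all upserts in one bulk operation.
    public static async Task UpsertAsync(
        IMongoCollection<MyObject> collection, IEnumerable<MyObject> items)
    {
        var models = BuildUpsertModels(items, DateTime.UtcNow);
        if (models.Count == 0)
        {
            return; // BulkWriteAsync rejects an empty request list
        }
        // Unordered: the operations do not depend on each other, so the server
        // need not apply them sequentially or stop at the first failure.
        await collection.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false });
    }
}
```

Because the filter is an equality match on Username, a newly inserted document receives that Username automatically, and the server generates its _id. Whether this is faster than the delete-and-reinsert approach depends on the data and indexes, so it is worth timing against a real collection.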