Entity framework large data set, out of memory exception


Problem description

I am working with a very large data set, roughly 2 million records. I have the code below, but I get an out of memory exception after it has processed around three batches, about 600,000 records. I understand that as it loops through each batch, Entity Framework lazy loads, which then tries to build up the full 2 million records in memory. Is there any way to unload a batch once I've processed it?

ModelContext dbContext = new ModelContext();
IEnumerable<IEnumerable<Town>> towns = dbContext.Towns.OrderBy(t => t.TownID).Batch(200000);
foreach (var batch in towns)
{
    SearchClient.Instance.IndexMany(batch, SearchClient.Instance.Settings.DefaultIndex, "Town", new SimpleBulkParameters() { Refresh = false });
}

Note: The Batch method comes from this project: https://code.google.com/p/morelinq/

The search client is this: https://github.com/Mpdreamz/NEST

Solution

The issue is that when you get data from EF, two copies of the data are actually created: one that is returned to the user, and a second that EF holds onto and uses for change detection (so that it can persist changes to the database). EF holds this second set for the lifetime of the context, and it's this set that's running you out of memory.

You have two options to deal with this:

  1. Renew your context for each batch.
  2. Use .AsNoTracking() in your query, e.g.:

    IEnumerable<IEnumerable<Town>> towns = dbContext.Towns.AsNoTracking().OrderBy(t => t.TownID).Batch(200000);
    

This tells EF not to keep a copy for change detection. You can read a little more about what AsNoTracking does and its performance impact on my blog: http://blog.staticvoid.co.nz/2012/4/2/entity_framework_and_asnotracking
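
For the first option (a new context per batch), here is a rough sketch of how it could look. It is not tested against the asker's model and makes a few assumptions: that TownID is a positive, unique int key (so each batch can resume after the last ID seen), that ModelContext is disposable like a normal DbContext, and that the method name IndexAllTowns is just a placeholder. It pages with Where/Take rather than morelinq's Batch so that each batch is its own query against its own context.

```csharp
using System.Linq;

public static class TownIndexer
{
    // Hypothetical helper name; ModelContext, Town and SearchClient are the
    // types from the question and are assumed to be in scope.
    public static void IndexAllTowns()
    {
        const int batchSize = 200000;
        int lastTownId = 0; // assumption: TownID is a positive int key

        while (true)
        {
            // A fresh context per batch: the entities it tracked become
            // eligible for collection once the context is disposed.
            using (var dbContext = new ModelContext())
            {
                var batch = dbContext.Towns
                    .Where(t => t.TownID > lastTownId) // resume after the previous batch
                    .OrderBy(t => t.TownID)
                    .Take(batchSize)
                    .ToList();

                if (batch.Count == 0)
                {
                    break; // no rows left
                }

                SearchClient.Instance.IndexMany(
                    batch,
                    SearchClient.Instance.Settings.DefaultIndex,
                    "Town",
                    new SimpleBulkParameters() { Refresh = false });

                // Remember where this batch ended for the next query.
                lastTownId = batch[batch.Count - 1].TownID;
            }
        }
    }
}
```

Filtering on the key instead of using Skip means each query only has to seek to the next ID. If the key is not an int, or IDs are not unique, the resume condition would need adjusting.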
