Extremely slow EF startup - 15 minutes

Question

Some time ago I created a system in which a user can define categories with custom fields for some objects. Each object then has FieldValues based on its category. The classes are below:

public class DbCategory
{
    public int Id { get; set; }

    [Required]
    public string Name { get; set; }

    [Required]
    public TextDbField MainField { get; set; }
    public List<DbField> Fields { get; set; }
}

public class DbObject
{
    public int Id { get; set; }
    public byte[] Bytes { get; set; }

    [Required]
    public DbCategory Category { get; set; }

    public TextDbFieldValue MainFieldValue { get; set; }
    public List<DbFieldValue> FieldsValues { get; set; }
}

public abstract class DbField
{
    public int Id { get; set; }

    [Required]
    public string Name { get; set; }

    [Required]
    public bool Required { get; set; }
}

public class IntegerDbField : DbField
{
    public int? Minimum { get; set; }
    public int? Maximum { get; set; }
}

public class FloatDbField : DbField
{
    public double? Minimum { get; set; }
    public double? Maximum { get; set; }
}
//... few other types

public abstract class DbFieldValue
{
    [Key]
    public int Id { get; set; }

    [Required]
    public DbField Field { get; set; }

    [JsonIgnore]
    public abstract string Value { get; set; }
}

public class IntDbFieldValue : DbFieldValue
{
    public int? IntValue { get; set; }

    public override string Value
    {
        get { return IntValue?.ToString(); }
        set
        {
            if (value == null) IntValue = null;
            else IntValue = int.Parse(value);
        }
    }
} // and other FieldValue types

On my dev machine (i5, 16 GB RAM and an SSD), with a database (in SQL Server Express) holding 4 categories of 5-6 fields each and 10k records, the first query takes about 15 s. This first query is:

var result = db.Objects
     .Include(s => s.Category)
     .Include(s => s.Category.MainField)
     .Include(s => s.MainFieldValue.Field)
     .Include(s => s.FieldsValues.Select(f => f.Field))
     .Where(predicate ?? AlwaysTrue)
     .ToArray();

I do that to load everything into memory. Then I work on the cached list and just write changes to the database. I do that because the user can search with a filter on each FieldValue. Querying the database each time proved to be much too slow; this part, however, works pretty well.

The problem occurs later. Some clients defined 6 categories with 20+ fields each and store 70k+ records, and startup sometimes takes more than 15 minutes. After that, there is no difference in speed between 5k and 50k records.

Every technique I've found for improving EF Code First startup time deals mostly with view creation caching, NGen-ing EF and so on, but in this case startup time grows after adding more records, not more entity types.

I realise that this is caused by the complexity of the schema, but is there some way to speed it up? Fortunately, this is a Windows Service, so once it is started it runs for weeks, but still.

Should I drop EF for the first load and do it in pure SQL? Should I do this in batches? Should I switch from EF to NHibernate? Or something else? On virtualized servers, while this line executes, the program maxes out the CPU (my application, not SQL Server).

I've tried loading the objects only and loading their properties later. This was a bit faster (but not noticeably) on small databases, but even slower on bigger ones. Any help is appreciated, even if the answer is "suck it up and wait".

Recommended answer

I managed to reduce the total start time caused by EF by a factor of 3 with these tricks:


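  1. Update the framework to 6.2 and enable model caching:

    // Requires: using System.Data.Entity; using System.Data.Entity.Infrastructure; using System.IO;
    // EF discovers a DbConfiguration subclass automatically when it is in the same assembly as the context.
    public class CachingContextConfiguration : DbConfiguration
    {
        public CachingContextConfiguration()
        {
            // Cache the Code First model as a file in the current directory,
            // so later startups can load it instead of rebuilding it.
            SetModelStore(new DefaultDbModelStore(Directory.GetCurrentDirectory()));
        }
    }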

  2. Call ctx.Database.Initialize() explicitly from a new thread, as early as possible. This still takes 3-4 seconds, but since it happens alongside other things, it helps a lot.
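
A minimal sketch of that warm-up, assuming EF6 and a `MyContext` class like the one used elsewhere in this answer. The empty context, the `Program` class and the placement in `Main` are placeholders for illustration, not the original service's code:

```csharp
using System;
using System.Data.Entity;
using System.Threading.Tasks;

// Hypothetical stand-in for the application's real context.
public class MyContext : DbContext
{
}

public static class Program
{
    public static void Main()
    {
        // Start model building and database initialization in the background.
        Task warmUp = Task.Run(() =>
        {
            using (var ctx = new MyContext())
            {
                // force: false runs the initializer only if it has not run yet.
                ctx.Database.Initialize(force: false);
            }
        });

        // ... other startup work runs here while EF warms up ...

        // Wait before the first real query so any initialization error surfaces here.
        warmUp.Wait();
        Console.WriteLine("EF warm-up finished");
    }
}
```

Because EF6 caches the built model per AppDomain, context instances created after the warm-up should reuse it rather than pay the cost again.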

  3. Load entities into the EF cache in a reasonable order.

Previously, I just wrote Include after Include, which translates into multiple joins. I found a "rule of thumb" in some blog posts: EF performs rather well with up to two chained Includes, but each additional one slows everything down massively. I also found a blog post that showed EF caching: once a given entity has been loaded with Include or Load, it is automatically placed in the proper navigation property (the blog author is wrong about the union of objects). So I did this:

    using (var db = new MyContext())
    {
        db.Fields.Load();
        db.Categories.Include(c => c.MainField).Include(x => x.Fields).Load();
        db.FieldValues.Load();
        return db.Objects.Include(x => x.MainFieldValue.Field).ToArray();
    }

This fetches the data 6 times faster than the Includes from the question. I think that once entities have been loaded, the EF engine does not call the database for related objects; it just gets them from the cache.


  4. I also added this in my context constructor:

    Configuration.LazyLoadingEnabled = false;
    Configuration.ProxyCreationEnabled = false;


The effects of that are barely noticeable, but may play a bigger role on a huge data set.

I've also watched this presentation of EF Core by Rowan Miller and I will be switching to it in the next release; in some cases it's 5-6 times faster than EF6.

Hope this helps someone.
