实体框架异步操作需要十倍的时间才能完成 [英] Entity Framework async operation takes ten times as long to complete

查看:228
本文介绍了实体框架异步操作需要十倍的时间才能完成的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个使用实体框架6来处理数据库的MVC站点,我一直在尝试改变它,以便所有的操作都可以作为异步控制器运行,并且对数据库的调用是作为异步对等体运行的(例如, ToListAsync()而不是ToList())



我遇到的问题是,只需将查询更改为异步,就会使它们变得非常慢。

以下代码从我的数据上下文中获取Album对象的集合,并将其转换为相当简单的数据库连接:

  //获取相册
var albums = await this.context.Albums
.Where(x => x.Artist。 ID == artist.ID)
.ToListAsync();

这是创建的SQL:

  exec sp_executesql N'SELECT 
[Extent1]。[ID] AS [ID],
[Extent1]。[URL] AS [ URL],
[Extent1]。[ASIN] AS [ASIN],
[Extent1]。[标题] AS [标题],
[Extent1]。[ReleaseDate] AS [ReleaseDate] ,
[Extent1]。[AccurateDay] AS [AccurateDay],
[Extent1]。[AccurateMonth] AS [AccurateMonth],
[Extent1]。[Type] AS [Type],
[Extent1]。[Track] AS [Tracks],
[Extent1]。[MainCredits] AS [MainCredits],
[Extent1]。[SupportingCredits] AS [SupportingCredits],
[Extent1]。[描述] AS [描述],
[Extent1]。[Image] AS [Image],
[Extent1]。[HasImage] AS [HasImage],
[ Extent1] [创建] AS [创建],
[Extent1]。[Artist_ID] AS [Artist_ID]
FROM [dbo]。[相册] AS [Extent1]
WHERE [Extent1] 。[Artist_ID] = @ p__linq__0',N'@ p__linq__0 int',@ p__linq__0 = 134

As瘦gs go,这不是一个非常复杂的查询,但SQL服务器运行几乎要花费6秒钟。 SQL Server Profiler报告为5742ms完成。



如果我将代码更改为:

  //获取相册
var albums = this.context.Albums
.Where(x => x.Artist.ID ==艺术家.ID)
.ToList();

然后生成完全相同的SQL,但根据SQL Server Profiler,这仅运行在474ms。 / p>

相册表中的数据库大约有3500行,这不是很多,而在Artist_ID列中有一个索引,所以应该是相当快。



我知道异步有开销,但让事情慢了十倍,似乎有点陡峭!我在哪里出错?

解决方案

我发现这个问题很有趣,特别是因为我使用 Async 到处都有Ado.Net和EF 6.我希望有人给出这个问题的解释,但并没有发生。所以我试图在我身边重现这个问题。我希望有些人会发现这个有趣的。



第一个好消息:我转载它:)和差异是巨大的。有一个因素8 ...





首先我怀疑有什么处理



首先,在第一种情况下,我们沿着完整的通话路径只有3500次点击,这里我们有118 371.此外,你必须想象我没有放置的所有同步电话屏幕截图...



其次,在第一种情况下,我们对方法,这里我们有2 050 210个电话!这是17倍以上...(在大型1Mb阵列的测试中,是160倍)



此外还有:




  • 120 000 任务创建实例

  • 727 519 联锁电话

  • 290 569 监视电话

  • 98 283 ExecutionContext 实例,264 481捕获

  • 208 733 SpinLock 呼叫



我的猜测是缓冲是以异步方式(而不是一个好的)进行的,并行任务试图从TDS读取数据。创建太多任务只是为了解析二进制数据。作为一个初步的结论,我们可以说Async是伟大的,EF6是伟大的,但EF6在当前实现中异步的用法增加了一个主要的开销,在性能方面,线程端和CPU侧($ code> ToList()案例中为12%CPU使用率, ToListAsync 案例为8到10倍的工作...我运行在一个旧的i7 920)。



在进行一些测试时,我正在考虑这篇文章再次,我注意到我想念的东西:



对于.Net 4.5中的新异步方法,它们的行为与同步方法完全相同,除了一个显着的异常:ReadAsync以非顺序模式。



什么?!!!



所以我扩展了我的基准,将Ado.Net包括在常规/异步调用中,而$ code > CommandBehavior.SequentialAccess / CommandBehavior.Default ,这里有一个惊喜! :





我们与Ado.Net有完全相同的行为! Facepalm ...



我的最终结论是:EF 6实现中有一个错误。当对包含的表进行异步调用时,应将 CommandBehavior 切换到 SequentialAccess 二进制(最大)列。在Ado.Net方面,创建太多任务的问题会减缓进程。 EF问题是它不应该使用Ado.Net。



现在,您知道,而不是使用EF6异步方法,您最好以常规非异步方式调用EF,然后使用 TaskCompletionSource< T> 以异步方式返回结果。



注意1:由于可耻的错误,我编辑了我的帖子.. ..我已经通过网络进行了第一次测试,而不是在本地进行测试,有限的带宽使得结果失真。以下是更新的结果。



注意2:我没有将测试扩展到其他用例(例如: nvarchar(max) code>有很多数据),但有可能发生相同的行为。



注意3:通常为 ToList () case,是12%的CPU(我的CPU的1/8 = 1逻辑内核)。对于 ToListAsync()案例,最常见的是20%,就好像Scheduler无法使用所有的Treads。这可能是由于太多的任务创建,或者可能是TDS解析器的瓶颈,我不知道...


I’ve got an MVC site that’s using Entity Framework 6 to handle the database, and I’ve been experimenting with changing it so that everything runs as async controllers and calls to the database are ran as their async counterparts (eg. ToListAsync() instead of ToList())

The problem I’m having is that simply changing my queries to async has caused them to be incredibly slow.

The following code gets a collection of "Album" objects from my data context and is translated to a fairly simple database join:

// Get the albums
var albums = await this.context.Albums
    .Where(x => x.Artist.ID == artist.ID)
    .ToListAsync();

Here’s the SQL that’s created:

exec sp_executesql N'SELECT 
[Extent1].[ID] AS [ID], 
[Extent1].[URL] AS [URL], 
[Extent1].[ASIN] AS [ASIN], 
[Extent1].[Title] AS [Title], 
[Extent1].[ReleaseDate] AS [ReleaseDate], 
[Extent1].[AccurateDay] AS [AccurateDay], 
[Extent1].[AccurateMonth] AS [AccurateMonth], 
[Extent1].[Type] AS [Type], 
[Extent1].[Tracks] AS [Tracks], 
[Extent1].[MainCredits] AS [MainCredits], 
[Extent1].[SupportingCredits] AS [SupportingCredits], 
[Extent1].[Description] AS [Description], 
[Extent1].[Image] AS [Image], 
[Extent1].[HasImage] AS [HasImage], 
[Extent1].[Created] AS [Created], 
[Extent1].[Artist_ID] AS [Artist_ID]
FROM [dbo].[Albums] AS [Extent1]
WHERE [Extent1].[Artist_ID] = @p__linq__0',N'@p__linq__0 int',@p__linq__0=134

As things go, it’s not a massively complicated query, but it’s taking almost 6 seconds for SQL server to run it. SQL Server Profiler reports it as taking 5742ms to complete.

If I change my code to:

// Get the albums
var albums = this.context.Albums
    .Where(x => x.Artist.ID == artist.ID)
    .ToList();

Then the exact same SQL is generated, yet this runs in just 474ms according to SQL Server Profiler.

The database has around 3500 rows in the "Albums" table, which isn’t really very many, and has an index on the "Artist_ID" column, so it should be pretty fast.

I know that async has overheads, but making things go ten times slower seems a bit steep to me! Where am I going wrong here?

解决方案

I found this question very interesting, especially since I'm using async everywhere with Ado.Net and EF 6. I was hoping someone to give an explanation for this question, but it doesn't happened. So I tried to reproduce this problem on my side. I hope some of you will find this interesting.

First good news : I reproduced it :) And the difference is enormous. With a factor 8 ...

First I was suspecting something dealing with CommandBehavior, since I read an interesting article about async with Ado, saying this :

"Since non-sequential access mode has to store the data for the entire row, it can cause issues if you are reading a large column from the server (such as varbinary(MAX), varchar(MAX), nvarchar(MAX) or XML)."

I was suspecting ToList() calls to be CommandBehavior.SequentialAccess and async ones to be CommandBehavior.Default (non-sequential, which can cause issues). So I downloaded EF6's sources, and put breakpoints everywhere (where CommandBehavior where used, of course).

Result : nothing. All the calls are made with CommandBehavior.Default .... So I tried to step into EF code to understand what happens... and.. ooouch... I never see such a delegating code, everything seems lazy executed...

So I tried to do some profiling to understand what happens...

And I think I have something...

Here's the model to create the table I benchmarked, with 3500 lines inside of it, and 256 Kb random data in each varbinary(MAX). (EF 6.1 - CodeFirst - CodePlex) :

public class TestContext : DbContext
{
    public TestContext()
        : base(@"Server=(localdb)\\v11.0;Integrated Security=true;Initial Catalog=BENCH") // Local instance
    {
    }
    public DbSet<TestItem> Items { get; set; }
}

public class TestItem
{
    public int ID { get; set; }
    public string Name { get; set; }
    public byte[] BinaryData { get; set; }
}

And here's the code I used to create the test data, and benchmark EF.

using (TestContext db = new TestContext())
{
    if (!db.Items.Any())
    {
        foreach (int i in Enumerable.Range(0, 3500)) // Fill 3500 lines
        {
            byte[] dummyData = new byte[1 << 18];  // with 256 Kbyte
            new Random().NextBytes(dummyData);
            db.Items.Add(new TestItem() { Name = i.ToString(), BinaryData = dummyData });
        }
        await db.SaveChangesAsync();
    }
}

using (TestContext db = new TestContext())  // EF Warm Up
{
    var warmItUp = db.Items.FirstOrDefault();
    warmItUp = await db.Items.FirstOrDefaultAsync();
}

Stopwatch watch = new Stopwatch();
using (TestContext db = new TestContext())
{
    watch.Start();
    var testRegular = db.Items.ToList();
    watch.Stop();
    Console.WriteLine("non async : " + watch.ElapsedMilliseconds);
}

using (TestContext db = new TestContext())
{
    watch.Restart();
    var testAsync = await db.Items.ToListAsync();
    watch.Stop();
    Console.WriteLine("async : " + watch.ElapsedMilliseconds);
}

using (var connection = new SqlConnection(CS))
{
    await connection.OpenAsync();
    using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
    {
        watch.Restart();
        List<TestItem> itemsWithAdo = new List<TestItem>();
        var reader = await cmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess);
        while (await reader.ReadAsync())
        {
            var item = new TestItem();
            item.ID = (int)reader[0];
            item.Name = (String)reader[1];
            item.BinaryData = (byte[])reader[2];
            itemsWithAdo.Add(item);
        }
        watch.Stop();
        Console.WriteLine("ExecuteReaderAsync SequentialAccess : " + watch.ElapsedMilliseconds);
    }
}

using (var connection = new SqlConnection(CS))
{
    await connection.OpenAsync();
    using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
    {
        watch.Restart();
        List<TestItem> itemsWithAdo = new List<TestItem>();
        var reader = await cmd.ExecuteReaderAsync(CommandBehavior.Default);
        while (await reader.ReadAsync())
        {
            var item = new TestItem();
            item.ID = (int)reader[0];
            item.Name = (String)reader[1];
            item.BinaryData = (byte[])reader[2];
            itemsWithAdo.Add(item);
        }
        watch.Stop();
        Console.WriteLine("ExecuteReaderAsync Default : " + watch.ElapsedMilliseconds);
    }
}

using (var connection = new SqlConnection(CS))
{
    await connection.OpenAsync();
    using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
    {
        watch.Restart();
        List<TestItem> itemsWithAdo = new List<TestItem>();
        var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess);
        while (reader.Read())
        {
            var item = new TestItem();
            item.ID = (int)reader[0];
            item.Name = (String)reader[1];
            item.BinaryData = (byte[])reader[2];
            itemsWithAdo.Add(item);
        }
        watch.Stop();
        Console.WriteLine("ExecuteReader SequentialAccess : " + watch.ElapsedMilliseconds);
    }
}

using (var connection = new SqlConnection(CS))
{
    await connection.OpenAsync();
    using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
    {
        watch.Restart();
        List<TestItem> itemsWithAdo = new List<TestItem>();
        var reader = cmd.ExecuteReader(CommandBehavior.Default);
        while (reader.Read())
        {
            var item = new TestItem();
            item.ID = (int)reader[0];
            item.Name = (String)reader[1];
            item.BinaryData = (byte[])reader[2];
            itemsWithAdo.Add(item);
        }
        watch.Stop();
        Console.WriteLine("ExecuteReader Default : " + watch.ElapsedMilliseconds);
    }
}

For the regular EF call (.ToList()), the profiling seems "normal" and is easy to read :

Here we find the 8.4 seconds we have with the Stopwatch (profiling slow downs the perfs). We also find HitCount = 3500 along the call path, witch is consistent with the 3500 lines in the test. On the TDS parser side, things start to became worse since we read 118 353 calls on TryReadByteArray() method, which is were the buffering loop occurs. (an average 33.8 calls for each byte[] of 256kb)

For the async case, it's really really different.... First, the .ToListAsync() call is scheduled on the ThreadPool, and then awaited. Nothing amazing here. But, now, here's the async hell on the ThreadPool :

First, in the first case we were having just 3500 hit counts along the full call path, here we have 118 371. Moreover, you have to imagine all the synchronization calls I didn't put on the screenshoot...

Second, in the first case, we were having "just 118 353" calls to the TryReadByteArray() method, here we have 2 050 210 calls ! It's 17 times more... (on a test with large 1Mb array, it's 160 times more)

Moreover there are :

  • 120 000 Task instances created
  • 727 519 Interlocked calls
  • 290 569 Monitor calls
  • 98 283 ExecutionContext instances, with 264 481 Captures
  • 208 733 SpinLock calls

My guess is the buffering is made in an async way (and not a good one), with parallel Tasks trying to read data from the TDS. Too many Task are created just to parse the binary data.

As a preliminary conclusion, we can say Async is great, EF6 is great, but EF6's usages of async in it's current implementation adds a major overhead, on the performance side, the Threading side, and the CPU side (12% CPU usage in the ToList() case and 20% in the ToListAsync case for a 8 to 10 times longer work... I run it on an old i7 920).

While doings some tests, I was thinking about this article again and I notice something I miss :

"For the new asynchronous methods in .Net 4.5, their behavior is exactly the same as with the synchronous methods, except for one notable exception: ReadAsync in non-sequential mode."

What ?!!!

So I extend my benchmarks to include Ado.Net in regular / async call, and with CommandBehavior.SequentialAccess / CommandBehavior.Default, and here's a big surprise ! :

We have the exact same behavior with Ado.Net !!! Facepalm...

My definitive conclusion is : there's a bug in EF 6 implementation. It should toggle the CommandBehavior to SequentialAccess when an async call is made over a table containing a binary(max) column. The problem of creating too many Task, slowing down the process, is on the Ado.Net side. The EF problem is it don't uses Ado.Net as it should.

Now you know instead of using the EF6 async methods, you would better have to call EF in a regular non-async way, and then use a TaskCompletionSource<T> to return the result in an async way.

Note 1 : I edited my post because of a shameful error.... I've done my first test over the network, not locally, and the limited bandwidth have distorted the results. Here are the updated results.

Note 2 : I didn't extends my test to other uses cases (ex : nvarchar(max) with a lot of data), but there are chances the same behavior happens.

Note 3 : Something usual for the ToList() case, is the 12% CPU (1/8 of my CPU = 1 logical core). Something unusual is the maximum 20% for the ToListAsync() case, as if the Scheduler could not use all the Treads. It's probably due to the too many Task created, or maybe a bottleneck in TDS parser, I don't know...

这篇关于实体框架异步操作需要十倍的时间才能完成的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆