实体框架包含排序依据随机产生的重复数据 [英] Entity Framework Include OrderBy random generates duplicate data

查看:119
本文介绍了实体框架包含排序依据随机产生的重复数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当我检索,包括一些儿童(通过.INCLUDE)数据库的项目清单,并责令随机,EF给了我一个意外的结果。我创建/克隆除了项目。

要更好地解释我自己,我创建了一个小而简单的EF codeFirst项目重现该问题。
首先我给你的code为这个项目。

项目

创建一个基本的MVC3项目,并通过添加的NuGet的EntityFramework.SqlServerCompact包。结果
这增加了以下软件包的最新版本:


  • 的EntityFramework V4.3.0

  • SqlServerCompact v4.0.8482.1

  • EntityFramework.SqlServerCompact v4.1.8482.2

  • WebActivator 1.5版

模型和的DbContext

使用System.Collections.Generic;
使用System.Data.Entity的;命名空间RandomWithInclude.Models
{
    公共类PeopleContext:的DbContext
    {
        公共DbSet<&人GT;人{搞定;组; }
        公共DbSet<地址>地址{搞定;组; }
    }    公共类Person
    {
        公众诠释ID {搞定;组; }
        公共字符串名称{;组; }        公共虚拟的ICollection<地址>地址{搞定;组; }
    }    公共类地址
    {
        公众诠释ID {搞定;组; }
        公共字符串AdressLine {搞定;组; }        公共虚拟人人{搞定;组; }
    }
}

的DB设置和种子数据:EF.SqlServerCompact.cs

使用System.Collections.Generic;
使用System.Data.Entity的;
使用System.Data.Entity.Infrastructure;
使用RandomWithInclude.Models;[装配:WebActivator preApplicationStartMethod(typeof运算(RandomWithInclude.App_Start.EF),开始)命名空间RandomWithInclude.App_Start
{
    公共静态类EF
    {
        公共静态无效的start()
        {
            Database.DefaultConnectionFactory =新SqlCeConnectionFactory(System.Data.SqlServerCe.4.0);
            Database.SetInitializer(新DbInitializer());
        }
    }
    公共类DbInitializer:DropCreateDatabaseAlways< PeopleContext>
    {
        保护覆盖无效的种子(PeopleContext上下文)
        {
            VAR地址1 =新的地址{AdressLine =街1号,城1};
            VAR地址2 =新地址{AdressLine =街2号,都市2};
            VAR地址3 =新的地址{AdressLine =街3号,城3};
            VAR的地址4 =新地址{AdressLine =街4号,城市4};
            VAR address5 =新地址{AdressLine =5街,城市5};
            context.Addresses.Add(地址1);
            context.Addresses.Add(地址2);
            context.Addresses.Add(地址3);
            context.Addresses.Add(地址4);
            context.Addresses.Add(address5);
            VAR PERSON1 =新的Person {名称=1人,地址=新的List<地址> {地址1,地址}};
            VAR PERSON2 =新的Person {名称=2人,地址=新的List<地址> {地址3}};
            VAR Person3可能=新的Person {名称=3人,地址=新的List<地址> {地址4,address5}};
            context.Persons.Add(PERSON1);
            context.Persons.Add(PERSON2);
            context.Persons.Add(Person3可能);
        }
    }
}

控制器:HomeController.cs

使用系统;
使用System.Data.Entity的;
使用System.Linq的;
使用System.Web.Mvc;
使用RandomWithInclude.Models;命名空间RandomWithInclude.Controllers
{
    公共类HomeController的:控制器
    {
        公众的ActionResult指数()
        {
            变种DB =新PeopleContext();
            变种人= db.Persons
                                .INCLUDE(P => p.Addresses)
                                .OrderBy(p值=> Guid.NewGuid());            返回查看(persons.ToList());
        }
    }
}

视图:Index.cshtml

@using RandomWithInclude.Models
@model的IList<&人GT;< UL>
    @foreach(VAR人模型)
    {
        <立GT;
            @ person.Name
        < /李>
    }
< / UL>

这应该是所有的,你的应用程序应该编译:)


问题

正如你可以看到,我们有2个简单的模型(Person和Address)和人可以有多个地址。结果
我们种子生成的数据库 3人和5个地址。结果
如果我们从数据库中的所有人员,包括地址和随机的结果,只是打印出这些人的名字,这就是它一切都错了。

因此​​,我有时会收到4人,有时是5,有时3,我希望3始终。结果
例如:


  • 人员1

  • 人员3

  • 人员1

  • 人员3

  • 2人

所以..它的复制/克隆的数据!而这还不是酷..结果
这似乎只是EF失去跟踪什么地址是该人的子女。

生成的SQL查询是这样的:

SELECT
    [PROJECT1]。[ID] AS [ID]
    [PROJECT1]。[名] AS [名]
    [PROJECT1]。[C2] AS [C1]
    [PROJECT1]。[ID1] AS [ID1]
    [PROJECT1]。[AdressLine] AS [AdressLine]
    [PROJECT1]。[PERSON_ID] AS [PERSON_ID]
FROM(SELECT
    NEWID()AS [C1]
    [Extent1]。[ID] AS [ID]
    [Extent1]。[名] AS [名]
    [Extent2]。[ID] AS [ID1]
    [Extent2]。[AdressLine] AS [AdressLine]
    [Extent2]。[PERSON_ID] AS [PERSON_ID]
    CASE WHEN([Extent2]。[ID] IS NULL)再投(NULL AS INT)ELSE 1 END AS [C2]
    FROM [人物] AS [Extent1]
    LEFT OUTER JOIN [地址]作为[Extent1] [Extent2]。[ID] = [Extent2]。[PERSON_ID]
)AS [PROJECT1]
ORDER BY [PROJECT1]。[C1] ASC,[PROJECT1]。[ID] ASC,[PROJECT1]。[C2] ASC

的变通办法


  1. 如果我删除 .INCLUDE(P => p.Addresses)从查询,一切顺利。但当然的地址没有加载并访问该集合将为数据库每次新的呼叫。

  2. 变种人= db.Persons.Include(P =&GT:
  3. 我可以先通过.OrderBy前刚刚加入.ToList()......像这样从数据库中获取数据,并随机后来; p.Addresses).ToList(。)排序依据(p => Guid.NewGuid());

没有任何人有它为什么发生这样的任何想法?结果
也许这是一个错误的SQL生成?


解决方案

TL;博士:这里有一个漏水的抽象。对我们来说,包含是一个简单的指令,贴的的东西在每个单独的返回一行。但EF的实施包含通过在客户端返回一整排的每个人地址组合,重新组装完成。排序方式挥发性值导致这些行成为洗牌,分崩离析的EF是依托组。


当我们看看 ToTraceString()此LINQ:

 变种人= c.People.Include(地址);
 //注:看不到排序依据!

我们看到

  SELECT
[PROJECT1]。[ID] AS [ID]
[PROJECT1]。[名] AS [名]
[PROJECT1]。[C1] AS [C1]
[PROJECT1]。[为Id1] AS [Id1的]
[PROJECT1]。[数据] AS [数据]
[PROJECT1]。[PERSONID] AS [PERSONID]
FROM(SELECT
    [Extent1]。[ID] AS [ID]
    [Extent1]。[名] AS [名]
    [Extent2]。[ID] AS [Id1的]
    [Extent2]。[PERSONID] AS [PERSONID]
    [Extent2]。[数据] AS [数据]
    CASE WHEN([Extent2]。[ID] IS NULL)再投(NULL AS INT)ELSE 1 END AS [C1]
    FROM [人] AS [Extent1]
    LEFT OUTER JOIN [地址] AS [Extent2] ON [Extent1]。[ID] = [Extent2]。[PERSONID]
)AS [PROJECT1]
ORDER BY [PROJECT1]。[ID] ASC,[PROJECT1] [C1] ASC

所以,我们得到 N 行对每个 A ,加上 1 行对每个 P 无任何 A 秒。

添加排序依据子句,但放东西到订单,通过在开始有序列:

 变种人= c.People.Include(地址)排序依据(P => Guid.NewGuid());

  SELECT
[PROJECT1]。[ID] AS [ID]
[PROJECT1]。[名] AS [名]
[PROJECT1]。[C2] AS [C1]
[PROJECT1]。[为Id1] AS [Id1的]
[PROJECT1]。[数据] AS [数据]
[PROJECT1]。[PERSONID] AS [PERSONID]
FROM(SELECT
    NEWID()AS [C1]
    [Extent1]。[ID] AS [ID]
    [Extent1]。[名] AS [名]
    [Extent2]。[ID] AS [Id1的]
    [Extent2]。[PERSONID] AS [PERSONID]
    [Extent2]。[数据] AS [数据]
    CASE WHEN([Extent2]。[ID] IS NULL)再投(NULL AS INT)ELSE 1 END AS [C2]
    FROM [人] AS [Extent1]
    LEFT OUTER JOIN [地址] AS [Extent2] ON [Extent1]。[ID] = [Extent2]。[PERSONID]
)AS [PROJECT1]
ORDER BY [PROJECT1]。[C1] ASC,[PROJECT1]。[ID] ASC,[PROJECT1]。[C2] ASC

所以你的情况,其中责令按东西不是一个 P 的属性,但反而易挥发,因此可作为不同的的相同 P 后,整个事情分崩离析不同的 PA 记录


我不知道在哪里上的工作-AS-预期~~~铸铁错误连续这种行为下降。但至少现在我们知道了。

When I retrieve a list of items from a database including some children (via .Include), and order the randomly, EF gives me an unexpected result.. I creates/clones addition items..

To explain myself better, I've created a small and simple EF CodeFirst project to reproduce the problem. First i shall give you the code for this project.

The project

Create a basic MVC3 project and add the EntityFramework.SqlServerCompact package via Nuget.
That adds the latest versions of the following packages:

  • EntityFramework v4.3.0
  • SqlServerCompact v4.0.8482.1
  • EntityFramework.SqlServerCompact v4.1.8482.2
  • WebActivator v1.5

The Models and DbContext

using System.Collections.Generic;
using System.Data.Entity;

namespace RandomWithInclude.Models
{
    public class PeopleContext : DbContext
    {
        public DbSet<Person> Persons { get; set; }
        public DbSet<Address> Addresses { get; set; }
    }

    public class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }

        public virtual ICollection<Address> Addresses { get; set; }
    }

    public class Address
    {
        public int ID { get; set; }
        public string AdressLine { get; set; }

        public virtual Person Person { get; set; }
    }
}

The DB Setup and Seed data: EF.SqlServerCompact.cs

using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using RandomWithInclude.Models;

[assembly: WebActivator.PreApplicationStartMethod(typeof(RandomWithInclude.App_Start.EF), "Start")]

namespace RandomWithInclude.App_Start
{
    public static class EF
    {
        public static void Start()
        {
            Database.DefaultConnectionFactory = new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");
            Database.SetInitializer(new DbInitializer());
        }
    }
    public class DbInitializer : DropCreateDatabaseAlways<PeopleContext>
    {
        protected override void Seed(PeopleContext context)
        {
            var address1 = new Address {AdressLine = "Street 1, City 1"};
            var address2 = new Address {AdressLine = "Street 2, City 2"};
            var address3 = new Address {AdressLine = "Street 3, City 3"};
            var address4 = new Address {AdressLine = "Street 4, City 4"};
            var address5 = new Address {AdressLine = "Street 5, City 5"};
            context.Addresses.Add(address1);
            context.Addresses.Add(address2);
            context.Addresses.Add(address3);
            context.Addresses.Add(address4);
            context.Addresses.Add(address5);
            var person1 = new Person {Name = "Person 1", Addresses = new List<Address> {address1, address2}};
            var person2 = new Person {Name = "Person 2", Addresses = new List<Address> {address3}};
            var person3 = new Person {Name = "Person 3", Addresses = new List<Address> {address4, address5}};
            context.Persons.Add(person1);
            context.Persons.Add(person2);
            context.Persons.Add(person3);
        }
    }
}

The controller: HomeController.cs

using System;
using System.Data.Entity;
using System.Linq;
using System.Web.Mvc;
using RandomWithInclude.Models;

namespace RandomWithInclude.Controllers
{
    public class HomeController : Controller
    {
        public ActionResult Index()
        {
            var db = new PeopleContext();
            var persons = db.Persons
                                .Include(p => p.Addresses)
                                .OrderBy(p => Guid.NewGuid());

            return View(persons.ToList());
        }
    }
}

The View: Index.cshtml

@using RandomWithInclude.Models
@model IList<Person>

<ul>
    @foreach (var person in Model)
    {
        <li>
            @person.Name
        </li>
    }
</ul>

this should be all, and you application should compile :)


The problem

As you can see, we have 2 straightforward models (Person and Address) and Person can have multiple Addresses.
We seed the generated database 3 persons and 5 addresses.
If we get all the persons from the database, including the addresses and randomize the results and just print out the names of those persons, that's where it all goes wrong.

As a result, i sometimes get 4 persons, sometimes 5 and sometimes 3, and i expect 3. Always.
e.g.:

  • Person 1
  • Person 3
  • Person 1
  • Person 3
  • Person 2

So.. it's copying/cloning data! And that's not cool..
It just seems that EF looses track of what addresses are a child of which person..

The generated SQL query is this:

SELECT 
    [Project1].[ID] AS [ID], 
    [Project1].[Name] AS [Name], 
    [Project1].[C2] AS [C1], 
    [Project1].[ID1] AS [ID1], 
    [Project1].[AdressLine] AS [AdressLine], 
    [Project1].[Person_ID] AS [Person_ID]
FROM ( SELECT 
    NEWID() AS [C1], 
    [Extent1].[ID] AS [ID], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[ID] AS [ID1], 
    [Extent2].[AdressLine] AS [AdressLine], 
    [Extent2].[Person_ID] AS [Person_ID], 
    CASE WHEN ([Extent2].[ID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [People] AS [Extent1]
    LEFT OUTER JOIN [Addresses] AS [Extent2] ON [Extent1].[ID] = [Extent2].[Person_ID]
)  AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[ID] ASC, [Project1].[C2] ASC

Workarounds

  1. If i remove the .Include(p =>p.Addresses) from the query, everything goes fine. but of course the addresses aren't loaded and accessing that collection will make a new call to the database every time.
  2. I can first get the data from the database and randomize later by just adding a .ToList() before the .OrderBy.. like this: var persons = db.Persons.Include(p => p.Addresses).ToList().OrderBy(p => Guid.NewGuid());

Does anybody have any idea of why it is happening like this?
Might this be a bug in the SQL generation?

解决方案

tl;dr: There's a leaky abstraction here. To us, Include is a simple instruction to stick a collection of things onto each single returned Person row. But EF's implementation of Include is done by returning a whole row for each Person-Address combo, and reassembling at the client. Ordering by a volatile value causes those rows to become shuffled, breaking apart the Person groups that EF is relying on.


When we have a look at ToTraceString() for this LINQ:

 var people = c.People.Include("Addresses");
 // Note: no OrderBy in sight!

we see

SELECT 
[Project1].[Id] AS [Id], 
[Project1].[Name] AS [Name], 
[Project1].[C1] AS [C1], 
[Project1].[Id1] AS [Id1], 
[Project1].[Data] AS [Data], 
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[Id] AS [Id1], 
    [Extent2].[PersonId] AS [PersonId], 
    [Extent2].[Data] AS [Data], 
    CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
    FROM  [Person] AS [Extent1]
    LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
)  AS [Project1]
ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

So we get n rows for each A, plus 1 row for each P without any As.

Adding an OrderBy clause, however, puts the thing-to-order-by at the start of the ordered columns:

var people = c.People.Include("Addresses").OrderBy(p => Guid.NewGuid());

gives

SELECT 
[Project1].[Id] AS [Id], 
[Project1].[Name] AS [Name], 
[Project1].[C2] AS [C1], 
[Project1].[Id1] AS [Id1], 
[Project1].[Data] AS [Data], 
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT 
    NEWID() AS [C1], 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[Id] AS [Id1], 
    [Extent2].[PersonId] AS [PersonId], 
    [Extent2].[Data] AS [Data], 
    CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [Person] AS [Extent1]
    LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
)  AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[Id] ASC, [Project1].[C2] ASC

So in your case, where the ordered-by-thing is not a property of a P, but is instead volatile, and therefore can be different for different P-A records of the same P, the whole thing falls apart.


I'm not sure where on the working-as-intended ~~~ cast-iron bug continuum this behaviour falls. But at least now we know about it.

这篇关于实体框架包含排序依据随机产生的重复数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆