如何使用EF Core在转换后的SQL中获得COUNT DISTINCT [英] How to get COUNT DISTINCT in translated SQL with EF Core

查看:167
本文介绍了如何使用EF Core在转换后的SQL中获得COUNT DISTINCT的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想让EF核心将 .Select(x => x.property).Distinct().Count()转换为类似

I want to have EF core translate .Select(x=>x.property).Distinct().Count() into something like

SELECT COUNT(DISTINCT property)

让我们举个例子.假设我有一个数据库表,其中包含PersonID(long),VisitStart(datetime2)和VisitEnd(datetime2).如果我想获取特定人员访问的不同天数,则可以编写类似

Let's take an example. Let's say I have a DB table with PersonID(long), VisitStart(datetime2) and VisitEnd(datetime2). If i want to get the number of distinct days a particular person has visited, then I could write SQL like

SELECT COUNT(DISTINCT CONVERT(date, VisitStart)) FROM myTable GROUP BY PersonID

但是使用EF核心和

MyTable
    .GroupBy(x=>x.PersonID)
    .Select(x=> new 
    {
        Count = x.Select(y=>y.VisitStart.Date).Distinct().Count()
    })

提供正确结果的

会转换为该SQL

which gives the right results, translates into this SQL

SELECT [x].[PersonID], [x].[VisitStart], [x].[VisitEnd]
FROM [myTable] as [x]
ORDER BY [x].[PersonID]

在任何地方都没有GROUP BY且没有DISTINCT或COUNT,因此必须在内存中进行分组,这在对具有数百万条可能要从数据库中提取的记录的表进行操作时并不理想.

There is no GROUP BY and no DISTINCT or COUNT anywhere so the grouping must be done in memory, which is not ideal when operating on a table that has millions of records that potentially has to be pulled from DB.

所以任何人都知道如何获得EF核心来将 .Select(...).Distinct().Count()转换为 SELECT COUNT(DISTINCT ...)

So anyone know how to get EF core to translate a .Select(...).Distinct().Count() into SELECT COUNT(DISTINCT ...)

推荐答案

更新(EF Core 5.x):

从版本5.0开始,表达式 Select(expr).Distinct().Count()现在已被EF Core识别并转换为相应的SQL COUNT(DISTINCT expr)),因此无需修改即可使用原始LINQ查询.

Starting with version 5.0, expression Select(expr).Distinct().Count() is now recognized by EF Core and translated to the corresponding SQL COUNT(DISTINCT expr)), hence the original LINQ query can be used w/o modification.

原始版本(EF Core 2.x),由于查询管道重写,该解决方案不适用于EF Core 3.x:

EF(6和Core)历史上不支持此标准SQL构造.最可能的原因是缺少标准的LINQ方法以及将 Select(expr).Distinct().Count()映射到它的技术困难.

EF (6 and Core) historically does not support this standard SQL construct. Most likely because of the lack of standard LINQ method and technical difficulties of mapping Select(expr).Distinct().Count() to it.

好处是,EF Core可扩展,方法是使用自定义派生的实现替换其许多内部服务,以覆盖所需的行为.不容易,需要很多管道代码,但是可行.

The good thing is that EF Core is extendable by replacing many of its internal services with custom derived implementations to override the required behaviors. Not easy, requires a lot of plumbing code, but doable.

因此,想法是添加并使用像这样的简单自定义 CountDistinct 方法

So the idea is to add and use simple custom CountDistinct methods like this

public static int CountDistinct<T, TKey>(this IQueryable<T> source, Expression<Func<T, TKey>> keySelector)
    => source.Select(keySelector).Distinct().Count();

public static int CountDistinct<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
    => source.Select(keySelector).Distinct().Count();

,然后让EF Core以某种方式将它们转换为SQL.实际上,EF Core提供了一种定义(甚至自定义转换)数据库标量函数的简单方法,但是不幸的是,它不能用于具有单独处理流水线的聚合函数.因此,我们需要深入研究EF Core基础架构.

and let EF Core somehow translate them to SQL. In fact EF Core provides a simple way of defining (and even custom translating) database scalar functions, but unfortunately this cannot be used for aggregate functions which have separate processing pipeline. So we need to dig deeply into EF Core infrastructure.

最后提供了EF Core 2.x管道的完整代码.不确定是否值得努力,因为EF Core 3.0将使用完整的重写查询流程管道.但这很有趣,而且我很确定可以为新的(希望更简单的)管道对其进行更新.

The full code for that for EF Core 2.x pipeline is provided at the end. Not sure if it's worth efforts because EF Core 3.0 will use complete rewritten query process pipeline. But it was interesting and also I'm pretty sure it can be updated for the new (hopefully simpler) pipeline.

无论如何,您所需要做的就是将代码复制/粘贴到项目中的新代码文件中,并将以下内容添加到上下文中 OnConfiguring override

Anyway, all you need is to copy/paste the code into a new code file in the project, add the following to the context OnConfiguring override

optionsBuilder.UseCustomExtensions();

这会将功能插入EF Core基础架构,然后像这样查询

which will plug the functionality into EF Core infrastructure, and then query like this

var result = db.MyTable
    .GroupBy(x => x.PersonID, x => new { VisitStartDate = x.VisitStart.Date })
    .Select(g => new
    {
        Count = g.CountDistinct(x => x.VisitStartDate)
    }).ToList();

将幸运地翻译为所需的

SELECT COUNT(DISTINCT(CONVERT(date, [x].[VisitStart]))) AS [Count]
FROM [MyTable] AS [x]
GROUP BY [x].[PersonID]

请注意预先选择聚合方法所需的表达式.这是当前针对所有聚合方法(不仅是我们的方法)的EF Core限制/要求.

Note the preselecting the expression needed for aggregate method. This is current EF Core limitation/requirement for all aggregate methods, not just ours.

最后,完成魔术的完整代码:

Finally, the full code that does the magic:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Internal;
using Microsoft.EntityFrameworkCore.Metadata;
using Microsoft.EntityFrameworkCore.Query;
using Microsoft.EntityFrameworkCore.Query.Expressions;
using Microsoft.EntityFrameworkCore.Query.ExpressionVisitors;
using Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal;
using Microsoft.EntityFrameworkCore.Query.Internal;
using Remotion.Linq;
using Remotion.Linq.Clauses;
using Remotion.Linq.Clauses.ResultOperators;
using Remotion.Linq.Clauses.StreamedData;
using Remotion.Linq.Parsing.Structure.IntermediateModel;

namespace Microsoft.EntityFrameworkCore
{
    public static partial class CustomExtensions
    {
        public static int CountDistinct<T, TKey>(this IQueryable<T> source, Expression<Func<T, TKey>> keySelector)
            => source.Select(keySelector).Distinct().Count();

        public static int CountDistinct<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
            => source.Select(keySelector).Distinct().Count();

        public static DbContextOptionsBuilder UseCustomExtensions(this DbContextOptionsBuilder optionsBuilder)
            => optionsBuilder
                .ReplaceService<INodeTypeProviderFactory, CustomNodeTypeProviderFactory>()
                .ReplaceService<IRelationalResultOperatorHandler, CustomRelationalResultOperatorHandler>();
    }
}

namespace Remotion.Linq.Parsing.Structure.IntermediateModel
{
    public sealed class CountDistinctExpressionNode : ResultOperatorExpressionNodeBase
    {
        public CountDistinctExpressionNode(MethodCallExpressionParseInfo parseInfo, LambdaExpression optionalSelector)
            : base(parseInfo, null, optionalSelector) { }
        public static IEnumerable<MethodInfo> GetSupportedMethods()
            => typeof(CustomExtensions).GetTypeInfo().GetDeclaredMethods("CountDistinct");
        public override Expression Resolve(ParameterExpression inputParameter, Expression expressionToBeResolved, ClauseGenerationContext clauseGenerationContext)
            => throw CreateResolveNotSupportedException();
        protected override ResultOperatorBase CreateResultOperator(ClauseGenerationContext clauseGenerationContext)
            => new CountDistinctResultOperator();
    }
}

namespace Remotion.Linq.Clauses.ResultOperators
{
    public sealed class CountDistinctResultOperator : ValueFromSequenceResultOperatorBase
    {
        public override ResultOperatorBase Clone(CloneContext cloneContext) => new CountDistinctResultOperator();
        public override StreamedValue ExecuteInMemory<T>(StreamedSequence input) => throw new NotSupportedException();
        public override IStreamedDataInfo GetOutputDataInfo(IStreamedDataInfo inputInfo) => new StreamedScalarValueInfo(typeof(int));
        public override string ToString() => "CountDistinct()";
        public override void TransformExpressions(Func<Expression, Expression> transformation) { }
    }
}

namespace Microsoft.EntityFrameworkCore.Query.Internal
{
    public class CustomNodeTypeProviderFactory : DefaultMethodInfoBasedNodeTypeRegistryFactory
    {
        public CustomNodeTypeProviderFactory()
            => RegisterMethods(CountDistinctExpressionNode.GetSupportedMethods(), typeof(CountDistinctExpressionNode));
    }

    public class CustomRelationalResultOperatorHandler : RelationalResultOperatorHandler
    {
        private static readonly ISet<Type> AggregateResultOperators = (ISet<Type>)
            typeof(RequiresMaterializationExpressionVisitor).GetField("_aggregateResultOperators", BindingFlags.NonPublic | BindingFlags.Static)
            .GetValue(null);

        static CustomRelationalResultOperatorHandler()
            => AggregateResultOperators.Add(typeof(CountDistinctResultOperator));

        public CustomRelationalResultOperatorHandler(IModel model, ISqlTranslatingExpressionVisitorFactory sqlTranslatingExpressionVisitorFactory, ISelectExpressionFactory selectExpressionFactory, IResultOperatorHandler resultOperatorHandler)
            : base(model, sqlTranslatingExpressionVisitorFactory, selectExpressionFactory, resultOperatorHandler)
        { }

        public override Expression HandleResultOperator(EntityQueryModelVisitor entityQueryModelVisitor, ResultOperatorBase resultOperator, QueryModel queryModel)
            => resultOperator is CountDistinctResultOperator ?
                HandleCountDistinct(entityQueryModelVisitor, resultOperator, queryModel) :
                base.HandleResultOperator(entityQueryModelVisitor, resultOperator, queryModel);

        private Expression HandleCountDistinct(EntityQueryModelVisitor entityQueryModelVisitor, ResultOperatorBase resultOperator, QueryModel queryModel)
        {
            var queryModelVisitor = (RelationalQueryModelVisitor)entityQueryModelVisitor;
            var selectExpression = queryModelVisitor.TryGetQuery(queryModel.MainFromClause);
            var inputType = queryModel.SelectClause.Selector.Type;
            if (CanEvalOnServer(queryModelVisitor)
                && selectExpression != null
                && selectExpression.Projection.Count == 1)
            {
                PrepareSelectExpressionForAggregate(selectExpression, queryModel);
                var expression = selectExpression.Projection[0];
                var subExpression = new SqlFunctionExpression(
                    "DISTINCT", inputType, new[] { expression.UnwrapAliasExpression() });
                selectExpression.SetProjectionExpression(new SqlFunctionExpression(
                    "COUNT", typeof(int), new[] { subExpression }));
                return new ResultTransformingExpressionVisitor<int>(
                    queryModelVisitor.QueryCompilationContext, false)
                    .Visit(queryModelVisitor.Expression);
            }
            else
            {
                queryModelVisitor.RequiresClientResultOperator = true;
                var typeArgs = new[] { inputType };
                var distinctCall = Expression.Call(
                    typeof(Enumerable), "Distinct", typeArgs,
                    queryModelVisitor.Expression);
                return Expression.Call(
                    typeof(Enumerable), "Count", typeArgs,
                    distinctCall);
            }
        }

        private static bool CanEvalOnServer(RelationalQueryModelVisitor queryModelVisitor) =>
            !queryModelVisitor.RequiresClientEval && !queryModelVisitor.RequiresClientSelectMany &&
            !queryModelVisitor.RequiresClientJoin && !queryModelVisitor.RequiresClientFilter &&
            !queryModelVisitor.RequiresClientOrderBy && !queryModelVisitor.RequiresClientResultOperator &&
            !queryModelVisitor.RequiresStreamingGroupResultOperator;
    }
}

这篇关于如何使用EF Core在转换后的SQL中获得COUNT DISTINCT的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆