Suggestions for optimizing passing expressions as method parameters


Problem description

I'm a great fan of the relatively recent trend of using lambda expressions instead of strings for indicating properties in, for instance, ORM mapping. Strongly typed >>>> Stringly typed.

To be clear, this is what I'm talking about:

builder.Entity<WebserviceAccount>()
    .HasTableName( "webservice_accounts" )
    .HasPrimaryKey( _ => _.Id )
    .Property( _ => _.Id ).HasColumnName( "id" )
    .Property( _ => _.Username ).HasColumnName( "Username" ).HasLength( 255 )
    .Property( _ => _.Password ).HasColumnName( "Password" ).HasLength( 255 )
    .Property( _ => _.Active ).HasColumnName( "Active" );

In some recent work, I needed to cache things based on an expression, and to do that I needed to create a key from the expression. Like so:

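using System;
using System.Linq.Expressions;
using System.Reflection;

static string GetExprKey( Expression<Func<Bar,int>> expr )
{
    string key = "";
    Expression e = expr.Body;

    // Walk the member chain from the outermost property inwards.
    while( e.NodeType == ExpressionType.MemberAccess )
    {
        var me = (MemberExpression)e;
        // A hard cast throws InvalidCastException for fields instead of a NullReferenceException.
        key += "<" + ((PropertyInfo)me.Member).Name;
        e = me.Expression;
    }

    // Anything other than the lambda parameter at the root fails the cast.
    key += ":" + ((ParameterExpression)e).Type.Name;

    return key;
}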

Notes: The StringBuilder version performs almost identically. The function is only supposed to work for expressions of the form x => x.A.B.C; anything else is an error and should fail. Yes, I need to cache. No, compiling the expression is much slower than key generation/comparison in my case.
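The same traversal can be written generically and made to fail with an explicit message for unsupported shapes. This is only a minimal sketch, not the code that was benchmarked: `ExprKey.For` is a made-up name, and `Foo`/`Bar` are copies of the benchmark classes so the snippet stands alone.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Text;

sealed class Foo { public int Val { get; set; } }
sealed class Bar { public Foo F { get; set; } }

static class ExprKey
{
    // Hypothetical helper: builds a key such as "<Val<F:Bar" for b => b.F.Val.
    public static string For<T, TResult>( Expression<Func<T, TResult>> expr )
    {
        var key = new StringBuilder();
        Expression e = expr.Body;

        // Walk the member chain from the outermost access inwards.
        while( e is MemberExpression me )
        {
            if( !(me.Member is PropertyInfo prop) )
                throw new ArgumentException( $"'{me.Member.Name}' is not a property.", nameof( expr ) );

            key.Append( '<' ).Append( prop.Name );
            e = me.Expression;
        }

        // The chain must end at the lambda parameter; anything else is rejected.
        if( !(e is ParameterExpression p) )
            throw new ArgumentException( "Only expressions of the form x => x.A.B.C are supported.", nameof( expr ) );

        return key.Append( ':' ).Append( p.Type.Name ).ToString();
    }
}
```

Calling `ExprKey.For<Bar, int>( b => b.F.Val )` produces `<Val<F:Bar`, the same key the original function builds, while something like `b => b.F.Val + 1` is rejected with an `ArgumentException` rather than an invalid cast.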

While benchmarking various keygen functions, I was mystified to discover that they all performed horribly.
Even the dummy version that just returned "".

After some fiddling, I discovered that it was actually the instantiation of the Expression object that was super expensive.
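The likely explanation is that an expression lambda is not a constant: each time the call site runs, compiler-generated code rebuilds the tree through the `Expression` factory methods. The sketch below is a rough hand-written equivalent of that per-call work. It is an illustration only: the compiler resolves the property accessors from method handles rather than by name, `ExprDemo` is a made-up name, and `Foo`/`Bar` mirror the benchmark classes.

```csharp
using System;
using System.Linq.Expressions;

sealed class Foo { public int Val { get; set; } }
sealed class Bar { public Foo F { get; set; } }

static class ExprDemo
{
    // Roughly what runs on every evaluation of `_ => _.F.Val`
    // when it is converted to Expression<Func<Bar, int>>.
    public static Expression<Func<Bar, int>> BuildByHand()
    {
        ParameterExpression p = Expression.Parameter( typeof( Bar ), "_" );
        MemberExpression f = Expression.Property( p, nameof( Bar.F ) );
        MemberExpression val = Expression.Property( f, nameof( Foo.Val ) );
        return Expression.Lambda<Func<Bar, int>>( val, p );
    }

    // An inline expression lambda: every call allocates a new tree.
    public static Expression<Func<Bar, int>> Inline() => b => b.F.Val;

    // Returns true because the two evaluations produce distinct objects.
    public static bool InlineAllocatesEachTime() => !ReferenceEquals( Inline(), Inline() );
}
```

Several allocations plus reflection lookups per iteration would account for the inline case being orders of magnitude slower than passing an already constructed instance.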

Here is the output of the new benchmark I created to measure this effect:

Dummy( _ => _.F.Val ) 4106,5036 ms, 0,0041065036 ms/iter
Dummy( cachedExpr ) 0,3599 ms, 3,599E-07 ms/iter
Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) 2,3127 ms, 2,3127E-06 ms/iter

And here is the code for the benchmark:

using System;
using System.Diagnostics;
using System.Linq.Expressions;

namespace ExprBench
{
    sealed class Foo
    {
        public int Val { get; set; }
    }

    sealed class Bar
    {
        public Foo F { get; set; }
    }


    public static class ExprBench
    {
        static string Dummy( Expression<Func<Bar, int>> expr )
        {
            return "";
        }

        static Expression<Func<Bar, int>> Bar_Foo_Val;

        static public void Run()
        {
            var sw = Stopwatch.StartNew();
            TimeSpan elapsed;

            int iterationCount = 1000000;

            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( _ => _.F.Val );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( _ => _.F.Val ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

            Expression<Func<Bar, int>> cachedExpr = _ => _.F.Val;
            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( cachedExpr );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( cachedExpr ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
        }
    }
}

This clearly demonstrates that a speedup of roughly 2,000 to 10,000 times can be achieved with some simple caching.

The problem is that these workarounds, to varying extents, compromise the beauty and safety of using expressions in this manner.

The second workaround at least keeps the expression inline, but it's far from pretty.

So the question is: are there any other, less ugly workarounds that I might have missed?

Thanks in advance

Solution

After thinking about the static caching of properties for a while, I came up with this:

In this particular case, all the property expressions I was interested in were on simple POCO DB entities. So I decided to make these classes partial and add the static cached expressions in a companion partial class file.
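Written by hand, such a pairing might look roughly like the sketch below. `Account` and `UsernameExpr` are hypothetical names used only for illustration. The expression is created once by the static initializer, so every call site that passes `Account.UsernameExpr` hands over the same instance.

```csharp
using System;
using System.Linq.Expressions;

// Account.cs: the hand-maintained POCO entity (hypothetical example).
public partial class Account
{
    public string Username { get; set; }
}

// AccountExprs.cs: the companion part holding the cached expressions.
public partial class Account
{
    // Built once, when the type is initialized, then reused by every caller.
    public static readonly Expression<Func<Account, object>> UsernameExpr = a => a.Username;
}
```

A method that accepts `Expression<Func<Account, object>>` would then receive `Account.UsernameExpr` instead of an inline lambda. The lambda is still compiled against the property, so a rename of `Username` is still caught at build time.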

Having seen that this worked, I decided to try to automate it. I looked at T4, but it didn't seem to fit this purpose. Instead I tried out https://github.com/daveaglick/Scripty, which is pretty awesome.

Here is the script I use to generate my caching classes:

using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Scripty.Core;
using System.Linq;
using System.Threading.Tasks;

bool IsInternalOrPublicSetter( AccessorDeclarationSyntax a )
{
    return a.Kind() == SyntaxKind.SetAccessorDeclaration &&
        a.Modifiers.Any( m => m.Kind() == SyntaxKind.PublicKeyword || m.Kind() == SyntaxKind.InternalKeyword );
}


foreach( var document in Context.Project.Analysis.Documents )
{
    // Get all partial classes that inherit from IIsUpdatable
    var allClasses = (await document.GetSyntaxRootAsync())
                    .DescendantNodes().OfType<ClassDeclarationSyntax>()
                    .Where( cls => cls.BaseList?.ChildNodes()?.SelectMany( _ => _.ChildNodes()?.OfType<IdentifierNameSyntax>() ).Select( id => id.Identifier.Text ).Contains( "IIsUpdatable" ) ?? false)
                    .Where( cls => cls.Modifiers.Any( m => m.ValueText == "partial" ))
                    .ToList();


    foreach( var cls in allClasses )
    {
        var curFile = $"{cls.Identifier}Exprs.cs";
        Output[curFile].WriteLine( $@"using System;
using System.Linq.Expressions;

namespace SomeNS
{{
    public partial class {cls.Identifier}
    {{" );
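        // Get all properties whose setter is explicitly marked public or internal.
        // AccessorList is null for expression-bodied properties, hence the null check.
        var props = cls.Members.OfType<PropertyDeclarationSyntax>().Where( prop => prop.AccessorList?.Accessors.Any( IsInternalOrPublicSetter ) ?? false );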
        foreach( var prop in props )
        {
            Output[curFile].WriteLine( $"        public static Expression<Func<{cls.Identifier},object>> {prop.Identifier}Expr = _ => _.{prop.Identifier};" );
        }

        Output[curFile].WriteLine( @"    }
}" );
    }

}

An input class could look like this:

public partial class SomeClass : IIsUpdatable
{
    public string Foo { get; internal set; }
}

The script then generates a file named SomeClassExprs.cs, with the following content:

using System;
using System.Linq.Expressions;

namespace SomeNS
{
    public partial class SomeClass
    {
        public static Expression<Func<SomeClass,object>> FooExpr = _ => _.Foo;
    }
}
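One thing to watch for with `Func<T, object>`: when the property is a value type, the compiler wraps the member access in a `Convert` node to box it, so code that inspects `Body` and expects a plain member access needs to unwrap that node first. A minimal sketch; `Entity`, `ExprInfo` and `PropertyName` are illustrative names, not part of the generated code:

```csharp
using System;
using System.Linq.Expressions;

public class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ExprInfo
{
    // Returns the accessed member's name, looking through a boxing Convert node.
    public static string PropertyName<T>( Expression<Func<T, object>> expr )
    {
        Expression body = expr.Body;

        // Value-type properties arrive as Convert(e.Id) rather than e.Id.
        if( body is UnaryExpression u && u.NodeType == ExpressionType.Convert )
            body = u.Operand;

        if( body is MemberExpression me )
            return me.Member.Name;

        throw new ArgumentException( "Expected a simple property access.", nameof( expr ) );
    }
}
```

With this, `ExprInfo.PropertyName<Entity>( e => e.Id )` and `ExprInfo.PropertyName<Entity>( e => e.Name )` both resolve to the property name, whether or not a conversion node is present.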

The files are generated in a folder called codegen, which I exclude from source control.

Scripty makes sure to include the files during compilation.

All in all I'm very pleased with this approach.

:)
