Equivalent of Class Loaders in .NET


Question

Does anyone know if it is possible to define the equivalent of a "Java custom class loader" in .NET?

To give a little background:

I am in the process of developing a new programming language that targets the CLR, called "Liberty". One of the features of the language is its ability to define "type constructors", which are methods that are executed by the compiler at compile time and generate types as output. They are sort of a generalization of generics (the language does have normal generics in it), and allow code like this to be written (in "Liberty" syntax):

var t as tuple<i as int, j as int, k as int>;
t.i = 2;
t.j = 4;
t.k = 5;

Where "tuple" is defined like so:

public type tuple(params variables as VariableDeclaration[]) as TypeDeclaration
{
   //...
}

In this particular example, the type constructor tuple provides something similar to anonymous types in VB and C#.

However, unlike anonymous types, "tuples" have names and can be used inside public method signatures.

This means that I need a way for the type that eventually ends up being emitted by the compiler to be shareable across multiple assemblies. For example, I want tuple<x as int> defined in Assembly A to end up being the same type as tuple<x as int> defined in Assembly B.

The problem with this, of course, is that Assembly A and Assembly B are going to be compiled at different times, which means they would both end up emitting their own incompatible versions of the tuple type.

I looked into using some sort of "type erasure" to do this, so that I would have a shared library with a bunch of types like this (this is "Liberty" syntax):

class tuple<T>
{
    public Field1 as T;
}

class tuple<T, R>
{
    public Field1 as T;
    public Field2 as R;
}

and then just redirect access from the i, j, and k tuple fields to Field1, Field2, and Field3.

However, that is not really a viable option. This would mean that at compile time tuple<x as int> and tuple<y as int> would end up being different types, while at runtime they would be treated as the same type. That would cause many problems for things like equality and type identity. That is too leaky an abstraction for my tastes.
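To make the identity problem concrete, here is a minimal C# sketch. The names `ErasedTuple1` and `ErasureDemo` are hypothetical and only stand in for what an erasing compiler might emit; they are not part of Liberty.

```csharp
using System;

// Hypothetical erased representation: the field name from the source
// (x, y, ...) is lost, only the arity and type arguments survive.
public class ErasedTuple1<T>
{
    public T Field1;
}

public static class ErasureDemo
{
    // What an erasing compiler would emit for tuple<x as int>.
    public static Type LoweredTupleX() => typeof(ErasedTuple1<int>);

    // What it would emit for tuple<y as int>.
    public static Type LoweredTupleY() => typeof(ErasedTuple1<int>);

    // The two distinct compile-time types collapse into one runtime type.
    public static bool SameRuntimeType() => LoweredTupleX() == LoweredTupleY();
}
```

Any runtime check based on type identity (casts, `is` tests, reflection) would therefore be unable to distinguish the two tuples, even though the compiler treats them as different types.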

Other possible options would be to use "state bag objects". However, using a state bag would defeat the whole purpose of having support for "type constructors" in the language. The idea there is to enable "custom language extensions" to generate new types at compile time that the compiler can do static type checking with.

In Java, this could be done using custom class loaders. Basically the code that uses tuple types could be emitted without actually defining the type on disk. A custom "class loader" could then be defined that would dynamically generate the tuple type at runtime. That would allow static type checking inside the compiler, and would unify the tuple types across compilation boundaries.

Unfortunately, however, the CLR does not provide support for custom class loading. All loading in the CLR is done at the assembly level. It would be possible to define a separate assembly for each "constructed type", but that would very quickly lead to performance problems (having many assemblies with only one type in them would use too many resources).

So, what I want to know is:

Is it possible to simulate something like Java class loaders in .NET, where I can emit a reference to a non-existent type and then dynamically generate that type at runtime, before the code that needs to use it runs?

NOTE:

*I actually already know the answer to the question, which I provide as an answer below. However, it took me about 3 days of research, and quite a bit of IL hacking, in order to come up with a solution. I figured it would be a good idea to document it here in case anyone else ran into the same problem.*

Solution

The answer is yes, but the solution is a little tricky.

The System.Reflection.Emit namespace defines types that allow assemblies to be generated dynamically. They also allow the generated assemblies to be defined incrementally. In other words, it is possible to add types to the dynamic assembly, execute the generated code, and then later add more types to the assembly.
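As a rough illustration of incremental definition, the sketch below bakes one type, uses it, and then adds a second type to the same dynamic module. The names `IncrementalEmitDemo`, `DemoDynamic`, `First` and `Second` are assumptions made for this example. It uses the static `AssemblyBuilder.DefineDynamicAssembly` overload rather than the `AppDomain` method used later in this answer.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class IncrementalEmitDemo
{
    public static Type[] Run()
    {
        // "DemoDynamic" is an arbitrary name chosen for this sketch.
        var name = new AssemblyName("DemoDynamic");
        AssemblyBuilder assembly =
            AssemblyBuilder.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("DemoDynamic");

        // Define and bake a first type, then execute code that uses it.
        TypeBuilder firstBuilder = module.DefineType("First", TypeAttributes.Public);
        firstBuilder.DefineDefaultConstructor(MethodAttributes.Public);
        Type first = firstBuilder.CreateType();
        object instance = Activator.CreateInstance(first);

        // Later, add a second type to the same module.
        TypeBuilder secondBuilder = module.DefineType("Second", TypeAttributes.Public);
        secondBuilder.DefineDefaultConstructor(MethodAttributes.Public);
        Type second = secondBuilder.CreateType();

        return new[] { first, second };
    }
}
```

Both returned types live in the same dynamic assembly, which is the property the rest of the solution relies on.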

The System.AppDomain class also defines an AssemblyResolve event that fires whenever the framework fails to load an assembly. By adding a handler for that event, it is possible to define a single "runtime" assembly into which all "constructed" types are placed. The code generated by the compiler that uses a constructed type would refer to a type in the runtime assembly. Because the runtime assembly doesn't actually exist on disk, the AssemblyResolve event would be fired the first time the compiled code tried to access a constructed type. The handler for the event would then generate the dynamic assembly and return it to the CLR.

Unfortunately, there are a few tricky points to getting this to work. The first problem is ensuring that the event handler will always be installed before the compiled code is run. With a console application this is easy. The code to hook up the event handler can just be added to the Main method before the other code runs. For class libraries, however, there is no Main method. A DLL may be loaded as part of an application written in another language, so it's not really possible to assume there is always a Main method available to hook up the event handler code.

The second problem is ensuring that the referenced types all get inserted into the dynamic assembly before any code that references them is used. The System.AppDomain class also defines a TypeResolve event that is raised whenever the CLR is unable to resolve a type in a dynamic assembly. It gives the event handler the opportunity to define the type inside the dynamic assembly before the code that uses it runs. However, that event will not work in this case. The CLR will not fire the event for assemblies that are "statically referenced" by other assemblies, even if the referenced assembly is defined dynamically. This means that we need a way to run code before any other code in the compiled assembly runs and have it dynamically inject the types it needs into the runtime assembly if they have not already been defined. Otherwise, when the CLR tries to load those types, it will notice that the dynamic assembly does not contain the types it needs and will throw a type load exception.
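For contrast, here is a sketch of a case where TypeResolve can help: a reflective lookup by unqualified name through `Type.GetType`. This assumes the runtime raises the event for such a lookup, which is the trigger described in the `AppDomain.TypeResolve` documentation. The names `TypeResolveDemo`, `LazyTypes` and `LazyTuple` are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class TypeResolveDemo
{
    public static Type Run()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("LazyTypes"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("LazyTypes");
        bool defined = false;

        // Define the requested type on demand and tell the runtime
        // which assembly now contains it.
        ResolveEventHandler handler = (sender, args) =>
        {
            if (args.Name != "LazyTuple") return null;
            if (!defined)
            {
                module.DefineType("LazyTuple", TypeAttributes.Public).CreateType();
                defined = true;
            }
            return assembly;
        };

        AppDomain.CurrentDomain.TypeResolve += handler;
        try
        {
            // The name is not assembly-qualified, so the runtime cannot
            // locate the type by itself and falls back to the event.
            return Type.GetType("LazyTuple");
        }
        finally
        {
            AppDomain.CurrentDomain.TypeResolve -= handler;
        }
    }
}
```

This only covers lookups made through reflection. It does not change the limitation described above for type references that are compiled into another assembly's metadata.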

Fortunately, the CLR offers a solution to both problems: module initializers. A module initializer is the equivalent of a "static class constructor", except that it initializes an entire module, not just a single class. Basically, the CLR will:

  1. Run the module constructor before any types inside the module are accessed.
  2. Guarantee that only those types directly accessed by the module constructor will be loaded while it is executing.
  3. Not allow code outside the module to access any of its members until after the constructor has finished.

It does this for all assemblies, including both class libraries and executables, and for EXEs will run the module constructor before executing the Main method.

See this blog post for more information about constructors.
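For readers on newer toolchains: C# 9 and later can request a module initializer from source through the `System.Runtime.CompilerServices.ModuleInitializer` attribute (available in .NET 5 and later). This postdates the approach described here, so the following is only a sketch of how the same hook looks in C#. `ModuleInitDemo` and its members are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ModuleInitDemo
{
    // Set by the initializer so callers can observe that it has run.
    public static bool Initialized { get; private set; }

    // The compiler emits a call to this method from the module's
    // initializer. The method must be static, parameterless, return void,
    // and be accessible from the containing module.
    [ModuleInitializer]
    internal static void Initialize()
    {
        // A compiler targeting the CLR could hook up AssemblyResolve or
        // define its constructed types at this point.
        Initialized = true;
    }
}
```

By the time any other code in the module reads `ModuleInitDemo.Initialized`, the initializer has already completed.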

In any case, a complete solution to my problem requires several pieces:

  1. The following class definition, defined inside a "language runtime dll", that is referenced by all assemblies produced by the compiler (this is C# code).

    using System;
    using System.Collections.Generic;
    using System.Reflection;
    using System.Reflection.Emit;
    
    namespace SharedLib
    {
        public class Loader
        {
            private Loader(ModuleBuilder dynamicModule)
            {
                m_dynamicModule = dynamicModule;
                m_definedTypes = new HashSet<string>();
            }
    
            private static readonly Loader m_instance;
            private readonly ModuleBuilder m_dynamicModule;
            private readonly HashSet<string> m_definedTypes;
    
            static Loader()
            {
                var name = new AssemblyName("$Runtime");
                var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
                var module = assemblyBuilder.DefineDynamicModule("$Runtime");
                m_instance = new Loader(module);
                AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(CurrentDomain_AssemblyResolve);
            }
    
            static Assembly CurrentDomain_AssemblyResolve(object sender, ResolveEventArgs args)
            {
                if (args.Name == Instance.m_dynamicModule.Assembly.FullName)
                {
                    return Instance.m_dynamicModule.Assembly;
                }
                else
                {
                    return null;
                }
            }
    
            public static Loader Instance
            {
                get
                {
                    return m_instance;
                }
            }
    
            public bool IsDefined(string name)
            {
                return m_definedTypes.Contains(name);
            }
    
            public TypeBuilder DefineType(string name)
            {
                // In a real system we would not expose the type builder.
                // Instead, an AST for the type would be passed in, and we would just create it.
                var type = m_dynamicModule.DefineType(name, TypeAttributes.Public);
                m_definedTypes.Add(name);
                return type;
            }
        }
    }
    

    The class defines a singleton that holds a reference to the dynamic assembly that the constructed types will be created in. It also holds a "hash set" that stores the set of types that have already been dynamically generated, and finally defines a member that can be used to define the type. This example just returns a System.Reflection.Emit.TypeBuilder instance that can then be used to define the class being generated. In a real system, the method would probably take in an AST representation of the class, and just do the generation itself.

  2. Compiled assemblies that emit the following two references (shown in ILASM syntax):

    .assembly extern $Runtime
    {
        .ver 0:0:0:0
    }
    .assembly extern SharedLib
    {
        .ver 1:0:0:0
    }
    

    Here "SharedLib" is the language's predefined runtime library that includes the "Loader" class defined above, and "$Runtime" is the dynamic runtime assembly that the constructed types will be inserted into.

  3. A "module constructor" inside every assembly compiled in the language.

    As far as I know, there are no .NET languages that allow module constructors to be defined in source. The C++/CLI compiler is the only compiler I know of that generates them. In IL, they look like this, defined directly in the module and not inside any type definitions:

    .method privatescope specialname rtspecialname static 
            void  .cctor() cil managed
    {
        //generate any constructed types dynamically here...
    }
    

    For me, it's not a problem that I have to write custom IL to get this to work. I'm writing a compiler, so code generation is not an issue.

    In the case of an assembly that used the types tuple<i as int, j as int> and tuple<x as double, y as double, z as double> the module constructor would need to generate types like the following (here in C# syntax):

    class Tuple_i_j<T, R>
    {
        public T i;
        public R j;
    }
    
    class Tuple_x_y_z<T, R, S>
    {
        public T x;
        public R y;
        public S z;
    }
    

    The tuple classes are generated as generic types to get around accessibility issues. That would allow code in the compiled assembly to use tuple<x as Foo>, where Foo was some non-public type.

    The body of the module constructor that did this (here only showing one type, and written in C# syntax) would look like this:

    var loader = SharedLib.Loader.Instance;
    lock (loader)
    {
        if (! loader.IsDefined("$Tuple_i_j"))
        {
            //create the type.
            var Tuple_i_j = loader.DefineType("$Tuple_i_j");
            //define the generic parameters <T,R>
            var genericParams = Tuple_i_j.DefineGenericParameters("T", "R");
            var T = genericParams[0];
            var R = genericParams[1];
            //define the field i
            var fieldX = Tuple_i_j.DefineField("i", T, FieldAttributes.Public);
            //define the field j
            var fieldY = Tuple_i_j.DefineField("j", R, FieldAttributes.Public);
            //create the default constructor.
            var constructor = Tuple_i_j.DefineDefaultConstructor(MethodAttributes.Public);
    
            //"close" the type so that it can be used by executing code.
            Tuple_i_j.CreateType();
        }
    }
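
To show what the emitted type looks like from the consuming side, here is a self-contained sketch that repeats the definition steps above against a throwaway dynamic assembly and then instantiates the type through reflection. `TupleUsageDemo` and the assembly name `TupleDemo` are assumptions for this example. Real compiled code would reference the type directly in IL instead of going through reflection.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class TupleUsageDemo
{
    public static object Run()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("TupleDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("TupleDemo");

        // Same definition steps as the module constructor body above.
        TypeBuilder builder = module.DefineType("$Tuple_i_j", TypeAttributes.Public);
        GenericTypeParameterBuilder[] genericParams =
            builder.DefineGenericParameters("T", "R");
        builder.DefineField("i", genericParams[0], FieldAttributes.Public);
        builder.DefineField("j", genericParams[1], FieldAttributes.Public);
        builder.DefineDefaultConstructor(MethodAttributes.Public);
        Type openType = builder.CreateType();

        // Close the generic type over <int, int> and create an instance.
        Type closedType = openType.MakeGenericType(typeof(int), typeof(int));
        object tuple = Activator.CreateInstance(closedType);

        // Equivalent of "t.i = 2; t.j = 4;" in the Liberty example.
        closedType.GetField("i").SetValue(tuple, 2);
        closedType.GetField("j").SetValue(tuple, 4);
        return tuple;
    }
}
```

Because the field types are generic parameters, the same emitted definition can be closed over a non-public type owned by the consuming assembly, which is the accessibility point made earlier.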
    

So in any case, this was the mechanism I was able to come up with to enable the rough equivalent of custom class loaders in the CLR.

Does anyone know of an easier way to do this?
