Equivalent of Class Loaders in .NET

Question

Does anyone know if it is possible to define the equivalent of a "java custom class loader" in .NET?

To give a little background:

I am in the process of developing a new programming language that targets the CLR, called "Liberty". One of the features of the language is its ability to define "type constructors", which are methods that are executed by the compiler at compile time and generate types as output. They are sort of a generalization of generics (the language does have normal generics in it), and allow code like this to be written (in "Liberty" syntax):

var t as tuple<i as int, j as int, k as int>;
t.i = 2;
t.j = 4;
t.k = 5;

where "tuple" is defined like so:

public type tuple(params variables as VariableDeclaration[]) as TypeDeclaration
{
   //...
}

In this particular example, the type constructor tuple provides something similar to anonymous types in VB and C#.

However, unlike anonymous types, "tuples" have names and can be used inside public method signatures.

This means that I need a way for the type that eventually ends up being emitted by the compiler to be shareable across multiple assemblies. For example, I want

tuple<x as int> defined in Assembly A to end up being the same type as tuple<x as int> defined in Assembly B.

The problem with this, of course, is that Assembly A and Assembly B are going to be compiled at different times, which means they would both end up emitting their own incompatible versions of the tuple type.

I looked into using some sort of "type erasure" to do this, so that I would have a shared library with a bunch of types like this (this is "Liberty" syntax):

class tuple<T>
{
    public Field1 as T;
}

class tuple<T, R>
{
    public Field1 as T;
    public Field2 as R;
}

and then just redirect access from the i, j, and k tuple fields to Field1, Field2, and Field3.

However, that is not really a viable option. It would mean that at compile time tuple<x as int> and tuple<y as int> would be different types, while at runtime they would be treated as the same type. That would cause many problems for things like equality and type identity. It is too leaky an abstraction for my tastes.
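To make the identity problem concrete, here is a minimal C# sketch. The names `ErasedTuple` and `ErasureDemo` are hypothetical and only stand in for what an erasing compiler might emit; they are not part of Liberty or of any library.

```csharp
using System;

// Hypothetical erased representation shared by every one-field tuple.
public class ErasedTuple<T>
{
    public T Field1;
}

public static class ErasureDemo
{
    // What an erasing compiler might emit for `tuple<x as int>` ...
    public static Type RuntimeTypeOfTupleX() => typeof(ErasedTuple<int>);

    // ... and for `tuple<y as int>`: the field name is gone, so the
    // runtime representation is exactly the same type.
    public static Type RuntimeTypeOfTupleY() => typeof(ErasedTuple<int>);

    // True: the two source-level types are indistinguishable at runtime.
    public static bool SameAtRuntime() =>
        RuntimeTypeOfTupleX() == RuntimeTypeOfTupleY();
}
```

Any runtime check based on type identity (casts, `is` tests, reflection, equality) would therefore treat the two tuples as interchangeable, even though the compiler considers them distinct.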

Other possible options would be to use "state bag objects". However, using a state bag would defeat the whole purpose of having support for "type constructors" in the language. The idea there is to enable "custom language extensions" to generate new types at compile time that the compiler can do static type checking with.

In Java, this could be done using custom class loaders. Basically the code that uses tuple types could be emitted without actually defining the type on disk. A custom "class loader" could then be defined that would dynamically generate the tuple type at runtime. That would allow static type checking inside the compiler, and would unify the tuple types across compilation boundaries.

Unfortunately, however, the CLR does not provide support for custom class loading. All loading in the CLR is done at the assembly level. It would be possible to define a separate assembly for each "constructed type", but that would very quickly lead to performance problems (having many assemblies with only one type in them would use too many resources).

So, what I would like to know is:

Is it possible to simulate something like Java class loaders in .NET, where I can emit a reference to a non-existent type and then dynamically generate that type at runtime, before the code that needs to use it runs?

NOTE:

*I actually already know the answer to the question, which I provide as an answer below. However, it took me about 3 days of research, and quite a bit of IL hacking, to come up with a solution. I figured it would be a good idea to document it here in case anyone else ran into the same problem.*

Recommended Answer

The answer is yes, but the solution is a little tricky.

The System.Reflection.Emit namespace defines types that allow assemblies to be generated dynamically. They also allow the generated assemblies to be defined incrementally. In other words, it is possible to add types to the dynamic assembly, execute the generated code, and then later add more types to the assembly.

The System.AppDomain class also defines an AssemblyResolve event that fires whenever the framework fails to load an assembly. By adding a handler for that event, it is possible to define a single "runtime" assembly into which all "constructed" types are placed. The code generated by the compiler that uses a constructed type would refer to a type in the runtime assembly. Because the runtime assembly doesn't actually exist on disk, the AssemblyResolve event would be fired the first time the compiled code tried to access a constructed type. The handler for the event would then generate the dynamic assembly and return it to the CLR.

Unfortunately, there are a few tricky points to getting this to work. The first problem is ensuring that the event handler will always be installed before the compiled code is run. With a console application this is easy. The code to hook up the event handler can just be added to the Main method before the other code runs. For class libraries, however, there is no Main method. A dll may be loaded as part of an application written in another language, so it's not really possible to assume there is always a Main method available to hook up the event handler code.

The second problem is ensuring that the referenced types all get inserted into the dynamic assembly before any code that references them is used. The System.AppDomain class also defines a TypeResolve event that is fired whenever the CLR is unable to resolve a type in a dynamic assembly. It gives the event handler the opportunity to define the type inside the dynamic assembly before the code that uses it runs. However, that event will not work in this case. The CLR will not fire the event for assemblies that are "statically referenced" by other assemblies, even if the referenced assembly is defined dynamically. This means that we need a way to run code before any other code in the compiled assembly runs, and have it dynamically inject the types it needs into the runtime assembly if they have not already been defined. Otherwise, when the CLR tries to load those types, it will notice that the dynamic assembly does not contain the types it needs and will throw a type load exception.

Fortunately, the CLR offers a solution to both problems: module initializers. A module initializer is the equivalent of a "static class constructor", except that it initializes an entire module, not just a single class. Basically, the CLR will:

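  1. Run the module constructor before any types inside the module are accessed.
  2. Guarantee that only those types directly accessed by the module constructor will be loaded while it is executing.
  3. Not allow code outside the module to access any of its members until after the constructor has finished.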

It does this for all assemblies, including both class libraries and executables, and for EXEs will run the module constructor before executing the Main method.

See this blog post for more information about module constructors.

In any case, a complete solution to my problem requires several pieces:

  1. The following class definition, defined inside a "language runtime dll", that is referenced by all assemblies produced by the compiler (this is C# code).

using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

namespace SharedLib
{
    public class Loader
    {
        private Loader(ModuleBuilder dynamicModule)
        {
            m_dynamicModule = dynamicModule;
            m_definedTypes = new HashSet<string>();
        }

        private static readonly Loader m_instance;
        private readonly ModuleBuilder m_dynamicModule;
        private readonly HashSet<string> m_definedTypes;

        static Loader()
        {
            var name = new AssemblyName("$Runtime");
            var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
            var module = assemblyBuilder.DefineDynamicModule("$Runtime");
            m_instance = new Loader(module);
            AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(CurrentDomain_AssemblyResolve);
        }

        static Assembly CurrentDomain_AssemblyResolve(object sender, ResolveEventArgs args)
        {
            if (args.Name == Instance.m_dynamicModule.Assembly.FullName)
            {
                return Instance.m_dynamicModule.Assembly;
            }
            else
            {
                return null;
            }
        }

        public static Loader Instance
        {
            get
            {
                return m_instance;
            }
        }

        public bool IsDefined(string name)
        {
            return m_definedTypes.Contains(name);
        }

        public TypeBuilder DefineType(string name)
        {
            //in a real system we would not expose the type builder.
            //instead a AST for the type would be passed in, and we would just create it.
            var type = m_dynamicModule.DefineType(name, TypeAttributes.Public);
            m_definedTypes.Add(name);
            return type;
        }
    }
}

The class defines a singleton that holds a reference to the dynamic assembly that the constructed types will be created in. It also holds a "hash set" that stores the set of types that have already been dynamically generated, and finally defines a member that can be used to define the type. This example just returns a System.Reflection.Emit.TypeBuilder instance that can then be used to define the class being generated. In a real system, the method would probably take in an AST representation of the class and do the generation itself.

  2. Compiled assemblies that emit the following two references (shown in ILASM syntax):

.assembly extern $Runtime
{
    .ver 0:0:0:0
}
.assembly extern SharedLib
{
    .ver 1:0:0:0
}

Here "SharedLib" is the language's predefined runtime library that includes the "Loader" class defined above, and "$Runtime" is the dynamic runtime assembly that the constructed types will be inserted into.

  3. A "module constructor" inside every assembly compiled in the language.

As far as I know, there are no .NET languages that allow module constructors to be defined in source. The C++/CLI compiler is the only compiler I know of that generates them. In IL, they look like this, defined directly in the module and not inside any type definition:

.method privatescope specialname rtspecialname static 
        void  .cctor() cil managed
{
    //generate any constructed types dynamically here...
}

For me, it's not a problem that I have to write custom IL to get this to work. I'm writing a compiler, so code generation is not an issue.
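For readers who are not writing their own compiler: C# 9 and later can emit a module initializer from source through the `System.Runtime.CompilerServices.ModuleInitializerAttribute`. The sketch below is only an illustration of that attribute; the `RuntimeBootstrap` class and its `Initialized` flag are hypothetical names, and the body is a placeholder for the type-registration logic described in this answer.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RuntimeBootstrap
{
    // Hypothetical flag, used here only to observe that the initializer ran.
    public static bool Initialized { get; private set; }

    // Requires C# 9 or later and a runtime that ships the attribute (.NET 5+).
    // The method must be static, parameterless, non-generic, return void,
    // and be accessible from the containing module (internal or public).
    [ModuleInitializer]
    internal static void Initialize()
    {
        // A compiler targeting this scheme would define its constructed
        // types in the shared dynamic assembly here.
        Initialized = true;
    }
}
```

The C# compiler places a call to this method in the module's `.cctor`, so it runs before any other code in the assembly touches its types.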

In the case of an assembly that used the types tuple<i as int, j as int> and tuple<x as double, y as double, z as double> the module constructor would need to generate types like the following (here in C# syntax):

class Tuple_i_j<T, R>
{
    public T i;
    public R j;
}

class Tuple_x_y_z<T, R, S>
{
    public T x;
    public R y;
    public S z;
}

The tuple classes are generated as generic types to get around accessibility issues. That would allow code in the compiled assembly to use tuple<x as Foo>, where Foo was some non-public type.

The body of the module constructor that did this (here only showing one type, and written in C# syntax) would look like this:

var loader = SharedLib.Loader.Instance;
lock (loader)
{
    if (! loader.IsDefined("$Tuple_i_j"))
    {
        //create the type.
        var Tuple_i_j = loader.DefineType("$Tuple_i_j");
        //define the generic parameters <T,R>
        var genericParams = Tuple_i_j.DefineGenericParameters("T", "R");
        var T = genericParams[0];
        var R = genericParams[1];
        //define the field i
        var fieldI = Tuple_i_j.DefineField("i", T, FieldAttributes.Public);
        //define the field j
        var fieldJ = Tuple_i_j.DefineField("j", R, FieldAttributes.Public);
        //create the default constructor.
        var constructor = Tuple_i_j.DefineDefaultConstructor(MethodAttributes.Public);

        //"close" the type so that it can be used by executing code.
        Tuple_i_j.CreateType();
    }
}
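As a rough, self-contained illustration of the same Reflection.Emit calls, the sketch below builds the generic tuple type and then uses it through reflection. It is an assumption-laden stand-in rather than the actual module constructor: `TupleDemo`, `BuildTupleIJ`, `MakeIntPair` and the assembly name "DemoRuntime" are hypothetical, and it uses the static `AssemblyBuilder.DefineDynamicAssembly` instead of `AppDomain.CurrentDomain.DefineDynamicAssembly` so that it also compiles on newer runtimes.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class TupleDemo
{
    // Builds an open generic type equivalent to:
    //   class Tuple_i_j<T, R> { public T i; public R j; }
    public static Type BuildTupleIJ()
    {
        var name = new AssemblyName("DemoRuntime"); // hypothetical name
        var assembly = AssemblyBuilder.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("DemoRuntime");

        var builder = module.DefineType("Tuple_i_j", TypeAttributes.Public);
        GenericTypeParameterBuilder[] gp = builder.DefineGenericParameters("T", "R");
        builder.DefineField("i", gp[0], FieldAttributes.Public);
        builder.DefineField("j", gp[1], FieldAttributes.Public);
        builder.DefineDefaultConstructor(MethodAttributes.Public);

        // "Close" the builder so the type becomes usable by executing code.
        return builder.CreateType();
    }

    // Instantiates Tuple_i_j<int, int> and fills in both fields.
    public static object MakeIntPair(Type openTuple, int i, int j)
    {
        Type closed = openTuple.MakeGenericType(typeof(int), typeof(int));
        object instance = Activator.CreateInstance(closed);
        closed.GetField("i").SetValue(instance, i);
        closed.GetField("j").SetValue(instance, j);
        return instance;
    }
}
```

Compiled code would of course reference the fields directly in IL rather than through reflection; the reflection calls here only demonstrate that the emitted type behaves like an ordinary generic class.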

So in any case, this was the mechanism I was able to come up with to enable the rough equivalent of custom class loaders in the CLR.

Does anyone know of an easier way to do this?
