Object fingerprinting: serialization + untouchable legacy code + Getter-only auto-properties = cornered?


Question

I have found myself cornered, so here we go.

I need to produce a fingerprint hash code for object diffing. Comparing the hashes of two sets of objects must tell me whether the sets contain identical objects, that is, objects with the same hash.

The fingerprint hash must be platform-independent, so I went for MD5 hashing.

I am working with a large object-model code base that is out of my control. I cannot modify any of the types that will be passed in for fingerprinting: I cannot add attributes or constructors, or change anything else. The types may also change in the future. So any approach must be programmatic; I cannot just create a surrogate class to avoid the problem, at least not manually.

However, performance is not a concern, so reflection has a complete green light.

In addition, I need to be able to exclude properties from the hashing. If I exclude a certain property, two objects that are identical in every property except that one must still get the same hash.
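
To make the exclusion requirement concrete, here is a minimal sketch. It is not the asker's protobuf-net code: it assumes a flat object whose properties are simple values, and the names `Sample`, `ExclusionSketch` and `Canonicalize` are hypothetical. It reads public properties through reflection, orders them by name, skips the excluded ones, and builds a canonical string that could then be turned into bytes and hashed.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Text;

// Hypothetical type with getter-only properties, used only for illustration.
public class Sample
{
    public Sample(int id, double value) { Id = id; Value = value; }
    public int Id { get; }
    public double Value { get; }
}

public static class ExclusionSketch
{
    // Builds a canonical "Name=Value;" string from the public readable
    // properties, ordered by name, skipping every excluded property name.
    // Assumption: property values are simple (no nested objects, no cycles).
    public static string Canonicalize(object obj, ISet<string> excludedNames)
    {
        var sb = new StringBuilder();
        IEnumerable<PropertyInfo> props = obj.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .Where(p => !excludedNames.Contains(p.Name))
            .OrderBy(p => p.Name, StringComparer.Ordinal);

        foreach (PropertyInfo p in props)
        {
            object value = p.GetValue(obj);
            // Invariant culture avoids locale-dependent number formatting.
            string text = Convert.ToString(value, CultureInfo.InvariantCulture);
            sb.Append(p.Name).Append('=').Append(text).Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var a = new Sample(1, 2.0);
        var b = new Sample(2, 2.0);
        var excludeId = new HashSet<string> { "Id" };
        var excludeNothing = new HashSet<string>();

        // Same canonical form once Id is excluded, different otherwise.
        Console.WriteLine(Canonicalize(a, excludeId) == Canonicalize(b, excludeId));
        Console.WriteLine(Canonicalize(a, excludeNothing) == Canonicalize(b, excludeNothing));
    }
}
```

Because the sketch only reads values, getter-only properties pose no problem for it; handling nested objects would need recursion and some guard against cycles, which is left out here.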

MD5 hashing requires the object to be serialized to a byte[].

The serialization requires the class to be marked as [Serializable], which I cannot add to the legacy code, and naturally it cannot be added at runtime either.

So I went for protobuf-net.

protobuf-net rightly fails when it encounters types that implement an interface with getter-only auto-properties:

public interface ISomeInterface
{
    double Vpy { get; }
    double Vy { get; }
    double Vpz { get; }
    ...
}

Since this interface is implemented by many types, using surrogates also seems a no-go (impractical and unmaintainable).

I only need to serialize, not deserialize, so I don't see why protobuf-net has this limitation in my case. I understand that protobuf-net would not be able to round-trip such types, but I don't need to round-trip!

Am I really cornered? Is there any alternative?

As I said, the code below works perfectly, but only if the objects do not have any property (or nested property) whose type has a getter-only auto-property.

public static byte[] ToByteArray(this object obj, List<PropertyInfo> exclusionsProps = null)
{
    if (exclusionsProps == null)
        exclusionsProps = new List<PropertyInfo>();

    // Protobuf-net implementation
    ProtoBuf.Meta.RuntimeTypeModel model = ProtoBuf.Meta.TypeModel.Create();

    AddPropsToModel(model, obj.GetType(), exclusionsProps);

    byte[] bytes;
    using (var memoryStream = new MemoryStream())
    {
        model.Serialize(memoryStream, obj);
        // ToArray() copies only the bytes written; GetBuffer() would also return unused capacity.
        bytes = memoryStream.ToArray();
    }

    return bytes;
}

public static void AddPropsToModel(ProtoBuf.Meta.RuntimeTypeModel model, Type objType, List<PropertyInfo> exclusionsProps = null)
{
    // Start from the type's public properties; an empty list would register no members.
    List<PropertyInfo> props = objType.GetProperties().ToList();

    if (exclusionsProps != null)
        props.RemoveAll(pr => exclusionsProps.Exists(t => t.DeclaringType == pr.DeclaringType && t.Name == pr.Name));

    props
        .Where(prop => prop.PropertyType.IsClass || prop.PropertyType.IsInterface).ToList()
        .ForEach(prop =>
        {
            AddPropsToModel(model, prop.PropertyType, exclusionsProps); //recursive call
        }
        );

    var propsNames = props.Select(p => p.Name).OrderBy(name => name).ToList();

    model.Add(objType, true).Add(propsNames.ToArray());
}

I would then use it like this:

foreach (var obj in objs)
{
    byte[] objByte = obj.ToByteArray(exclusionTypes);

    using (MD5 md5Hash = MD5.Create())
    {
        string hash = GetMd5Hash(md5Hash, objByte);
        Console.WriteLine(obj.GetType().Name + ": " + hash);
    }
}
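
The `GetMd5Hash` helper called above is not shown in the question. A minimal sketch of what such a helper might look like, assuming it takes the `MD5` instance and the serialized bytes and returns the digest as a lowercase hexadecimal string (the class name `Md5Sketch` is hypothetical):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Sketch
{
    // Hashes the given bytes and formats the 16-byte digest as lowercase hex.
    public static string GetMd5Hash(MD5 md5Hash, byte[] input)
    {
        byte[] digest = md5Hash.ComputeHash(input);
        var sb = new StringBuilder(digest.Length * 2);
        foreach (byte b in digest)
            sb.Append(b.ToString("x2"));
        return sb.ToString();
    }

    public static void Main()
    {
        using (MD5 md5 = MD5.Create())
        {
            // prints 5d41402abc4b2a76b9719d911017c592
            Console.WriteLine(GetMd5Hash(md5, Encoding.UTF8.GetBytes("hello")));
        }
    }
}
```

Since the digest depends only on the input bytes, the platform independence of the fingerprint comes down to whether the serialization step produces the same bytes everywhere.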

Recommended answer

The simple solution here is to completely sidestep the root cause of your issue.

When you can't modify the existing classes but need some modifications to them, the easiest way is to create a new and improved subclass in which the modifications you require are available.

Considering that the legacy code base will apparently change outside of your control, the only way to deal with these changes is to generate these types at runtime. Luckily, .NET lets you emit intermediate language (IL) from C# at runtime, which can solve exactly this problem.

You'd start with the DefineType method available from the ModuleBuilder class. Specifically, you want the overload that takes a String, a TypeAttributes value, and a Type (representing the class you extend).
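
As a rough sketch of that first step only: the code below defines an empty runtime subclass through `ModuleBuilder.DefineType(String, TypeAttributes, Type)`. `LegacyPoint`, `RuntimeSubclassSketch`, `CreateSubclass` and the assembly and module names are hypothetical placeholders, and the sketch assumes the legacy type is public, not sealed, and has an accessible parameterless constructor. Which members the generated subclass would need is not covered here.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical stand-in for a legacy type that cannot be modified.
public class LegacyPoint
{
    public double X { get; } = 1.5;
}

public static class RuntimeSubclassSketch
{
    // Defines an empty subclass of baseType in a dynamic, in-memory assembly.
    public static Type CreateSubclass(Type baseType)
    {
        var assemblyName = new AssemblyName("FingerprintProxies");
        AssemblyBuilder assembly =
            AssemblyBuilder.DefineDynamicAssembly(assemblyName, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("MainModule");

        // The overload taking a name, type attributes and the parent type.
        TypeBuilder typeBuilder = module.DefineType(
            baseType.Name + "Proxy",
            TypeAttributes.Public | TypeAttributes.Class,
            baseType);

        // With no constructor defined, a default one is generated that
        // calls the base class's parameterless constructor.
        return typeBuilder.CreateType();
    }

    public static void Main()
    {
        Type proxyType = CreateSubclass(typeof(LegacyPoint));
        object proxy = Activator.CreateInstance(proxyType);

        Console.WriteLine(proxyType.BaseType == typeof(LegacyPoint)); // prints True
        Console.WriteLine(proxy is LegacyPoint);                      // prints True
    }
}
```

Because the subclass is produced from the `Type` passed in, it is regenerated from whatever shape the legacy type has at runtime, which fits the constraint that the code base may change.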
