Struct type aliasing / tagged-union without union

Question

For two (or more) structs: Base and Sub with a common first (unnamed) struct, is it safe to convert/cast from Base to Sub and vice versa?

struct Base{
    struct{
        int id;
        // ...
    };
    char data[]; // necessary?
};

struct Sub{
    struct{
        int id;
        // same '...'
    };
    // actual data
};

Are these functions guaranteed to be safe and technically correct? (Also: is the zero-length char data[] member necessary and useful?)

struct Base * subToBase(struct Sub * s){
    return (struct Base*)s;
}

struct Sub * baseToSub(struct Base * b){
    if(b->id != SUB_ID){
        return NULL;
    }

    return (struct Sub*)b;
}

Edit

I have no plans to nest any further than Base within Sub, but rather leave the possibility to add other sub-types (directly under Base) later without needing to change Base. My main concern is whether pointers to the structs can be safely converted back and forth between Base and any sub. References to the (C11) standard would be most appreciated.

Edit v2

Changed the wording slightly to discourage OOP/inheritance discussions. What I want is a tagged-union, without the union so it can be extended later. I have no plans for doing additional nesting. Sub-types that need other sub-types' functionality can do so explicitly, without doing any further nesting.

For a script interpreter I have made a pseudo object-oriented tagged-union type system, without the union. It has an (abstract) generic base type Object with several (specific) sub-types, such as String, Number, List etc. Every type-struct has the following unnamed struct as the first member:

#define OBJHEAD struct{    \
    int id;                \
    int line;              \
    int column;            \
}

The id identifies the type of object, line and column should (also) be self-explanatory. A simplified implementation of various objects:

typedef struct Object{
    OBJHEAD;

    char data[]; // necessary?
} Object;

typedef struct Number{
    OBJHEAD;

    int value; // only int for simplicity
} Number;

typedef struct String{
    OBJHEAD;

    size_t length;
    char * string;
} String;

typedef struct List{
    OBJHEAD;

    size_t size;
    Object ** elements; // array of pointers, so it may hold any kind and mix of objects
} List;

Object * Number_toObject(Number * num){
    return (Object*)num;
}

Number * Number_fromObject(Object * obj){
    if(obj->id != TYPE_NUMBER){
        return NULL;
    }

    return (Number*)obj;
}

I know that the most elegant and technically correct way to do this would be to use an enum for the id and a union for the various sub-types. But I want the type-system to be extensible (through some form of type-registry) so that types can be added later without changing all the Object-related code.

A later/external addition could be:

typedef struct File{
    OBJHEAD;

    FILE * fp;
} File;

without needing to change Object.

Are these conversions guaranteed to be safe?

(As for the small macro-abuse: the OBJHEAD will of course be extensively documented so additional implementers will know what member-names not to use. The idea is not to hide the header, but to save pasting it every time.)

Recommended answer

Converting a pointer to one object type to a pointer to a different object type (via a cast, for instance) is permitted, but if the resulting pointer is not correctly aligned then the behavior is undefined (C11 6.3.2.3/7). Depending on the members of Base and Sub and on implementation-dependent behavior, it is not necessarily the case that a Base * converted to a Sub * is correctly aligned. For example, given ...

struct Base{
    struct{
        int id;
    };
    char data[]; // necessary?
};

struct Sub{
    struct{
        int id;
    };
    long long int value;
};

... it may be that the implementation permits Base objects to be aligned on 32-bit boundaries but requires Sub objects to be aligned on 64-bit boundaries, or even on stricter ones.
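The alignment concern can be inspected on a given implementation. The following is only a sketch: the two helper functions are hypothetical names introduced here, and their results describe the compiler and target at hand, not anything the standard guarantees.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct Base {
    struct {
        int id;
    };
    char data[];
};

struct Sub {
    struct {
        int id;
    };
    long long int value;
};

/* Hypothetical helper: nonzero when every address that is suitably aligned
   for struct Base is also suitably aligned for struct Sub here. */
int base_ptr_always_aligned_for_sub(void)
{
    return alignof(struct Base) % alignof(struct Sub) == 0;
}

/* Hypothetical helper: the same check for the Sub * to Base * direction. */
int sub_ptr_always_aligned_for_base(void)
{
    return alignof(struct Sub) % alignof(struct Base) == 0;
}
```

Where base_ptr_always_aligned_for_sub() returns 0, a Base * that does not really point into a Sub object may be misaligned for Sub, which is the case 6.3.2.3/7 leaves undefined.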

None of this is affected by whether Base has a flexible array member.

It is a different question whether it is safe to dereference a pointer value of one type that was obtained by casting a pointer value of a different type. For one thing, C places rather few restrictions on how implementations choose to lay out structures: members must be laid out in the order they are declared, and there must not be any padding before the first one, but otherwise, implementations have free rein. To the best of my knowledge, in your case there is no requirement that the anonymous struct members of your two structures must be laid out the same way as each other if they have more than one member. (And if they have only one member then why use an anonymous struct?) It is also not safe to assume that Base.data starts at the same offset as the first element following the anonymous struct in Sub.
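One way to test the layout assumption is to let the compiler check it. This sketch reuses the OBJHEAD, Object and Number declarations from the question; header_offsets_match() is a hypothetical helper added for illustration. Passing these checks shows only that the offsets agree on the implementation at hand. It does not by itself make access through the converted pointer defined behavior.

```c
#include <assert.h>
#include <stddef.h>

#define OBJHEAD struct{    \
    int id;                \
    int line;              \
    int column;            \
}

typedef struct Object {
    OBJHEAD;
    char data[];
} Object;

typedef struct Number {
    OBJHEAD;
    int value;
} Number;

/* Compile-time checks: the build fails if the shared header members
   do not sit at the same offsets in both structures. */
static_assert(offsetof(Object, id) == 0 && offsetof(Number, id) == 0,
              "id must be the first member of both structs");
static_assert(offsetof(Object, line) == offsetof(Number, line),
              "line offsets differ");
static_assert(offsetof(Object, column) == offsetof(Number, column),
              "column offsets differ");

/* Hypothetical run-time counterpart of the checks above. */
int header_offsets_match(void)
{
    return offsetof(Object, id) == offsetof(Number, id)
        && offsetof(Object, line) == offsetof(Number, line)
        && offsetof(Object, column) == offsetof(Number, column);
}
```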

In practice, dereferencing the result of your subToBase() is probably ok, and you can certainly implement tests to verify that. Also, if you have a Base * that was obtained by conversion from a Sub *, then the result of converting it back, for instance via baseToSub(), is guaranteed to be the same as the original Sub * (C11 6.3.2.3/7 again). In that case, the conversion to Base * and back has no effect on the safety of dereferencing the pointer as a Sub *.

On the other hand, though I'm having trouble finding a reference for it in the standard, I have to say that baseToSub() is very dangerous in the general context. If a Base * that does not actually point to a Sub is converted to Sub * (which in itself is permitted), then it is not safe to dereference that pointer to access members not shared with Base. In particular, given my declarations above, if the referenced object is in fact a Base, then Base.data being declared in no way prevents ((Sub *)really_a_Base_ptr)->value from producing undefined behavior.

To avoid all undefined and implementation-defined behavior, you want an approach that avoids casting and ensures consistent layout. @LoPiTaL's suggestion to embed a typed Base structure inside your Sub structures is a good approach in that regard.
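A minimal sketch of that embedding approach follows. It assumes Base drops the flexible array member (a struct with one cannot be a member of another struct), and the BASE_ID and SUB_ID tag values are hypothetical. Going from Sub to Base needs no cast at all. Going back still uses a cast, but one that C11 6.7.2.1/15 covers: a pointer to a structure, suitably converted, points to its initial member, and vice versa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag values, for illustration only. */
enum { BASE_ID = 0, SUB_ID = 1 };

struct Base {
    int id;
};

struct Sub {
    struct Base base;     /* must stay the first member */
    long long int value;
};

/* Sub -> Base: take the address of the embedded member, no cast. */
struct Base *subToBase(struct Sub *s)
{
    return s ? &s->base : NULL;
}

/* Base -> Sub: valid only when b really is the first member of a Sub,
   which the tag check is meant to establish. */
struct Sub *baseToSub(struct Base *b)
{
    if (b == NULL || b->id != SUB_ID) {
        return NULL;
    }
    return (struct Sub *)b;
}
```

The tag check is only as reliable as the code that sets id: a plain Base whose id was mistakenly set to SUB_ID would still be converted, and dereferencing the result would be undefined.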
