Error when attempting to use a type from GNU's GMP library as the yylval type for Bison

Problem description

I'm attempting to use the type mpz_t from the GMP library as the type for yylval by including the following in the Bison file:

%define api.value.type {mpz_t}

I checked the generated parser and it correctly generates the line typedef mpz_t YYSTYPE, with YYSTYPE later being used to create yylval.

mpz_t is typedef'ed as typedef __mpz_struct mpz_t[1]; in the GMP header file gmp.h. In turn, __mpz_struct is typedef'ed as

typedef struct
{
    // struct members here - don't believe they're important
} __mpz_struct;

Bison runs without error, but whenever I try to create an executable I get the following error:

calc.tab.c: In function ‘yyparse’:
calc.tab.c:1148:12: error: incompatible types when assigning to type ‘YYSTYPE’ from type ‘struct __mpz_struct *’
*++yyvsp = yylval;

yyvsp is defined as a pointer to YYSTYPE.

Any idea of how to fix this?

Recommended answer

As you say, mpz_t is typedef'ed as an alias for an array type:

typedef __mpz_struct mpz_t[1];

As a result, assignment to a variable of type mpz_t is illegal:

mpz_t a, b;
mpz_init(b);
a = b;  /* Error: incompatible types when assigning to type ‘mpz_t’ */
        /* from type ‘struct __mpz_struct *’                        */

Instead, it is necessary to use one of the built-in assignment functions:

mpz_t a, b;
mpz_inits(a, b, NULL);   /* the argument list must be NULL-terminated */
mpz_set(a, b);   /* a is now a copy of b */

The ban on direct assignment to an mpz_t is necessary because of the way gmp manages memory. See Note 1 below.

Bison assumes that the semantic type YYSTYPE can be assigned to (see note 2), which means that it cannot be an array type. That's normally not a problem, because normally YYSTYPE is a union type, and a union with an array member can be assigned to. So there is no problem using an array type with bison provided that you include it in a %union declaration.
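To see why the %union route compiles, here is a minimal sketch in plain C. The names cell_t, value_t and copy_value are hypothetical stand-ins (cell_t merely has the same one-element-array shape as mpz_t); nothing here is taken from GMP or from Bison's generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the same shape as mpz_t: a one-element array. */
typedef struct { long payload; } cell_struct;
typedef cell_struct cell_t[1];

/* Roughly what a %union with an array member turns into. */
typedef union {
    cell_t num;
    int    tok;
} value_t;

/* Copy a whole semantic value with plain assignment. */
static void copy_value(value_t *dst, const value_t *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;   /* fine: a union is assignable even with an array member */
    /* dst->num = src->num;   <- rejected by the compiler: arrays are not assignable */
}
```

The assignment copies the bytes of the array member and nothing more. For a type that owns heap storage, that is a shallow copy.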

But you MUST NOT do that with gmp, because although it will compile, it won't work. You'll end up with lots of leaked memory and you're likely to end up with obscure bugs, where gmp computes the wrong values (or fails in more obvious ways, like freeing memory out from under an mpz_t).

Using mpz_t objects directly as semantic values is possible, but it will not be easy. You will end up spending a lot of time thinking about which stack slots hold semantic values that have been initialized, which ones hold values that need to be mpz_clear'ed, and a lot of other troublesome details.

A simpler (but not simple) solution is to make the semantic value a pointer to an mpz_t. If you are just making a bignum calculator, you could bypass the semantic value altogether and maintain your own value stack. That will work out as long as every reduction action pops all its arguments from the value stack and pushes its result.

This value stack would also be a vector of mpz_t values, but it differs from the parser stack in several important ways, because it is entirely under your control:

  1. You are under no obligation to create the temporary value which bison needs to create (see note 2). If you want to do an addition, for example, which would pop two operands off the stack and push the result back on, you could just do that:

mpz_add(val_stack[top - 2], val_stack[top - 2], val_stack[top - 1]);
--top;

  2. You can initialize the value stack before parsing, and clear all elements when the parse is done. That makes memory management much simpler, and lets you reuse allocated limb vectors.

  3. Tokens like operators and parentheses, which have no associated semantic value, do not occupy space on the value stack. That doesn't save much space, but it avoids the need to initialize and clear stack slots which never have useful data in them.
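The separate value stack described in the list above can be sketched as follows. To keep the example free of external dependencies, num_t, num_set_si and num_add are hypothetical stand-ins that only mimic the calling convention of mpz_t, mpz_set_si and mpz_add; push_literal, reduce_add and STACK_MAX are likewise invented for illustration. With real GMP you would also mpz_init every slot before parsing and mpz_clear every slot afterwards.

```c
#include <assert.h>

enum { STACK_MAX = 64 };

typedef struct { long v; } num_struct;   /* stand-in for __mpz_struct */
typedef num_struct num_t[1];             /* same one-element-array shape as mpz_t */

static num_t val_stack[STACK_MAX];       /* the value stack, owned by us */
static int top = 0;                      /* number of values currently stacked */

static void num_set_si(num_t rop, long v)        { rop[0].v = v; }
static void num_add(num_t rop, num_t a, num_t b) { rop[0].v = a[0].v + b[0].v; }

/* Action for a number token: push its value. */
static void push_literal(long v)
{
    assert(top < STACK_MAX);
    num_set_si(val_stack[top++], v);
}

/* Action for  expr: expr '+' expr  -- add in place, then drop one slot. */
static void reduce_add(void)
{
    assert(top >= 2);
    num_add(val_stack[top - 2], val_stack[top - 2], val_stack[top - 1]);
    --top;
}
```

The '+' token never touches the stack, and no temporary value is created: the result is written straight into the slot of the left operand.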

Notes

1. Why GMP discourages direct assignment

According to the gmp manual, making mpz_t (and other similar types) arrays of size 1 was simply to compensate for C's lack of pass-by-reference. Since the array decays to a pointer when used as a function argument, you get pass-by-reference without having to explicitly mark the argument. But it must have crossed someone's mind that the use of an array type also prevents direct assignment to an mpz_t. Direct assignment cannot work because of the way gmp manages memory.

Gmp values necessarily include a reference to allocated storage. (Necessarily, because there is no limit to the size of a bignum, so different bignums are different sizes.) Generally speaking, there are two ways of managing objects like that:

  1. Make the object immutable. Then it can be shared arbitrarily, because no modification is possible.

  2. Always copy the object on assignment, making sharing impossible. Then objects can be modified without affecting any other object.

These two strategies are exemplified by the Java and C++ approaches to strings, respectively. Unfortunately, both strategies depend on some infrastructure in the language:

  • Immutable strings require garbage collection. Without a garbage collector, there is no way to tell when the storage for a string can be released. It would be possible to reference count the memory allocations, but the reference counts need to be incremented and decremented, and unless you're prepared to make your code a swamp of reference count maintenance, you need some language support.

  • Copying strings requires overriding the assignment operator. That's possible in C++, but few other languages are so flexible.

There are also performance problems with both of the above strategies.

  • Immutable objects need to be copied when modified, which can turn simple linear complexity into quadratic complexity. This is a well-known problem with repetitive appends to Java or Python strings; Java's StringBuilder is intended to compensate for this issue. Immutable integers would be annoying; it is very common to accumulate sums, for example (sum += value;), and having to copy sum each time through such a loop could drastically slow down the loop.

  • On the other hand, forcing copy on assignment makes it impossible to share constants, or even to rearrange vectors. That can cause a lot of extra copying, again leading to linear algorithms turning into quadratic algorithms.

Gmp chose the mutable object strategy. Bignums must be copied on assignment, and since C does not allow overriding the assignment operator, the easiest solution was to ban use of the assignment operator, forcing the use of library functions.

Since there are occasions when it is useful to move bignums around without copying (shuffling an array of bignums, for example), gmp also provides a swap function. And, if you are very, very careful and know more about gmp's internals than I do, it is probably possible to just use the union hack mentioned above, or to use memcpy(), in order to do a more complicated rearrangement of gmp objects, provided you maintain the important invariant:

Every vector of limbs must be referenced by exactly one mpz_t object.
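A swap preserves that invariant because it exchanges whole headers, so each buffer still has a single owner afterwards. The following is a minimal sketch with a hypothetical big_t type (a header that owns a heap buffer); it is not GMP's implementation of mpz_swap, and the names big_init, big_clear and big_swap are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bignum: a header that owns a heap buffer. */
typedef struct {
    size_t         alloc;   /* number of limbs allocated */
    unsigned long *limbs;   /* owned storage */
} big_struct;
typedef big_struct big_t[1];

static void big_init(big_t x, size_t n)
{
    x->limbs = calloc(n, sizeof *x->limbs);
    assert(x->limbs != NULL);
    x->alloc = n;
}

static void big_clear(big_t x)
{
    free(x->limbs);
    x->limbs = NULL;
    x->alloc = 0;
}

/* Exchange the headers; every buffer still has exactly one owner. */
static void big_swap(big_t a, big_t b)
{
    big_struct tmp = *a;
    *a = *b;
    *b = tmp;
}
```

No limb is copied and no buffer is shared, so a later resize of either object cannot invalidate the other.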

The reason that is important is that gmp will resize a bignum if necessary, using realloc. Suppose that a and b are mpz_t, and we use some hack to make them both the same bignum, sharing memory:

    memcpy(a, b, sizeof(a));
    

Now, we make b much bigger:

    mpz_mul(b, b, b);  /* Set b to b squared */
    

This will work fine, but internally it will do something like

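    /* Illustrative only: sizes are counted in limbs, so scale by the limb size. */
    tmp = realloc(b->_mp_d, 2 * b->_mp_size * sizeof(mp_limb_t));
    if (tmp) b->_mp_d = tmp;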
    

in order to make b large enough to hold the result. That will work fine for b, but it could result in the limbs pointed to by a going into limbo, since a successful realloc which allocates new storage will automatically free the old storage.

The same thing would happen with any operation which increased the size of b; squaring it in place was just an example. a could end up with a dangling pointer after almost any modification which increases the size of b: mpz_add(b, tmp1, tmp2); (assuming tmp1 and/or tmp2 are larger than b.)
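By contrast, a deep copy gives the destination its own buffer, so growing the source afterwards is harmless. The sketch below uses a hypothetical big_t type; big_grow and big_set are invented names that only follow the spirit of a realloc-based resize and of mpz_set, and do not reproduce GMP's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t         alloc;   /* number of limbs allocated */
    unsigned long *limbs;   /* owned storage */
} big_struct;
typedef big_struct big_t[1];

static void big_init(big_t x, size_t n)
{
    x->limbs = calloc(n, sizeof *x->limbs);
    assert(x->limbs != NULL);
    x->alloc = n;
}

static void big_clear(big_t x)
{
    free(x->limbs);
    x->limbs = NULL;
    x->alloc = 0;
}

/* Grow x to hold at least n limbs; realloc may move the buffer. */
static void big_grow(big_t x, size_t n)
{
    if (n <= x->alloc) return;
    unsigned long *tmp = realloc(x->limbs, n * sizeof *tmp);
    assert(tmp != NULL);
    memset(tmp + x->alloc, 0, (n - x->alloc) * sizeof *tmp);
    x->limbs = tmp;
    x->alloc = n;
}

/* Deep copy: dst keeps its own buffer and receives a copy of src's limbs. */
static void big_set(big_t dst, const big_struct *src)
{
    big_grow(dst, src->alloc);
    memcpy(dst->limbs, src->limbs, src->alloc * sizeof *src->limbs);
    memset(dst->limbs + src->alloc, 0,
           (dst->alloc - src->alloc) * sizeof *dst->limbs);
}
```

After big_set(a, b), any later big_grow(b, ...) touches only b's buffer; a never holds a pointer that realloc could free.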

2. Why bison assigns semantic values

Bison creates a temporary YYSTYPE object for every reduction; this temporary is the actual variable represented as $$ in the bison action. Before the reduction action is executed, the parser executes the equivalent of $$ = $1;. Once the action has been completed, $1 through $n are popped off the stack, and $$ is pushed onto it. In effect, this overwrites the old $1 with $$, which is why a temporary must be used. (Otherwise, setting $$ in an action would surprisingly invalidate $1.)
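The reduction sequence just described can be modelled in a few lines of C. This is a simplified sketch, not Bison's generated code: value_t stands in for YYSTYPE, and vstack, vtop and reduce_add are hypothetical names. It shows the two places where the semantic type gets assigned, which is exactly what an array type such as mpz_t cannot support.

```c
#include <assert.h>

typedef int value_t;               /* stand-in for YYSTYPE */

enum { STACK_MAX = 32 };
static value_t vstack[STACK_MAX];
static int vtop = 0;               /* number of values on the stack */

/* Model of reducing  expr: expr '+' expr  (three right-hand-side symbols). */
static void reduce_add(void)
{
    const int n = 3;
    assert(vtop >= n);
    value_t *rhs = &vstack[vtop - n];   /* rhs[0] is $1, rhs[2] is $3 */

    value_t result = rhs[0];            /* default: $$ = $1 (an assignment) */
    result = rhs[0] + rhs[2];           /* the action: $$ = $1 + $3 */

    vtop -= n;                          /* pop $1 .. $n */
    vstack[vtop++] = result;            /* push $$ (another assignment) */
}
```

Because the result lands in the slot that used to hold $1, writing to that slot directly during the action would clobber $1 before the action had finished reading it, hence the temporary.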
