Aliases in C

Question

Recently, we had a very heated thread about GC with the usual
arguments (pros, cons, etc.) being exchanged.

In one of those threads, we ran into the realloc problem.

What is the realloc problem?

Well, it begins with a successful realloc:

char *q = realloc(p,2*n); // n is a size_t; a simple
                          // exponential strategy.

At this point p is *invalid*, and so are ALL its aliases.

This means that all other pointers to that object have become
invalid too: those stored in some structure that caches them
(for instance, to avoid calling a cover function, for speed),
and all the aliases for the object that were created when a
pointer to it was passed in the stack/activation frame of a
procedure:

void someFunction(buffer *p, size_t siz)
{
DoSomeWork1(p,siz);
DoSomeWork2(p,siz);
}

where

void DoSomeWork1(buffer *p,size_t siz)
{
    if (siz < SpaceNeeded) {
        buffer *q = realloc(p,SpaceNeeded);
        if (q) {
            p = q;   // updates only this function's copy of p
        }
        else
            NoMemoryException();
    }
    // Do some work, first part
}

The problem (obvious in this simple setup) is that
the call to DoSomeWork2(p,siz) uses an invalid pointer.

This is an example of indiscipline in alias creation.
A sensible design MUST account for any reallocation of the buffer,
for instance by modifying the interface and returning the
reallocated pointer...

I.e. a DISCIPLINE in creating aliases and in keeping the values
of pointers.
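
A minimal sketch of that interface change, under stated assumptions: the buffer is a plain char array, and the names do_some_work1, some_function and space_needed are hypothetical stand-ins for the functions above, not code from the original program. The worker returns the (possibly moved) pointer, and the caller replaces its own copy before doing anything else with the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of DoSomeWork1: it returns the (possibly moved)
   buffer so the caller can refresh its own copy of the pointer.
   On allocation failure it returns NULL and leaves the old block intact. */
char *do_some_work1(char *p, size_t *siz, size_t space_needed)
{
    assert(siz != NULL);
    if (*siz < space_needed) {
        char *q = realloc(p, space_needed);
        if (q == NULL)
            return NULL;          /* p is still valid; the caller decides */
        p = q;
        *siz = space_needed;
    }
    /* ... first part of the work, using p ... */
    return p;
}

/* Caller: never uses the old pointer value after a successful call. */
char *some_function(char *p, size_t *siz, size_t space_needed)
{
    char *q = do_some_work1(p, siz, space_needed);
    if (q == NULL)
        return NULL;              /* old block p is untouched, still owned by the caller */
    p = q;
    /* a second worker called here would see the current block */
    return p;
}
```

The same rule then applies one level up: whoever calls some_function must also replace its pointer with the returned one, so the discipline has to hold along the whole call chain.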

As we all know, C programs can be hugely more complex than
what this simple example shows. Here all is clear and easy to see,
but not so in the real world, where someFunction is a huge monster
written ages ago, and you forget that *aliases* could exist
for that object.
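
One way to keep the number of aliases small in larger programs is the "cover function" mentioned earlier: the raw pointer lives in exactly one place and everybody else asks for it when needed. The sketch below is only an illustration; the buffer layout and the names buffer_reserve and buffer_data are assumptions, not part of the code discussed in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable buffer: the struct is the only place that
   holds the raw pointer, so there is exactly one value to update. */
typedef struct {
    char  *data;
    size_t size;
} buffer;

/* Grow the buffer to at least `needed` bytes. Returns 0 on success,
   -1 on failure (in which case the buffer is unchanged). */
int buffer_reserve(buffer *b, size_t needed)
{
    assert(b != NULL);
    if (b->size >= needed)
        return 0;
    char *q = realloc(b->data, needed);
    if (q == NULL)
        return -1;
    b->data = q;
    b->size = needed;
    return 0;
}

/* Cover function: callers fetch the pointer when they need it and
   do not keep it across calls that may reallocate. */
char *buffer_data(buffer *b)
{
    assert(b != NULL);
    return b->data;
}
```

This costs an extra indirection on each access, which is exactly the speed trade-off that tempts people into caching the raw pointer.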

Is the GC (Garbage Collector) of any help here?
----------------------------------------------

In the heat of the discussion, I thought at first that the GC
could be a valuable help here if used like this:

if (siz < SpaceNeeded) {
    q = GC_malloc(SpaceUsed+SpaceNeeded);
    memcpy(q,p,SpaceUsed);
    p = NULL;
}

Since the GC will never free an object while it can find a pointer to it,
this looks at first sight like a better way to handle the problem.

But this is only at first sight. Actually, if there is an alias for our
object (like the function parameter of someFunction above),
that alias means that the GC will keep the object around, BUT

we have now actually TWO objects around:
1) The old object that is pointed to by
the pointer in SomeFunction()
2) The new reallocated object within DoSomeWork1

And obviously hell will break loose quickly when some function
works on the first copy and another works on the second one!

So, in this case the GC is no better than realloc, and can produce
even worse bugs, since they are MUCH harder to find.
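
The two-copies situation can be reproduced without a collector. The sketch below simulates the GC-style growth with plain malloc and simply leaves the old block alive, as a collector would while an alias still refers to it. The name grow_by_copy is hypothetical and no real GC library is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simulates the GC-style "reallocation" with plain malloc: the old
   block is deliberately left alive, the way a collector would leave
   it while an alias still points to it.
   Returns the new copy, or NULL if the allocation fails. */
char *grow_by_copy(const char *old, size_t used, size_t needed)
{
    assert(old != NULL);
    char *q = malloc(used + needed);
    if (q != NULL)
        memcpy(q, old, used);     /* the new copy starts out identical */
    return q;
}
```

If a caller keeps its old pointer and some other code writes through the pointer returned by grow_by_copy, the old block never sees that write. Nothing crashes and no memory checker complains; the two parts of the program just silently disagree about the data.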

What does an alias discipline mean?
-----------------------------------

In C you can create an alias for an object with incredible ease:

char *a = malloc(1024);
char *b = a;

You can even create ANONYMOUS aliases, for instance when you do:

extern T * externalFunction(T *input);

void someFunction(void)
{
T input_data;

// Fill input_data with values
externalFunction(&input_data);
// Now we have created an anonymous alias for input_data
}

An alias discipline means that externalFunction must NEVER store
the pointer it receives, under any circumstances.

And that can be extremely difficult to do, but it *must* be done.
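
A small sketch of what "never store the pointer" can look like in practice: if the callee needs the data later, it keeps a copy of the object, not the address. The type T, its fields, and the names external_function and caller are hypothetical; the signature also differs from the declaration above in that it takes and returns const pointers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record type standing in for T. */
typedef struct {
    int  id;
    char name[16];
} T;

/* Private storage owned by this module. */
static T saved;

/* Keeps a copy of *input, not the pointer, so nothing in this module
   refers to the caller's object after the call returns. */
const T *external_function(const T *input)
{
    assert(input != NULL);
    saved = *input;
    return &saved;
}

const T *caller(void)
{
    T input_data;
    memset(&input_data, 0, sizeof input_data);   /* define every byte */
    input_data.id = 7;
    strcpy(input_data.name, "example");
    /* input_data goes out of scope on return; the copy lives on */
    return external_function(&input_data);
}
```

Had external_function stored input itself, the stored pointer would refer to a dead stack object as soon as caller returned.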

Finding this kind of bug can be extremely hard because
such bugs tend to appear as "intermittent" bugs. Sometimes
they happen, sometimes they disappear. Obviously, it depends
on the whims of the malloc/realloc/free implementation and
on the concrete pattern of memory usage of the program.

It may be that realloc does NOT immediately reuse the memory block.
In that case this bug is invisible until the memory allocation system
reuses the block.

When the allocator returns a pointer to this block, it may be that
the part of the block that is overwritten is no longer used
by the program...

OR it may be that (after hours and hours of debugging)
you SUDDENLY see a variable mysteriously change its value
without any assignment to it!

I have had bugs like this.

I do not wish one of those on anyone here!

jacob

Recommended answer

jacob navia wrote:

Recently, we had a very heated thread about GC with the usual
arguments (for, cons, etc) being exchanged.




With GC, the GC dictates when free() happens, and this has semantic
implications. Usually much more so for true OO systems. In C it's
actually less of a big deal since memory is really the only thing that
you care about.


In one of those threads, we came into the realloc problem.

What is the realloc problem?

Well, it begins with a successful realloc:

char *q = realloc(p,2*n); // for n size_t, a simple
// exponential strategy.

At this point p is *invalid*, and also ALL its aliases.




C does not state *how* these functions are implemented. In fact the
reference count on the allocation for p need not be decreased at all.
If the mechanism decreases the size or requires a new allocation, then
a new allocation must be generated and stored in q, otherwise q is
simply added to the list of references for that memory location (after
possibly being expanded, but left in place). After the inevitable p =
q; operation, what p used to be pointing at loses one reference, and
whatever q is pointing at gains one. I don't see the problem.

I.e., you would implement it as functionally equivalent to q = malloc
(2*n); free (p); (in a GC system, I assume that free(p) is basically a
no-op) and just exploit the case where you can simplify that to
enlarge(p, 2*n); q = p;
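
For reference, the non-GC "always move" case described above can be sketched as follows. This is an illustration, not how any particular library implements realloc; move_realloc is a hypothetical name, and unlike the real realloc it has to be told the old size. The copy step is spelled out here because the new block must carry the old contents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the "always move" case of realloc.
   old_size must be the size of the block at p; the standard realloc
   knows this internally, this sketch has to be told. */
void *move_realloc(void *p, size_t old_size, size_t new_size)
{
    assert(new_size > 0);
    void *q = malloc(new_size);
    if (q == NULL)
        return NULL;              /* like realloc, p stays valid on failure */
    if (p != NULL) {
        /* copy the part of the old contents that fits in the new block */
        memcpy(q, p, old_size < new_size ? old_size : new_size);
        free(p);                  /* every alias of p is now dangling */
    }
    return q;
}
```

The free(p) line is where the aliasing question arises: in a non-GC system any pointer that still holds the old address becomes invalid at that point.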

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/


we******@gmail.com wrote:

jacob navia wrote:

>>Recently, we had a very heated thread about GC with the usual
arguments (for, cons, etc) being exchanged.







With GC, the GC dictates when free() happens, and this has semantic
implications. Usually much more so for true OO systems. In C it's
actually less of a big deal since memory is really the only thing that
you care about.

>>In one of those threads, we came into the realloc problem.

What is the realloc problem?

Well, it begins with a successful realloc:

char *q = realloc(p,2*n); // for n size_t, a simple
// exponential strategy.

At this point p is *invalid*, and also ALL its aliases.







C does not state *how* these functions are implemented. In fact the
reference count on the allocation for p need not be decreased at all.
If the mechanism decreases the size or requires a new allocation, then
a new allocation must be generated and stored in q, otherwise q is
simply added to the list of references for that memory location (after
possibly being expanded, but left in place). After the inevitable p =
q; operation, what p used to be pointing at loses one reference, and
whatever q is pointing at gains one. I don't see the problem.




Well, the problem is difficult to see.

The problem is not with the reallocation itself; the problem is
with the aliases stored elsewhere that still use the old value of
p.

Of course I speak of the case where realloc MOVES the object because
there is no free block of size new_size. In this case, the object
has been MOVED and all pointers that used the old reference
point to a FREE BLOCK!!

This block will eventually be reused by the malloc/realloc/free system,
and will be handed out to other variables.

The old pointers still point to that memory area, however, and then you have
the memory overwrite!

jacob


jacob navia wrote:
we******@gmail.com wrote:

jacob navia wrote:

>Recently, we had a very heated thread about GC with the usual
arguments (for, cons, etc) being exchanged.




With GC, the GC dictates when free() happens, and this has semantic
implications. Usually much more so for true OO systems. In C it's
actually less of a big deal since memory is really the only thing that
you care about.

>In one of those threads, we came into the realloc problem.

What is the realloc problem?

Well, it begins with a successful realloc:

char *q = realloc(p,2*n); // for n size_t, a simple
// exponential strategy.

At this point p is *invalid*, and also ALL its aliases.




C does not state *how* these functions are implemented. In fact the
reference count on the allocation for p need not be decreased at all.
If the mechanism decreases the size or requires a new allocation, then
a new allocation must be generated and stored in q, otherwise q is
simply added to the list of references for that memory location (after
possibly being expanded, but left in place). After the inevitable p =
q; operation, what p used to be pointing at loses one reference, and
whatever q is pointing at gains one. I don't see the problem.





Well the problem is difficult to see.

The problem is not with the reallocation itself, the problem is
with the aliases stored elsewhere that use the old value of
p

Of course I speak in the case that realloc MOVES the object because
there is no free block with size new_size. In this case, the object
has been MOVED and all pointers that used the old reference
point to a FREE BLOCK!!




Garbage collection can't work that way. The block is not MOVED. It is
COPIED. In the GC version of realloc, you would have to open the
possibility of a new allocation being made without the old one
disappearing. In GC the only way a block can be freed is when the
references to it are gone. This would introduce a new semantic versus
what we have in typical non-GC reallocs, but that's because it isn't
trying to solve that kind of problem. The problem, of course, is much
worse in the non-GC case, as there is no possibility of recovery for
aliased pointers that have been freed.

Have you looked at the Boehm GC to see how it solves this problem?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

