How to allocate aligned memory only using the standard library?


Question

I just finished a test as part of a job interview, and one question stumped me, even using Google for reference. I'd like to see what the Stack Overflow crew can do with it:

The "memset_16aligned" function requires a 16-byte aligned pointer passed to it, or it will crash.

a) How would you allocate 1024 bytes of memory, and align it to a 16 byte boundary?
b) Free the memory after the memset_16aligned has executed.

{
   void *mem;
   void *ptr;

   // answer a) here
   memset_16aligned(ptr, 0, 1024);
   // answer b) here
}

Solution

Original answer

{
    void *mem = malloc(1024+16);
    void *ptr = ((char *)mem+16) & ~ 0x0F;
    memset_16aligned(ptr, 0, 1024);
    free(mem);
}

Fixed answer

{
    void *mem = malloc(1024+15);
    void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
    memset_16aligned(ptr, 0, 1024);
    free(mem);
}

Explanation as requested

The first step is to allocate enough spare space, just in case. Since the memory must be 16-byte aligned (meaning that the leading byte address needs to be a multiple of 16), adding 16 extra bytes guarantees that we have enough space. Somewhere in the first 16 bytes, there is a 16-byte aligned pointer. (Note that malloc() is supposed to return a pointer that is sufficiently well aligned for any purpose. However, the meaning of 'any' is primarily for things like basic types — long, double, long double, long long, and pointers to objects and pointers to functions. When you are doing more specialized things, like playing with graphics systems, they can need more stringent alignment than the rest of the system — hence questions and answers like this.)

The next step is to convert the void pointer to a char pointer; GCC notwithstanding, you are not supposed to do pointer arithmetic on void pointers (and GCC has warning options to tell you when you abuse it). Then add 16 to the start pointer. Suppose malloc() returned you an impossibly badly aligned pointer: 0x800001. Adding the 16 gives 0x800011. Now I want to round down to the 16-byte boundary — so I want to reset the last 4 bits to 0. 0x0F has the last 4 bits set to one; therefore, ~0x0F has all bits set to one except the last four. Anding that with 0x800011 gives 0x800010. You can iterate over the other offsets and see that the same arithmetic works.
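The rounding arithmetic above can be checked mechanically. The following is a minimal sketch, not part of the original answer; the helper names `add16_round_down`, `round_up_16` and `check_all_offsets` are made up for illustration. It applies both the "add 16, then mask" form and the "add 15, then mask" form to every offset within a 16-byte window.

```c
#include <assert.h>
#include <stdint.h>

/* Original answer's arithmetic: add 16, then clear the low 4 bits. */
static uintptr_t add16_round_down(uintptr_t addr)
{
    return (addr + 16) & ~(uintptr_t)0x0F;
}

/* Fixed answer's arithmetic: add 15, then clear the low 4 bits. */
static uintptr_t round_up_16(uintptr_t addr)
{
    return (addr + 15) & ~(uintptr_t)0x0F;
}

/* Returns 1 if, for every offset 0..15 from base, both forms yield a
 * 16-byte aligned address that stays inside the extra padding. */
static int check_all_offsets(uintptr_t base)
{
    for (uintptr_t off = 0; off < 16; off++) {
        uintptr_t addr = base + off;
        uintptr_t a = round_up_16(addr);
        uintptr_t b = add16_round_down(addr);
        if ((a & 0x0F) != 0 || a < addr || a - addr > 15)
            return 0;
        if ((b & 0x0F) != 0 || b <= addr || b - addr > 16)
            return 0;
    }
    return 1;
}
```

For the example address in the text, both forms give 0x800010 from 0x800001. They differ only when the input is already aligned: adding 16 skips a whole block, while adding 15 leaves the address unchanged.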

The last step, free(), is easy: you always, and only, return to free() a value that one of malloc(), calloc() or realloc() returned to you — anything else is a disaster. You correctly provided mem to hold that value — thank you. The free releases it.

Finally, if you know about the internals of your system's malloc package, you could guess that it might well return 16-byte aligned data (or it might be 8-byte aligned). If it was 16-byte aligned, then you'd not need to dink with the values. However, this is dodgy and non-portable — other malloc packages have different minimum alignments, and therefore assuming one thing when it does something different would lead to core dumps. Within broad limits, this solution is portable.

Someone else mentioned posix_memalign() as another way to get the aligned memory; that isn't available everywhere, but could often be implemented using this as a basis. Note that it was convenient that the alignment was a power of 2; other alignments are messier.
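To illustrate why other alignments are messier: a bit mask only works for powers of 2, so a general alignment needs a remainder calculation instead. This is a rough sketch under that assumption; `round_up_any` is a hypothetical helper, not something from the original answer.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align, for any nonzero align.
 * Uses a remainder (a division) rather than a single AND with a mask. */
static uintptr_t round_up_any(uintptr_t addr, uintptr_t align)
{
    assert(align != 0);
    uintptr_t rem = addr % align;
    return rem == 0 ? addr : addr + (align - rem);
}
```

For a power of 2 such as 16, this gives the same result as the mask-based version; for something like 24, the mask trick has no equivalent.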

One more comment — this code does not check that the allocation succeeded.
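A version that does check the allocation might look like the sketch below. The wrapper name `clear_aligned_block` and its return convention (0 on success, -1 on failure) are assumptions for illustration, and `memset_16aligned` here is only a stand-in for the interviewer's function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the interview's memset_16aligned(). */
static void memset_16aligned(void *space, char byte, size_t nbytes)
{
    assert(((uintptr_t)space & 0x0F) == 0);
    memset(space, byte, nbytes);
}

/* Hypothetical wrapper: returns 0 on success, -1 if malloc() fails. */
static int clear_aligned_block(void)
{
    void *mem = malloc(1024 + 15);
    if (mem == NULL)
        return -1;      /* nothing was allocated, so nothing to free */

    void *ptr = (void *)(((uintptr_t)mem + 15) & ~(uintptr_t)0x0F);
    memset_16aligned(ptr, 0, 1024);
    free(mem);          /* always free the pointer malloc() returned */
    return 0;
}
```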

Amendment

Windows Programmer pointed out that you can't do bit mask operations on pointers, and, indeed, GCC (3.4.6 and 4.3.1 tested) does complain like that. So, an amended version of the basic code — converted into a main program, follows. I've also taken the liberty of adding just 15 instead of 16, as has been pointed out. I'm using uintptr_t since C99 has been around long enough to be accessible on most platforms. If it wasn't for the use of PRIXPTR in the printf() statements, it would be sufficient to #include <stdint.h> instead of using #include <inttypes.h>. [This code includes the fix pointed out by C.R., which was reiterating a point first made by Bill K a number of years ago, which I managed to overlook until now.]

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void memset_16aligned(void *space, char byte, size_t nbytes)
{
    assert((nbytes & 0x0F) == 0);
    assert(((uintptr_t)space & 0x0F) == 0);
    memset(space, byte, nbytes);  // Not a custom implementation of memset()
}

int main(void)
{
    void *mem = malloc(1024+15);
    void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
    printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
    memset_16aligned(ptr, 0, 1024);
    free(mem);
    return(0);
}

And here is a marginally more generalized version, which will work for sizes which are a power of 2:

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void memset_16aligned(void *space, char byte, size_t nbytes)
{
    assert((nbytes & 0x0F) == 0);
    assert(((uintptr_t)space & 0x0F) == 0);
    memset(space, byte, nbytes);  // Not a custom implementation of memset()
}

static void test_mask(size_t align)
{
    uintptr_t mask = ~(uintptr_t)(align - 1);
    void *mem = malloc(1024+align-1);
    void *ptr = (void *)(((uintptr_t)mem+align-1) & mask);
    assert((align & (align - 1)) == 0);
    printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
    memset_16aligned(ptr, 0, 1024);
    free(mem);
}

int main(void)
{
    test_mask(16);
    test_mask(32);
    test_mask(64);
    test_mask(128);
    return(0);
}

To convert test_mask() into a general purpose allocation function, the single return value from the allocator would have to encode the release address, as several people have indicated in their answers.
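One common way to do that encoding is to stash the pointer returned by malloc() in the bytes just below the aligned block. The sketch below assumes that approach; `aligned_malloc_sketch` and `aligned_free_sketch` are hypothetical names, not standard functions, and other answers may encode the release address differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes aligned to align (a nonzero power of 2).
 * Returns NULL on bad arguments or allocation failure. */
static void *aligned_malloc_sketch(size_t size, size_t align)
{
    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;

    /* Room for worst-case padding plus one saved pointer. */
    size_t extra = align - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Leave space for the saved pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    /* Record malloc()'s pointer in the slot just below the block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Release a block obtained from aligned_malloc_sketch(). */
static void aligned_free_sketch(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

The caller then holds a single pointer, and the matching free function recovers the release address from the hidden slot. Passing such a pointer to plain free() would be an error.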

Problems with interviewers

Uri commented: Maybe I am having [a] reading comprehension problem this morning, but if the interview question specifically says: "How would you allocate 1024 bytes of memory" and you clearly allocate more than that. Wouldn't that be an automatic failure from the interviewer?

My response won't fit into a 300-character comment...

It depends, I suppose. I think most people (including me) took the question to mean "How would you allocate a space in which 1024 bytes of data can be stored, and where the base address is a multiple of 16 bytes". If the interviewer really meant how can you allocate 1024 bytes (only) and have it 16-byte aligned, then the options are more limited.

  • Clearly, one possibility is to allocate 1024 bytes and then give that address the 'alignment treatment'; the problem with that approach is that the actual available space is not properly determinate (the usable space is between 1008 and 1024 bytes, but there wasn't a mechanism available to specify which size), which renders it less than useful.
  • Another possibility is that you are expected to write a full memory allocator and ensure that the 1024-byte block you return is appropriately aligned. If that is the case, you probably end up doing an operation fairly similar to what the proposed solution did, but you hide it inside the allocator.
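To put numbers on the first option, the sketch below computes how much of an exactly 1024-byte block remains after aligning its start. The helper `usable_after_align` is hypothetical. In the worst case the block starts one byte past a boundary, so 15 bytes are lost and 1009 remain (1008 if the usable size must also be a multiple of 16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes left in a block of `size` bytes starting at `base` once the
 * start is rounded up to `align` (a nonzero power of 2). */
static size_t usable_after_align(uintptr_t base, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t aligned = (base + align - 1) & ~(uintptr_t)(align - 1);
    return size - (size_t)(aligned - base);
}
```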

However, if the interviewer expected either of those responses, I'd expect them to recognize that this solution answers a closely related question, and then to reframe their question to point the conversation in the correct direction. (Further, if the interviewer got really stroppy, then I wouldn't want the job; if the answer to an insufficiently precise requirement is shot down in flames without correction, then the interviewer is not someone for whom it is safe to work.)

The world moves on

The title of the question has changed recently. It was "Solve the memory alignment in C interview question that stumped me". The revised title ("How to allocate aligned memory only using the standard library?") demands a slightly revised answer, which this addendum provides.

C11 (ISO/IEC 9899:2011) added function aligned_alloc():

7.22.3.1 The aligned_alloc function

Synopsis

#include <stdlib.h>
void *aligned_alloc(size_t alignment, size_t size);

Description
The aligned_alloc function allocates space for an object whose alignment is specified by alignment, whose size is specified by size, and whose value is indeterminate. The value of alignment shall be a valid alignment supported by the implementation and the value of size shall be an integral multiple of alignment.

Returns
The aligned_alloc function returns either a null pointer or a pointer to the allocated space.
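Applied to the interview question, a call might look like the following sketch. It assumes a C library that actually provides C11 aligned_alloc(); the wrapper name `clear_with_aligned_alloc` and its return convention are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success, -1 on allocation failure or misalignment. */
static int clear_with_aligned_alloc(void)
{
    /* 1024 is an integral multiple of 16, as the standard requires. */
    void *ptr = aligned_alloc(16, 1024);
    if (ptr == NULL)
        return -1;

    int aligned = ((uintptr_t)ptr & 0x0F) == 0;
    memset(ptr, 0, 1024);   /* stands in for memset_16aligned() */
    free(ptr);              /* the returned pointer goes straight to free() */
    return aligned ? 0 : -1;
}
```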

And POSIX defines posix_memalign():

#include <stdlib.h>

int posix_memalign(void **memptr, size_t alignment, size_t size);

DESCRIPTION

The posix_memalign() function shall allocate size bytes aligned on a boundary specified by alignment, and shall return a pointer to the allocated memory in memptr. The value of alignment shall be a power of two multiple of sizeof(void *).

Upon successful completion, the value pointed to by memptr shall be a multiple of alignment.

If the size of the space requested is 0, the behavior is implementation-defined; the value returned in memptr shall be either a null pointer or a unique pointer.

The free() function shall deallocate memory that has previously been allocated by posix_memalign().

RETURN VALUE

Upon successful completion, posix_memalign() shall return zero; otherwise, an error number shall be returned to indicate the error.
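A sketch of the same task with posix_memalign() follows. It assumes a POSIX system; the wrapper name `clear_with_posix_memalign` is hypothetical, and the feature-test macro is only there to request the declaration from strict-mode headers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success, or the error number from posix_memalign(). */
static int clear_with_posix_memalign(void)
{
    void *ptr = NULL;

    /* 16 is a power-of-two multiple of sizeof(void *) on typical
     * 32-bit and 64-bit systems. */
    int rc = posix_memalign(&ptr, 16, 1024);
    if (rc != 0)
        return rc;

    assert(((uintptr_t)ptr & 0x0F) == 0);
    memset(ptr, 0, 1024);   /* stands in for memset_16aligned() */
    free(ptr);
    return 0;
}
```

Note that the error is reported through the return value rather than through a null pointer, so the check differs from the malloc() pattern.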

Either or both of these could be used to answer the question now, but only the POSIX function was an option when the question was originally answered.

Behind the scenes, the new aligned memory functions do much the same job as outlined in the question, except that they can force the alignment more easily and keep track of the start of the aligned memory internally, so the calling code doesn't have to deal with it specially: it just frees the memory returned by the allocation function that was used.
