Allocating a thread's stack on a specific NUMA node


Question

I would like to know if there is a way to create the stack of a thread on a specific NUMA node. I have written this code, but I'm not sure whether it does the trick.

#include <numa.h>
#include <pthread.h>

pthread_t thread1;

void *function(void *arg);  // thread entry point, defined elsewhere

int main(int argc, char **argv) {
  pthread_attr_t attr;
  pthread_attr_init(&attr);

  const size_t stacksize = 1000000;
  int numanode = 1;

  // Allocate the whole stack region on the target node (not just one pointer),
  // considering that the newly created thread will be running on a core on node 1.
  void *stack = numa_alloc_onnode(stacksize, numanode);

  pthread_attr_setstack(&attr, stack, stacksize);
  pthread_create(&thread1, &attr, function, (void *)0);

  ...
  ...
}

Thanks for your help.

Recommended answer

Here's the code I use for this (slightly adapted to remove some constants defined elsewhere). Note that I first create the thread normally and then call SetAffinityAndRelocateStack(), shown below, from within the thread. I think this is much better than trying to create your own stack, since stacks have special support for growing in case the bottom is reached.

The code can also be adapted to operate on the newly created thread from outside, but this could give rise to race conditions (e.g. if the thread performs I/O into its stack), so I wouldn't recommend it.
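Before the full listing, the calling pattern on its own (the thread configures itself as the first thing it does, so nothing else touches its stack in the meantime) can be sketched without any NUMA calls. This is only an illustration under assumptions: PinResult, PinnedWorker and RunPinnedWorker are hypothetical names, and the sketch pins the worker to the first CPU the process is allowed to use instead of a CPU chosen for its NUMA node.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cassert>

// Hypothetical record shared between the creator and the worker.
struct PinResult {
    int requestedCpu = -1;  // CPU the worker is asked to pin itself to
    bool pinned = false;    // true if the worker's mask ends up as exactly that CPU
};

// Thread entry point: the worker sets its own affinity before doing anything else.
static void* PinnedWorker(void* arg) {
    auto* result = static_cast<PinResult*>(arg);
    cpu_set_t wanted;
    CPU_ZERO(&wanted);
    CPU_SET(result->requestedCpu, &wanted);
    if (pthread_setaffinity_np(pthread_self(), sizeof(wanted), &wanted) == 0) {
        // Read the mask back to confirm the change took effect.
        cpu_set_t actual;
        CPU_ZERO(&actual);
        if (pthread_getaffinity_np(pthread_self(), sizeof(actual), &actual) == 0) {
            result->pinned = CPU_COUNT(&actual) == 1 &&
                             CPU_ISSET(result->requestedCpu, &actual);
        }
    }
    // The thread's real work would start here.
    return nullptr;
}

// Creates the thread normally (default attributes), then lets it configure itself.
PinResult RunPinnedWorker() {
    PinResult result;
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (pthread_getaffinity_np(pthread_self(), sizeof(allowed), &allowed) != 0) {
        return result;
    }
    // Pick the first CPU this process may run on (an assumption of this sketch).
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            result.requestedCpu = cpu;
            break;
        }
    }
    if (result.requestedCpu < 0) {
        return result;
    }
    pthread_t thread;
    if (pthread_create(&thread, nullptr, PinnedWorker, &result) != 0) {
        result.requestedCpu = -1;
        return result;
    }
    pthread_join(thread, nullptr);
    return result;
}
```

In the answer's setup, the call to SetAffinityAndRelocateStack() would sit at the same spot as the affinity call in PinnedWorker, i.e. at the very top of the thread function.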

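#include <alloca.h>
#include <numa.h>
#include <numaif.h>
#include <pthread.h>
#include <sched.h>
#include <cassert>
#include <cstring>

// Touch the next 50 pages below the current stack pointer so they are faulted in now.
// The returned pointer is dangling once this function returns; do not dereference it.
void* PreFaultStack()
{
    const size_t NUM_PAGES_TO_PRE_FAULT = 50;
    const size_t size = NUM_PAGES_TO_PRE_FAULT * numa_pagesize();
    void *allocaBase = alloca(size);
    memset(allocaBase, 0, size);
    return allocaBase;
}

// Call from within the thread whose stack should live on cpuNum's NUMA node.
void SetAffinityAndRelocateStack(int cpuNum)
{
    assert(-1 != cpuNum);
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpuNum, &cpuset);
    const int rc = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
    assert(0 == rc);

    // Look up the address range of the current thread's stack.
    pthread_attr_t attr;
    void *stackAddr = nullptr;
    size_t stackSize = 0;
    if ((0 != pthread_getattr_np(pthread_self(), &attr)) || (0 != pthread_attr_getstack(&attr, &stackAddr, &stackSize))) {
        assert(false);
    }
    pthread_attr_destroy(&attr);

    // Bind the stack to the CPU's node and move pages that are already faulted in.
    // mbind() expects the mask length (maxnode) in bits, not bytes.
    const unsigned long nodeMask = 1UL << numa_node_of_cpu(cpuNum);
    const auto bindRc = mbind(stackAddr, stackSize, MPOL_BIND, &nodeMask, sizeof(nodeMask) * 8, MPOL_MF_MOVE | MPOL_MF_STRICT);
    assert(0 == bindRc);

    PreFaultStack();
    // TODO: Also lock the stack with mlock() to guarantee it stays resident in RAM
}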
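To see which address range the mbind() call above is applied to, the stack lookup can be tried in isolation. The following is a minimal sketch, assuming Linux with glibc (pthread_getattr_np is a GNU extension); StackRange, CurrentThreadStack, OnCurrentStack and WorkerLocalIsOnWorkerStack are hypothetical names used only for illustration, and nothing here touches NUMA policy.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the result of the lookup.
struct StackRange {
    void* base = nullptr;  // lowest address of the stack region
    size_t size = 0;       // size of the region in bytes
    bool ok = false;       // false if the lookup failed
};

// Returns the stack region of the calling thread.
StackRange CurrentThreadStack() {
    StackRange range;
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) != 0) {
        return range;
    }
    range.ok = (pthread_attr_getstack(&attr, &range.base, &range.size) == 0);
    pthread_attr_destroy(&attr);  // release what pthread_getattr_np allocated
    return range;
}

// True if p points into the calling thread's stack region.
bool OnCurrentStack(const void* p) {
    const StackRange range = CurrentThreadStack();
    if (!range.ok) {
        return false;
    }
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    const auto low = reinterpret_cast<std::uintptr_t>(range.base);
    return addr >= low && addr < low + range.size;
}

// Thread entry point: checks that one of its own locals lies in the reported range.
static void* CheckInThread(void* out) {
    int local = 0;
    *static_cast<bool*>(out) = OnCurrentStack(&local);
    return nullptr;
}

// Runs the check inside a normally created worker thread.
bool WorkerLocalIsOnWorkerStack() {
    bool result = false;
    pthread_t thread;
    if (pthread_create(&thread, nullptr, CheckInThread, &result) != 0) {
        return false;
    }
    pthread_join(thread, nullptr);
    return result;
}
```

The base and size returned here are the same two values that SetAffinityAndRelocateStack() hands to mbind() as stackAddr and stackSize.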
