Malloc segmentation fault


Question

Here is the piece of code in which the segmentation fault occurs (perror is never called):

job = malloc(sizeof(task_t));
if(job == NULL)
    perror("malloc");
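As a side note, perror only prints a message; the snippet above would still go on to use job if malloc ever returned NULL. A minimal sketch of a checked allocator that stops instead is shown here. The name xmalloc is hypothetical and not part of the original code, and this is unrelated to the segfault discussed in this question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the original code): allocate or terminate.
 * size is assumed to be non-zero, since malloc(0) may legitimately
 * return NULL. */
void *xmalloc(size_t size)
{
    assert(size > 0);

    void *p = malloc(size);
    if (p == NULL) {
        perror("malloc");      /* report the failure ...            */
        exit(EXIT_FAILURE);    /* ... and never hand NULL to caller */
    }
    return p;
}
```

With such a helper, the allocation would read job = xmalloc(sizeof(task_t)); and the following lines could use job without a separate check.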

To be more precise, gdb says that the segfault happens inside a __int_malloc call, which is a subroutine called by malloc.

Since malloc is called from several threads in parallel, I initially thought that this could be the problem. I was using version 2.19 of glibc.

The data structures:

typedef struct rv_thread thread_wrapper_t;

typedef struct future
{
  pthread_cond_t wait;
  pthread_mutex_t mutex;
  long completed;
} future_t;

typedef struct task
{
  future_t * f;
  void * data;
  void *
  (*fun)(thread_wrapper_t *, void *);
} task_t;

typedef struct
{
  queue_t * queue;
} pool_worker_t;

typedef struct
{
  task_t * t;
} sfuture_t;

struct rv_thread
{
  pool_worker_t * pool;
};

Now the future implementation:

future_t *
create_future()
{
  future_t * new_f = malloc(sizeof(future_t));
  if(new_f == NULL)
    perror("malloc");
  new_f->completed = 0;
  pthread_mutex_init(&(new_f->mutex), NULL);
  pthread_cond_init(&(new_f->wait), NULL);
  return new_f;
}

int
wait_future(future_t * f)
{
  pthread_mutex_lock(&(f->mutex));
  while (!f->completed)
    {
      pthread_cond_wait(&(f->wait),&(f->mutex));
    }
  pthread_mutex_unlock(&(f->mutex));
  return 0;
}

void
complete(future_t * f)
{
  pthread_mutex_lock(&(f->mutex));
  f->completed = 1;
  pthread_mutex_unlock(&(f->mutex));
  pthread_cond_broadcast(&(f->wait));
}

The thread pool itself:

pool_worker_t *
create_work_pool(int threads)
{
  pool_worker_t * new_p = malloc(sizeof(pool_worker_t));
  if(new_p == NULL)
    perror("malloc");
  threads = 1;
  new_p->queue = create_queue();
  int i;
  for (i = 0; i < threads; i++){
    thread_wrapper_t * w = malloc(sizeof(thread_wrapper_t));
    if(w == NULL)
      perror("malloc");
    w->pool = new_p;
    pthread_t n;
    pthread_create(&n, NULL, work, w);
  }
  return new_p;
}

task_t *
try_get_new_task(thread_wrapper_t * thr)
{
  task_t * t = NULL;
  try_dequeue(thr->pool->queue, t);
  return t;
}

void
submit_job(pool_worker_t * p, task_t * t)
{
  enqueue(p->queue, t);
}

void *
work(void * data)
{
  thread_wrapper_t * thr = (thread_wrapper_t *) data;
  while (1){
    task_t * t = NULL;
    while ((t = (task_t *) try_get_new_task(thr)) == NULL);
    future_t * f = t->f;
    (*(t->fun))(thr,t->data);
    complete(f);
  }
  pthread_exit(NULL);
}

And finally task.c:

pool_worker_t *
create_tpool()
{
  return (create_work_pool(8));
}

sfuture_t *
async(pool_worker_t * p, thread_wrapper_t * thr, void *
(*fun)(thread_wrapper_t *, void *), void * data)
{
  task_t * job = NULL;
  job = malloc(sizeof(task_t));
  if(job == NULL)
    perror("malloc");
  job->data = data;
  job->fun = fun;
  job->f = create_future();
  submit_job(p, job);
  sfuture_t * new_t = malloc(sizeof(sfuture_t));
  if(new_t == NULL)
    perror("malloc");
  new_t->t = job;
  return (new_t);
}

void
mywait(thread_wrapper_t * thr, sfuture_t * sf)
{
  if (sf == NULL)
    return;
  if (thr != NULL)
    {
      while (!sf->t->f->completed)
        {
          task_t * t_n = try_get_new_task(thr);
          if (t_n != NULL)
            {
              future_t * f = t_n->f;
              (*(t_n->fun))(thr,t_n->data);
              complete(f);
            }
        }
      return;
    }
  wait_future(sf->t->f);
  return ;
}

The queue is the lfds lock-free queue.

#define enqueue(q,t) {                                 \
    if(!lfds611_queue_enqueue(q->lq, t))             \
      {                                               \
        lfds611_queue_guaranteed_enqueue(q->lq, t);  \
      }                                               \
  }

#define try_dequeue(q,t) {                            \
    lfds611_queue_dequeue(q->lq, &t);               \
  }

The problem occurs whenever the number of calls to async is very high.

Valgrind output:

Process terminating with default action of signal 11 (SIGSEGV)
==12022==  Bad permissions for mapped region at address 0x5AF9FF8
==12022==    at 0x4C28737: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)

Recommended answer

I've figured out what the problem is: a stack overflow.

First, let me explain why the stack overflow occurs inside malloc (which is probably why you are reading this). While my program ran, the stack grew each time it started executing another task recursively (because of the way I had programmed it). Each of those steps also had to allocate a new task with malloc. malloc makes subroutine calls of its own, so while it runs the stack is deeper than it is after a plain call that executes another task. The stack would therefore have overflowed even without malloc. But because malloc pushes the stack to its deepest point, the overflow happened inside malloc, before the next recursive call could trigger it. The illustration below shows what was happening:

Initial stack state:

-------------------------
| recursive call n - 3  |
-------------------------
| recursive call n - 2  |
-------------------------
| recursive call n - 1  |
-------------------------
|        garbage        |
-------------------------
|        garbage        | <- If the stack passes this point, the stack overflows.
-------------------------

Stack during the malloc call:

-------------------------
| recursive call n - 3  |
-------------------------
| recursive call n - 2  |
-------------------------
| recursive call n - 1  |
-------------------------
|        malloc         |
-------------------------
|     __int_malloc      | <- If the stack passes this point, the stack overflows.
-------------------------

Then the stack shrank again, and my code entered a new recursive call:

-------------------------
| recursive call n - 3  |
-------------------------
| recursive call n - 2  |
-------------------------
| recursive call n - 1  |
-------------------------
| recursive call n      |
-------------------------
|        garbage        | <- If the stack passes this point, the stack overflows.
-------------------------

Then it invoked malloc again inside this new recursive call. This time, however, the stack overflowed:

-------------------------
| recursive call n - 3  |
-------------------------
| recursive call n - 2  |
-------------------------
| recursive call n - 1  |
-------------------------
| recursive call n      |
-------------------------
|        malloc         | <- If the stack passes this point, the stack overflows.
-------------------------
|     __int_malloc      | <- This is when the stack overflow occurs.
-------------------------
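If the recursion depth is bounded but simply larger than the default thread stack allows, one possible mitigation is to give the worker threads a bigger stack. The following is a minimal sketch assuming POSIX threads. The helper names run_with_stack and recurse_entry are made up for illustration and do not appear in the original code. A larger stack only postpones the overflow if the depth keeps growing with the number of tasks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Recursive function that keeps roughly 1 KiB of locals alive per frame.
 * It stands in for a task that recursively runs other tasks. */
static size_t recurse(size_t depth)
{
    volatile char pad[1024];
    pad[0] = 0;
    if (depth == 0)
        return (size_t)pad[0];
    return 1 + recurse(depth - 1);   /* not a tail call: frame stays live */
}

/* Thread entry point: arg points to the requested depth and is
 * overwritten with the depth actually reached. */
void *recurse_entry(void *arg)
{
    size_t *depth = arg;
    *depth = recurse(*depth);
    return arg;
}

/* Hypothetical helper: run fn(arg) on a new thread whose stack is
 * stack_bytes large and wait for it to finish.
 * Returns 0 on success, otherwise a pthread error number. */
int run_with_stack(size_t stack_bytes, void *(*fn)(void *), void *arg,
                   void **result)
{
    assert(fn != NULL);

    pthread_attr_t attr;
    pthread_t tid;

    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    return pthread_join(tid, result);
}
```

In the thread pool above, the same attribute object could be passed to pthread_create in place of the NULL argument.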

[The rest of the answer focuses on why I had this problem in my code in particular.]

Usually, when Fibonacci of some number n is computed recursively, the stack size grows linearly with n. In this case, however, I'm creating tasks, storing them in a queue, and dequeuing a (fib) task for execution. If you draw this on paper, you'll see that the number of tasks grows exponentially with n rather than linearly. (Note also that if I had used a stack to store the tasks as they were created, the number of tasks allocated, as well as the stack size, would only grow linearly with n.) So the stack grows exponentially with n, which leads to a stack overflow. As for why the overflow occurs inside the call to malloc: as explained above, that is where the stack is deepest. The stack was already close to its limit, and because malloc calls further functions internally, the stack grows more during malloc than it does from the calls to mywait and fib alone.
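The queue-versus-stack point can be checked with a small simulation. This is only a sketch under simplifying assumptions: a task is just an integer k, running it spawns tasks k - 1 and k - 2 when k >= 2, and there is a single worker. The function names total_tasks and peak_pending are hypothetical. The simulation counts how many tasks are waiting at the worst moment when they are taken in FIFO order (a queue) and in LIFO order (a stack).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Total number of tasks created by a task-per-call fib(n). */
size_t total_tasks(int n)
{
    return n < 2 ? 1 : 1 + total_tasks(n - 1) + total_tasks(n - 2);
}

/* Simulate running fib(n) as tasks and return the peak number of
 * pending tasks. lifo != 0 takes the newest task first (stack),
 * lifo == 0 takes the oldest task first (queue).
 * Returns 0 if the buffer cannot be allocated. */
size_t peak_pending(int n, int lifo)
{
    assert(n >= 0 && n <= 30);       /* keep the simulation small */

    size_t cap = total_tasks(n);     /* every task is stored at most once */
    int *buf = malloc(cap * sizeof *buf);
    if (buf == NULL)
        return 0;

    size_t head = 0, tail = 0, peak = 0;
    buf[tail++] = n;                 /* the initial task */

    while (head < tail) {
        size_t pending = tail - head;
        if (pending > peak)
            peak = pending;

        int k = lifo ? buf[--tail] : buf[head++];
        if (k >= 2) {                /* fib(k) spawns two subtasks */
            assert(tail + 2 <= cap);
            buf[tail++] = k - 1;
            buf[tail++] = k - 2;
        }
    }

    free(buf);
    return peak;
}
```

For n = 20, peak_pending(20, 0) is above a thousand, while peak_pending(20, 1) stays below n. In the code from the question, each pending task that mywait picks up while another task is still waiting adds frames to the same thread's stack, which is why the order in which tasks are taken matters.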

Thank you all! Without your help I wouldn't have been able to figure it out!
