double free or corruption when running multithreaded


Problem description

I met a runtime error "double free or corruption" in my C++ program that calls a reliable library ANN and uses OpenMP to parallelize a for loop.

*** glibc detected *** /home/tim/test/debug/test: double free or corruption (!prev): 0x0000000002527260 ***     

Does it mean that the memory at address 0x0000000002527260 is freed more than once?
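
For context on what the message means: glibc prints "double free or corruption" and aborts when its allocator detects that a heap block is being freed a second time, or that the heap metadata around the block has been corrupted. As a standalone illustration (not the asker's code), the toy program below triggers the same class of error; the exact text in parentheses, such as "(!prev)", depends on which heap check fires and on the glibc version.

    // Toy example, unrelated to ANN or OpenMP: freeing the same heap block twice
    // makes glibc detect the double free and abort the program.
    #include <cstdlib>

    int main() {
        int *p = static_cast<int *>(std::malloc(sizeof(int)));
        std::free(p);
        std::free(p);   // second free of the same address -> "double free or corruption"
        return 0;
    }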

The error happens at "_search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);" inside function classify_various_k(), which is in turn called inside the OpenMP for loop in function tune_complexity().

Note that the error only happens when OpenMP runs with more than one thread; it does not happen in the single-threaded case. Not sure why.

Following is my code. If it is not enough to diagnose the problem, just let me know. Thanks for your help!

    void KNNClassifier::train(int nb_examples, int dim, double **features, int *labels) {
        _nPts = nb_examples;
        _labels = labels;
        _dataPts = features;

        setting_ANN(_dist_type, 1);

        delete _search_struct;
        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct = new ANNbruteForce(_dataPts, _nPts, dim);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct = new ANNkd_tree(_dataPts, _nPts, dim);
        }
    }

    void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
        ANNpoint     queryPt = 0;
        ANNidxArray  nnIdx   = 0;
        ANNdistArray dists   = 0;

        queryPt = feature;
        nnIdx   = new ANNidx[k_max];
        dists   = new ANNdist[k_max];

        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps); // where error occurs
        }

        for (int j = 0; j < nb_ks; j++)
        {
            scalar_t result = 0.0;
            for (int i = 0; i < ks[j]; i++) {
                result += _labels[ nnIdx[i] ];
            }
            if (result * label < 0) errors[j]++;
        }

        delete [] nnIdx;
        delete [] dists;
    }

    void KNNClassifier::tune_complexity(int nb_examples, int dim, double **features, int *labels, int fold, char *method, int nb_examples_test, double **features_test, int *labels_test) {
        int nb_try = (_k_max - _k_min) / scalar_t(_k_step);
        scalar_t *error_validation = new scalar_t [nb_try];
        int *ks = new int [nb_try];

        for (int i = 0; i < nb_try; i++) {
            ks[i] = _k_min + _k_step * i;
        }

        if (strcmp(method, "ct") == 0)
        {
            train(nb_examples, dim, features, labels); // train once for all nb of nbs in ks

            for (int i = 0; i < nb_try; i++) {
                if (ks[i] > nb_examples) { nb_try = i; break; }
                error_validation[i] = 0;
            }

            int i = 0;
            #pragma omp parallel shared(nb_examples_test, error_validation, features_test, labels_test, nb_try, ks) private(i)
            {
                #pragma omp for schedule(dynamic) nowait
                for (i = 0; i < nb_examples_test; i++)
                {
                    classify_various_k(dim, features_test[i], labels_test[i], ks, error_validation, nb_try, ks[nb_try - 1]); // where error occurs
                }
            }
            for (i = 0; i < nb_try; i++)
            {
                error_validation[i] /= nb_examples_test;
            }
        }

        ......
    }



UPDATE:

Thanks! I am now trying to fix the conflict where several threads write to the same memory in classify_various_k() by using "#pragma omp critical":

    void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
        ANNpoint     queryPt = 0;
        ANNidxArray  nnIdx   = 0;
        ANNdistArray dists   = 0;

        queryPt = feature; //for (int i = 0; i < Vignette::size; i++){ queryPt[i] = vignette->content[i];}
        nnIdx   = new ANNidx[k_max];
        dists   = new ANNdist[k_max];

        if (strcmp(_search_neighbors, "brutal") == 0) { // search
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        }

        for (int j = 0; j < nb_ks; j++)
        {
            scalar_t result = 0.0;
            for (int i = 0; i < ks[j]; i++) {
                result += _labels[ nnIdx[i] ]; // Program received signal SIGSEGV, Segmentation fault
            }
            if (result * label < 0)
            {
                #pragma omp critical
                {
                    errors[j]++;
                }
            }
        }

        delete [] nnIdx;
        delete [] dists;
    }

However, there is now a new segmentation fault at "result += _labels[ nnIdx[i] ];". Any ideas? Thanks!

Recommended answer

Okay, since you've stated that it works correctly in the single-threaded case, "normal" methods won't work. You need to do the following:

  • find all variables that are accessed in parallel
  • especially take a look at those that are modified
  • don't call delete on a shared resource
  • take a look at all library functions that operate on shared resources - check whether they do any allocation/deallocation (a minimal sketch of these points follows below)
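
A minimal, generic sketch of this checklist (standalone illustration code, not the KNNClassifier/ANN code from the question; compile with -fopenmp): read-only inputs are shared, every scratch buffer is owned by exactly one thread, each iteration writes to its own output slot, and nothing shared is allocated or deleted inside the parallel region.

    // Generic pattern: shared data is only read, mutable data is thread-private
    // or written through a per-iteration slot, and no shared object is deleted.
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1000;
        std::vector<double> input(n, 1.0);   // shared, read-only: safe for all threads
        std::vector<double> output(n, 0.0);  // shared, but index i is written by one thread only

        #pragma omp parallel
        {
            // Scratch space: one instance per thread, created and destroyed by that
            // thread alone, so it can never be freed twice.
            std::vector<double> scratch(16, 0.0);

            #pragma omp for schedule(dynamic)
            for (int i = 0; i < n; ++i) {
                scratch[0] = 2.0 * input[i];  // work happens on thread-private memory
                output[i]  = scratch[0];      // distinct slot per iteration: no race
            }
        } // each thread's scratch is destroyed here exactly once

        std::printf("output[0] = %f\n", output[0]);
        return 0;
    }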

This is the list of candidates that are double deleted:

shared(nb_examples_test, error_validation,features_test, labels_test, nb_try, ks)

Also, this code might not be thread safe:

    for (int i = 0; i < ks[j]; i++) {
        result += _labels[ nnIdx[i] ];
    }
    if (result * label < 0) errors[j]++;

Because two or more threads may try to write to the errors array at the same time.
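
One way to make that update safe, sketched below under the assumption that errors is a plain array with one slot per k value (as in the question), is to let each thread accumulate into its own local tally and merge the tallies once at the end; this is usually cheaper than taking a critical section or an atomic on every increment. The names nb_ks and errors only mirror the question; the code is a standalone illustration.

    // Standalone sketch: many threads accumulating into one shared counter array.
    // Compile with -fopenmp.
    #include <cstdio>
    #include <vector>

    int main() {
        const int nb_ks = 8;
        const int nb_examples_test = 10000;
        std::vector<double> errors(nb_ks, 0.0);      // shared result array

        #pragma omp parallel
        {
            std::vector<double> local(nb_ks, 0.0);   // per-thread tally: no races here

            #pragma omp for schedule(dynamic) nowait
            for (int i = 0; i < nb_examples_test; ++i) {
                int j = i % nb_ks;                   // stand-in for "this k misclassified the example"
                local[j] += 1.0;                     // unsynchronized write to thread-private data
            }

            #pragma omp critical                     // each thread merges its tally exactly once
            for (int j = 0; j < nb_ks; ++j)
                errors[j] += local[j];
        }

        // Alternative: keep the shared update in place but guard it, e.g.
        //   #pragma omp atomic
        //   errors[j] += 1.0;

        std::printf("errors[0] = %f\n", errors[0]);
        return 0;
    }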

And one big piece of advice -- while in threaded code, try not to access (and especially not to modify!) anything that is not a parameter of the function!
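
As a sketch of that advice (illustrative names only, not the question's actual classifier API): give the worker everything it needs as parameters, let it write only through a caller-owned slot that is unique per iteration, and keep every temporary local to the function.

    // Standalone sketch: the worker touches only its arguments and locals, so the
    // parallel loop shares nothing mutable between threads. Compile with -fopenmp.
    #include <vector>

    // Hypothetical worker: reads shared data through a const reference and writes
    // only through its own output parameter.
    static void classify_one(const std::vector<double> &feature, double &out_error) {
        double score = 0.0;
        for (double x : feature) score += x;    // purely local computation
        out_error = (score < 0.0) ? 1.0 : 0.0;  // the only write goes through the parameter
    }

    void run_all(const std::vector<std::vector<double>> &features_test,
                 std::vector<double> &per_example_error) {
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < static_cast<int>(features_test.size()); ++i)
            classify_one(features_test[i], per_example_error[i]);   // distinct slot per i
    }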
