Multithreading for a function


Question


I have a function in C (built with VS2010 Professional) that reads one data file (out of 1000 data files in a directory), does some analysis, writes some data files to another directory, and then continues reading and writing until the end. The function is myfunc(int start_index,int end_index,char *read_file_directory). To improve the run time, I create a user-selected number of threads with _beginthread, each of which calls myfunc. For example, with 5 threads I do something like:

thread 1: myfunc(1,200,read_directory)
thread 2: myfunc(201,400,read_directory)
thread 3: myfunc(401,600,read_directory)
thread 4: myfunc(601,800,read_directory)
thread 5: myfunc(801,1000,read_directory)

myfunc works very well over all 1000 files when run on its own, and Visual Leak Detector 2.3 reports no memory leaks. As soon as I use multithreading, however, my program gives an access violation:

"Unhandled exception at 0x00fa8c15 in NR_test_5_KES.exe: 0xC0000005: Access violation writing location 0x0102c7b8."

I am using CLAPACK (32-bit release version) and fftw3f. The only common file that myfunc has to open, at the beginning, is the one read by ReadTestPara:

WaitForSingleObject(ghMutex, INFINITE);
Param=ReadTestPara(base_path);
ReleaseMutex(ghMutex);

so I serialized the call to ReadTestPara with the mutex shown above.

myfunc is the main function called in Threadfunc. Please let me know if there is anything wrong with how I create the threads and divide the work among them. One thing comes to mind: in myfunc I allocate many arrays (around 100 MB in total), and another function called inside myfunc allocates roughly another 100 MB (many arrays and FFTW plans) and frees it again. So I thought I would need more memory if I wanted to use 4 to 7 threads (my CPU is a Core i7), and I changed the project settings to enable large addresses (all external libraries and VS2010 are 32-bit, but my Windows is 64-bit).

This is the main part of my code:

 #include <stdlib.h>
 #include <stdio.h>
 #include <math.h>
 #include <time.h>
 #include <string.h>
 #include <process.h>
 #define WIN32_LEAN_AND_MEAN
 #include <windows.h>
 #include "nrutil.h"
 #include "nr.h"
 #include "fftw3.h"
 #include "complex.h"
 #include "f2c.h"
 #include "clapack.h"
 //#include <vld.h>

HANDLE ghMutex; 
typedef float elem_type ;
#define ELEM_SWAP(a,b) { register elem_type t=(a);(a)=(b);(b)=t; }
elem_type kth_smallest(elem_type a[], int n, int k)
{
    register i,j,l,m ;
    register elem_type x ;

    l=0 ; m=n-1 ;
    while (l<m) {
        x=a[k] ;
        i=l ;
        j=m ;
        do {
            while (a[i]<x) i++ ;
            while (x<a[j]) j-- ;
            if (i<=j) {
                ELEM_SWAP(a[i],a[j]) ;
                i++ ; j-- ;
            }
        } while (i<=j) ;
        if (j<k) l=i ;
        if (k<i) m=j ;
    }
    return a[k] ;
}
#define median(a,n) kth_smallest(a,n,(((n)&1)?((n)/2): (((n)/2)-1)))

struct shift{
	float shift1;
	float shift2;
};

struct Parameters {
float W0;
float W1;
float W2;
float W3;
int pixel;           
int Alines;
int BScans;        
struct shift Sh;
int shiftdown;
int depth;
int z1;
int z2;
int edge_cut;
int BDfilter;
int adj;
int w;
};

struct Threadinputs{
int start_idx;
int end_idx;
int num_th;
char base_path[300];
};

void Threadfunc(void *pParam);
int main();
//--------------------------------
struct Parameters ReadTestPara(char *data_dir);
void median2(float image[], int N_m, int M_m, int Median);
#define NRANSI
void read_surface(int Alines, int edge_cut,char *filename,int *surface1);
void PSOCT_RSurf(float *img,float *new_img,int *surface,int B_net,int depth,int z1, int z2);
void interpolation_1(int pixel, float KES[],float K[],float W0,float W1,float W2,float W3);
void interp1(int pixel, int lines2,float K[],float KES[],float **specr,float **specl);
void PSOCT_CUM(char *filename,int *surface1,int pixel, int Alines, int BDfilter, int edge_cut, int depth, int shiftdown, float K[],float KES[],fftwf_complex *JQWPI,fftwf_complex phaseshift,fftwf_complex *V11,fftwf_complex *V12,fftwf_complex *V21,fftwf_complex *V22,fftwf_complex *DD11,fftwf_complex *DD22,float *Intensity);
void spline(float x[], float y[], int n, float yp1, float ypn, float y2[]);
float *vector(long nl, long nh);
void free_vector(float *v, long nl, long nh);
void nrerror(char error_text[]);
void read_data(int pixel,int lines,int motion_cut,char *filename,float **specr,float **specl);
void splint(float xa[], float ya[], float y2a[], int n, float x, float *y);
void myfunc(int start_idx,int end_idx,int th_num,char *base_path);

int main()
{
 
	  int  start_idx,end_idx;
	  HANDLE *h;
	  struct Threadinputs in[20];
	  int i,tn,step,lstep;
	  char base_path[300];

	  ghMutex = CreateMutex( NULL,FALSE,NULL); 
	  printf_s("Please enter the base directory (put \\ at the end): \n\n");
	  scanf_s("%s", base_path,(unsigned)_countof(base_path));
	  printf_s("\nPlease enter the start index bigger than 3: ");
	  scanf_s("%d", &start_idx);
	  printf_s("\nPlease enter the end index: ");
	  scanf_s("%d", &end_idx);
         printf_s("\nPlease select the number of threads: ");
	  scanf_s("%d", &tn);

	  step=(end_idx-start_idx+1)/(tn);
         lstep=(end_idx-start_idx+1)-(tn-1)*step;
	  printf_s("\nTHREAD\tnum\tstart\tend");
	  //_____________________________________________________________________________
	
         h=(HANDLE *)malloc(sizeof(HANDLE)*tn);
	  if (h == NULL){printf_s("HANDLE: No Memory!!! ");}
	   //================================================98

	
	
	
   for(i=0;i<tn-1;i++)
   {
	strcpy_s(in[i].base_path,300,base_path);
	in[i].start_idx=i*step+start_idx;
	in[i].end_idx=in[i].start_idx+step-1;
	in[i].num_th=i;
	h[i]=(HANDLE)_beginthread(Threadfunc,0,&in[i]);
   }
	strcpy_s(in[i].base_path,300,base_path);
	in[i].start_idx=in[i-1].start_idx+step;
	in[i].end_idx=in[i-1].end_idx+lstep;
	in[i].num_th=i;
	h[i]=(HANDLE)_beginthread(Threadfunc,0,&in[i]);

	 WaitForMultipleObjects( 
      tn,           // number of objects in array
      h,     // array of objects
      TRUE,       // wait for all objects
      INFINITE); 
	  free(h);
	  getchar();
      return 0;
}

//-------------------------------
void Threadfunc(void *pParam)
{
	struct Threadinputs *pin = (struct Threadinputs *)pParam;
	printf_s("\nTHREAD\t%d\t%d\t%d",(*pin).num_th,(*pin).start_idx,(*pin).end_idx);
	myfunc((*pin).start_idx,(*pin).end_idx,(*pin).num_th,(*pin).base_path);
}
//------------------------------
void myfunc(int start_idx,int end_idx,int th_num,char *base_path)
{     errno_t err;

	  char path_surf[300];
	  char file_path[300];
	  char cum_path [300];
	  int index,frame,adj;
	  int B_net,num_avg;
	  int Ls;
	  int n,n1,n2,j,k,z1,z2;
	  double d;
//------------------------------------------	  
	float W0,W1,W2,W3;
	int BDfilter;
	int pixel;
	int Alines;
	int edge_cut;
	int depth,L2;
	fftwf_complex  JQWPI[4];
	fftwf_complex  phaseshift;
	struct Parameters Param;
	float PI=3.14159265;
	int i,L;
	int shiftdown;

	FILE *fid1,*fid2;
	int *surface1;
	float *KES;
	float *K;
	float *Intensity;
	float *shift_vec;
	float *s1,*s2,*s3,*s4;
	float *eig_retar_S,*eig_axis_S,*eig_dia_S; 
	float *real_A_P1,*img_A_P2,*real_A_P2,*img_A_P1;
	float *Inten,*mn,*I,*I_S,*MN,*h_part,*g_part,*hh,*gg;
	float *mask,*eig_retar,*eig_axis, *eig_dia,*delt,*delt2,*eig_axis2;
	fftwf_complex  *V11;
	fftwf_complex  *V12;
	fftwf_complex  *V21;
	fftwf_complex  *V22;
	fftwf_complex  *DD11;
	fftwf_complex  *DD22;
	fftwf_complex* angle_shift;
	fftwf_complex *ii,*jj,*jj2,*ii2,*II2,*II,*JJ,*JJ2;
	//-----------------------------------------------------------

	 WaitForSingleObject(ghMutex, INFINITE);
	 Param=ReadTestPara(base_path);
        ReleaseMutex(ghMutex);

	//-----------------------------------------------------------
}

Recommended answer

There is nothing wrong with creating multiple threads, but a few points have to be considered carefully:
- If you use any allocated memory, be sure not to free it before the last thread that uses it has finished executing.
- If several threads access the same objects concurrently, protect them with a mutex (or another synchronization object).

In your case, I think you should have paid more attention to compiler warnings. The C function you use to create threads, _beginthread(), requires a __cdecl function to call; if you compiled your code with __stdcall as the default calling convention, the stack will be cleaned up twice when the thread ends, corrupting memory.

It is also not a very good idea to allocate so much variable memory on the stack...
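
One way to act on the last point is to keep large working buffers on the heap and to check every allocation, since a thread started with _beginthread and a stack size of 0 normally gets only the default stack (typically 1 MB with the MSVC linker defaults). The following is a minimal sketch under stated assumptions, not the asker's actual code: struct workspace, workspace_init and workspace_free are hypothetical names, and the two buffers merely stand in for the many arrays that myfunc allocates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-thread scratch space; the field names are illustrative. */
struct workspace {
    float *intensity;
    float *mask;
    size_t count;      /* number of floats in each buffer */
};

/* Allocates zero-initialized buffers on the heap.
   Returns 0 on success, -1 if any allocation fails (nothing is leaked). */
int workspace_init(struct workspace *ws, size_t count)
{
    assert(ws != NULL);
    ws->count = 0;
    ws->intensity = (float *)calloc(count, sizeof(float));
    ws->mask = (float *)calloc(count, sizeof(float));
    if (ws->intensity == NULL || ws->mask == NULL) {
        /* free(NULL) is a no-op, so partial failure is handled uniformly */
        free(ws->intensity);
        free(ws->mask);
        ws->intensity = NULL;
        ws->mask = NULL;
        return -1;
    }
    ws->count = count;
    return 0;
}

/* Releases the buffers and resets the struct so a double free is harmless. */
void workspace_free(struct workspace *ws)
{
    assert(ws != NULL);
    free(ws->intensity);
    free(ws->mask);
    ws->intensity = NULL;
    ws->mask = NULL;
    ws->count = 0;
}
```

Each thread would own one such workspace for its whole lifetime, calling workspace_init once at the start of its work and workspace_free once at the end, so that no buffer is shared between threads or released while another thread still uses it.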


Fix the off-by-one errors in your code. If you have tn threads, then initialize them all, including the last one. I can see that one loop uses < tn - 1. This is very suspicious.

Remove all code that is not necessary to reproduce the problem. Your question contains far too much material, such as useless declarations. You won't get many answers if you dump too much code, as nobody will make the effort to read it, even less so when it is poorly formatted.

