Multithreaded file I/O program behaves unpredictably when the number of threads is increased


Question

I am trying to create a 1 MB (1,048,576-byte) file by writing in various chunk sizes with a different number of threads. When int NUM_THREADS = 2 or int NUM_THREADS = 1, the created file has the expected size, i.e. 1 MB.

However, when I increase the thread count to 4, the created file is around 400 MB. Why this anomaly?

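#include <pthread.h>
#include <algorithm>  // fill_n
#include <cstdio>     // fopen, fprintf, fflush, fclose
#include <cstdlib>    // exit
#include <string>
#include <iostream>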
#define TenGBtoByte 1048576
#define fileToWrite "/tmp/schatterjee.txt"

using namespace std;
pthread_mutex_t mutexsum;
struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};

void *SPWork(void *threadarg) {
    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");
    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush (f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}

int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 2;
    for (int BLOCKSIZE: blocksize) {
        char *data = new char[BLOCKSIZE];
        fill_n(data, BLOCKSIZE, 'x');

        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;

        /* Initialize and set thread detached attribute */
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
}

N.B.: 1) It's a benchmarking task, so I am doing what the requirements ask. 2) long noOfWrites = (TenGBtoByte / (threadcount * chunkSize)); computes how many times each thread should write to reach the combined target size (1,048,576 bytes). 4) I tried placing the mutex lock at various positions. All yield the same result.

Suggestions for other changes to the program are also welcome.

Recommended answer

You are allocating and initializing your data array like this:

char *data = new char[BLOCKSIZE];
fill_n(data, BLOCKSIZE, 'x');

Then you are writing it to the file using fprintf:

fprintf(f, "%s", data);

With the %s format, fprintf expects data to be a null-terminated string, but the buffer is filled entirely with 'x' and has no terminator, so fprintf reads past the end of the array. This is already undefined behavior. If it worked with a low number of threads, that is because the memory right after the buffer happened to contain a zero byte.
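One way to avoid the problem is to pass the length explicitly instead of relying on a terminator. The sketch below assumes fwrite is an acceptable replacement for fprintf here; writeChunk and writeChunksAndMeasure are hypothetical helper names, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Writes exactly `size` bytes from `data`; no null terminator is required.
// Returns true only if every byte was written and the flush succeeded.
static bool writeChunk(std::FILE *f, const char *data, std::size_t size) {
    return std::fwrite(data, 1, size, f) == size && std::fflush(f) == 0;
}

// Hypothetical helper for illustration: writes `count` chunks of 'x' to a
// fresh file and returns the number of bytes written, or -1 on error.
static long writeChunksAndMeasure(const char *path, std::size_t chunkSize, int count) {
    assert(chunkSize > 0 && count >= 0);
    std::vector<char> data(chunkSize, 'x');  // no '\0' anywhere, as in the question
    std::FILE *f = std::fopen(path, "w");
    if (!f) return -1;
    for (int i = 0; i < count; ++i) {
        if (!writeChunk(f, data.data(), chunkSize)) {
            std::fclose(f);
            return -1;
        }
    }
    long size = std::ftell(f);  // position after the last flushed write
    std::fclose(f);
    return size;
}
```

Alternatives would be to allocate BLOCKSIZE + 1 bytes and store '\0' in the last one, or to bound the read with a precision, as in fprintf(f, "%.*s", BLOCKSIZE, data).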

Other than that, the mutex in your program serves no purpose and can be removed. The internal locking of each FILE stream is also redundant, because every thread uses a separate FILE object, so you can use fwrite_unlocked and fflush_unlocked (nonstandard extensions available in glibc) to write your data. Essentially, all synchronization in your program is performed in the kernel, not in userspace.

With the write fixed, your program reliably creates 1 MB files regardless of the number of threads, even after removing the mutex and switching to the _unlocked functions. So the invalid file write seems to be the only issue you have.
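To make that concrete, here is a minimal sketch of the writer with the mutex removed and the write done through fwrite with an explicit length. The names WriterTask, appendChunks, and createFile are hypothetical. The file is deleted first because append mode would otherwise grow an existing file, and plain fwrite/fflush are used so the sketch does not depend on the nonstandard _unlocked variants.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical names for illustration; not from the original program.
struct WriterTask {
    const char *path;       // file that every thread appends to
    const char *data;       // chunk buffer, not null-terminated
    std::size_t chunkSize;  // bytes per write
    long writes;            // number of chunks each thread writes
};

static void *appendChunks(void *arg) {
    const WriterTask *task = static_cast<const WriterTask *>(arg);
    std::FILE *f = std::fopen(task->path, "a");  // each thread owns its FILE
    if (!f) return nullptr;
    for (long i = 0; i < task->writes; ++i) {
        std::fwrite(task->data, 1, task->chunkSize, f);  // explicit length
        std::fflush(f);
    }
    std::fclose(f);
    return nullptr;
}

// Recreates `path` using `numThreads` writers and returns the resulting
// file size in bytes, or -1 on error.
static long createFile(const char *path, long totalBytes, int numThreads,
                       std::size_t chunkSize) {
    assert(totalBytes >= 0 && numThreads > 0 && chunkSize > 0);
    std::remove(path);  // start from an empty file
    std::vector<char> data(chunkSize, 'x');
    long perRound = static_cast<long>(numThreads) * static_cast<long>(chunkSize);
    WriterTask task{path, data.data(), chunkSize, totalBytes / perRound};

    std::vector<pthread_t> threads(numThreads);
    int started = 0;
    for (; started < numThreads; ++started) {
        if (pthread_create(&threads[started], nullptr, appendChunks, &task) != 0) break;
    }
    // Join whatever was started so `data` outlives every writer.
    for (int i = 0; i < started; ++i) pthread_join(threads[i], nullptr);
    if (started != numThreads) return -1;

    std::FILE *f = std::fopen(path, "rb");
    if (!f) return -1;
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fclose(f);
    return size;
}
```

Under these assumptions, createFile("/tmp/out.txt", 1048576, 4, 1024) should report 1048576 bytes. Note that the per-thread write count uses integer division, so 3 threads with 1024-byte chunks give 341 writes each and a file of 1047552 bytes, 1024 short of the target.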
