How the index is affected when multiple threads run in a loop

Problem description

I was trying to write a program that runs 5 threads and prints each thread's index.

Here is the code:

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

int nthreads=5;

void *busy(void* c) {

    int my_busy = *(int *) c;

    printf("Hello World with thread index %d\n", my_busy);

    return NULL;
}    

int main()
{
    pthread_t t1[nthreads];
    void* result;

    for(int i=0; i<nthreads; i++)
    {
        pthread_create(&t1[i], NULL, busy, &i);
    }

    for(int i=0; i<nthreads; i++)
    {
        pthread_join(t1[i], &result);
    }
    return 0;
}

Output obtained:

Hello World with thread index 1
Hello World with thread index 4
Hello World with thread index 2
Hello World with thread index 0
Hello World with thread index 0

Though all 5 threads run why the corresponding indexes are not properly outputted? Why do I tend to lose some indexes and get others twice? For example in this case I have lost 3 and got 0 outputted twice. Though using pthread_join along with pthread_create in one loop solves the problem, it doesn't schedule all threads to run in parallel. What should be done in this case to get all indexes printed?

Recommended answer

Though all 5 threads run why the corresponding indexes are not properly outputted?

You pass a pointer to a variable to each thread, and modify that variable at the same time the thread functions access it. Why would you expect the thread functions to see any particular value? They run concurrently. It is possible for the threads to see a completely impossible, garbled value, if one thread is reading the value while another is modifying it, on some architectures.

For example in this case I have lost 3 and got 0 outputted twice.

Although the machine code generated by e.g. GCC increments the variable accessed by the thread functions after creating each thread, the value observed by the thread functions may be "old" on some architectures, because no barriers or synchronization are used.

Whether this occurs on your particular machine (without explicit barriers or synchronization) depends on which memory ordering model your machine implements.

For example, on x86-64 (aka AMD64; the 64-bit Intel/AMD architecture), all reads and writes are observed in order, except that stores may be ordered after loads. This means that if initially say i = 0;, and thread A does i = 1;, thread B can still see i == 0 even after thread A modified the variable.

Note that adding barriers (e.g. _mm_mfence() using the x86/AMD64 intrinsics provided by <immintrin.h> with most C compilers) is not enough to ensure each thread sees a unique value, because the start of each thread can be delayed with respect to the real-world moment when pthread_create() was called. All they ensure is that at most one thread sees the zero value. Two threads may see value 1, three value 2, and so on; it is even possible for all threads to see value 5.

What should be done in this case to get all indexes printed?

The simplest option is to provide the index to be printed as a value, rather than as a pointer to a variable. In busy(), use

my_busy = (int)(intptr_t)c;

and in main(), use

pthread_create(&t1[i], NULL, busy, (void *)(intptr_t)i);

The intptr_t type is a signed integer type capable of holding a pointer, and is defined in <stdint.h> (usually included by including <inttypes.h> instead).

(Since the question is tagged linux, I probably should point out that in Linux, on all architectures, you can use long instead of intptr_t, and unsigned long instead of uintptr_t. There are no trap representations in either long or unsigned long, and every possible long/unsigned long value can be converted to a unique void *, and vice versa; a round-trip is guaranteed to work correctly. The kernel syscall interface requires that, so it is extremely unlikely to change in the future either.)
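
For reference, the two fragments above can be put together roughly as follows. This is only a sketch: the run_by_value() helper and the seen[] bookkeeping array are hypothetical names added here so the result can be checked, and NTHREADS replaces the question's nthreads variable so the arrays have a constant size.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 5

/* Hypothetical bookkeeping: seen[k] counts how many threads received index k. */
int seen[NTHREADS];

static void *busy(void *c)
{
    /* The index travels inside the pointer value itself; no variable is shared. */
    int my_busy = (int)(intptr_t)c;

    assert(my_busy >= 0 && my_busy < NTHREADS);
    printf("Hello World with thread index %d\n", my_busy);
    seen[my_busy]++;    /* each thread touches only its own element */
    return NULL;
}

/* Starts NTHREADS threads, joins them, and returns 0 or the pthread_create() error code. */
int run_by_value(void)
{
    pthread_t t1[NTHREADS];
    int created = 0, rc = 0;

    for (int k = 0; k < NTHREADS; k++)
        seen[k] = 0;

    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&t1[i], NULL, busy, (void *)(intptr_t)i);
        if (rc != 0)
            break;
        created++;
    }

    /* Join only the threads that were actually created. */
    for (int i = 0; i < created; i++)
        pthread_join(t1[i], NULL);

    return rc;
}
```

Calling run_by_value() from main() should print each index from 0 to 4 once, in whatever order the scheduler happens to pick.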

If you need to pass the pointer to i, but wish each thread to see a unique value, you need to use some sort of synchronization.

The simplest synchronized approach would be to use a semaphore. You could make it global, but using a structure to describe the work parameters, and passing the pointer of the structure (even if the same one is used for all worker threads) is more robust:

#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>
#include <stdio.h>

#define  NTHREADS  5

struct work {
    int     i;
    sem_t   s;
};

void *worker(void *data)
{
    struct work *const  w = data;
    int                 i;

    /* Obtain a copy of the value. */
    i = w->i;

    /* Let others know we have copied the value. */
    sem_post(&w->s);

    /* Do the work. */
    printf("i == %d\n", i);
    fflush(stdout);

    return NULL;
}    

int main()
{
    pthread_t    thread[NTHREADS];
    struct work  w;
    int          rc, i;

    /* Initialize the semaphore. */
    sem_init(&w.s, 0, 0);

    /* Create the threads. */
    for (i = 0; i < NTHREADS; i++) {

        /* Create the thread. */
        w.i = i;
        rc = pthread_create(&thread[i], NULL, worker, &w);
        if (rc) {
            fprintf(stderr, "Failed to create thread %d: %s.\n", i, strerror(rc));
            exit(EXIT_FAILURE);
        }

        /* Wait for the thread function to grab its copy. */
        sem_wait(&w.s);
    }

    /* Reap the threads. */
    for (i = 0; i < NTHREADS; i++) {
        pthread_join(thread[i], NULL);
    }

    /* Done. */
    return EXIT_SUCCESS;
}

Because the main thread (the thread that modifies the value seen by each worker thread) participates in the synchronization, each worker function reads the value before the next thread is created, so every worker gets a unique i. The output will usually be in increasing order of i as well, although that is not strictly guaranteed, since each worker prints only after it has posted the semaphore.
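
If a pointer has to be passed but the handshake is not wanted, a commonly used alternative (not part of the answer above) is to give every thread its own argument slot, so that no two threads ever look at the same variable. A minimal sketch, assuming the hypothetical names struct slot, slot_worker() and run_with_slots():

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 5

/* Hypothetical per-thread argument record, one per worker. */
struct slot {
    int index;  /* written by the main thread before the worker starts */
    int seen;   /* written by the worker, read by main after pthread_join() */
};

static void *slot_worker(void *data)
{
    struct slot *const s = data;

    assert(s->index >= 0 && s->index < NTHREADS);
    s->seen = s->index;
    printf("i == %d\n", s->index);
    return NULL;
}

/* Fills one slot per thread, starts the threads, joins them, returns 0 or an error code. */
int run_with_slots(struct slot slots[NTHREADS])
{
    pthread_t thread[NTHREADS];
    int created = 0, rc = 0;

    for (int i = 0; i < NTHREADS; i++) {
        slots[i].index = i;
        slots[i].seen = -1;
        /* The slot is never modified again by main, so the worker reads a stable value. */
        rc = pthread_create(&thread[i], NULL, slot_worker, &slots[i]);
        if (rc != 0)
            break;
        created++;
    }

    for (int i = 0; i < created; i++)
        pthread_join(thread[i], NULL);

    return rc;
}
```

The slots array has to stay alive until every thread has been joined, which is why the caller owns it here. No semaphore is needed because pthread_create() and pthread_join() already order the accesses to each slot.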

A much better approach is to create a work pool, where the main thread defines the work to be done collectively by the threads, and the thread functions simply obtain the next chunk of work to be done, in whatever order:

#define  _POSIX_C_SOURCE  200809L
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <limits.h>
#include <stdio.h>
#include <errno.h>

#define  NTHREADS  5
#define  LOOPS     3

struct work {
    pthread_mutex_t  lock;
    int              i;
};

void *worker(void *data)
{
    struct work *const  w = data;
    int                 n, i;

    for (n = 0; n < LOOPS; n++) {

        /* Grab next piece of work. */
        pthread_mutex_lock(&w->lock);
        i = w->i;
        w->i++;
        pthread_mutex_unlock(&w->lock);

        /* Display the work */
        printf("i == %d, n == %d\n", i, n);
        fflush(stdout);
    }

    return NULL;
}

int main(void)
{
    pthread_attr_t  attrs;
    pthread_t       thread[NTHREADS];
    struct work     w;
    int             i, rc;

    /* Create the work set. */
    pthread_mutex_init(&w.lock, NULL);
    w.i = 0;

    /* Thread workers don't need a lot of stack. */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 2 * PTHREAD_STACK_MIN);

    /* Create the threads. */
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_create(thread + i, &attrs, worker, &w);
        if (rc != 0) {
            fprintf(stderr, "Error creating thread %d of %d: %s.\n", i + 1, NTHREADS, strerror(rc));
            exit(EXIT_FAILURE);
        }
    }

    /* The thread attribute set is no longer needed. */
    pthread_attr_destroy(&attrs);

    /* Reap the threads. */
    for (i = 0; i < NTHREADS; i++) {
        pthread_join(thread[i], NULL);
    }

    /* All done. */
    return EXIT_SUCCESS;
}

If you compile and run this last example, you'll notice that the output may be in odd order, but each i is unique, and each n = 0 through n = LOOPS-1 occurs exactly NTHREADS times.
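
When the shared state is just a counter, as it is here, the mutex can be swapped for a C11 atomic. The sketch below is a variation on the example above, not the answer's own code; it assumes a compiler that provides <stdatomic.h>, and the names struct atomic_work, atomic_worker() and run_atomic_pool() are made up for illustration. The claimed[] array exists only so the uniqueness claim can be checked.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 5
#define LOOPS    3

/* Hypothetical work set: a shared counter plus a tally used for checking. */
struct atomic_work {
    atomic_int next;                        /* next unclaimed work index */
    atomic_int claimed[NTHREADS * LOOPS];   /* times each index was handed out */
};

static void *atomic_worker(void *data)
{
    struct atomic_work *const w = data;

    for (int n = 0; n < LOOPS; n++) {
        /* atomic_fetch_add() returns the old value and increments it in one step,
           so no two threads can obtain the same index. */
        int i = atomic_fetch_add(&w->next, 1);

        assert(i >= 0 && i < NTHREADS * LOOPS);
        atomic_fetch_add(&w->claimed[i], 1);
        printf("i == %d, n == %d\n", i, n);
    }
    return NULL;
}

/* Resets the work set, runs NTHREADS workers, joins them, returns 0 or an error code. */
int run_atomic_pool(struct atomic_work *w)
{
    pthread_t thread[NTHREADS];
    int created = 0, rc = 0;

    atomic_init(&w->next, 0);
    for (int k = 0; k < NTHREADS * LOOPS; k++)
        atomic_init(&w->claimed[k], 0);

    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&thread[i], NULL, atomic_worker, w);
        if (rc != 0)
            break;
        created++;
    }

    for (int i = 0; i < created; i++)
        pthread_join(thread[i], NULL);

    return rc;
}
```

A mutex remains the better fit once grabbing a piece of work involves more than a single integer, for example taking an item off a queue.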
