TBB for a workload that keeps changing?

Question

I'm sorry, but I don't seem to get Intel's TBB. It looks great and well supported, but I can't wrap my head around how to use it. I guess I'm not used to thinking of parallelism in terms of tasks; I have always seen it in terms of threads.

My current workload has a job that sends work to a queue to keep processing (think of a recursion, but instead of calling itself it sends work to a queue). The way I got this working in Java was to create a concurrent (non-blocking) queue and a ThreadPoolExecutor that worked the queue and sent work back to it. Now I'm trying to do something similar in C++. I found that TBB can create pools, but its approach is very different: Java threads seem to just keep working as long as there is work in the queue, whereas TBB seems to break the task down at the beginning.

Here's a simple Java example of what I do (before this I set how many threads I want, etc.):

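import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Callable;

static class DoWork implements Callable<Void> {
    // queue with contexts to process
    private final Queue<Context> contexts;

    DoWork(Context request) {
        contexts = new ArrayDeque<>();
        contexts.add(request);
    }

    @Override
    public Void call() {
        while (!contexts.isEmpty()) {
            // take the next context off the queue
            Context current = contexts.poll();
            // do work on current
            // if it needs to be sent back to the queue to do more work:
            //     contexts.add(new Context(data));
        }
        return null; // Callable<Void> must return a value
    }
}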

I'm sure it's possible to do this in TBB, but I'm just not sure how, because it seems to break up my work at the time I send it. So if there are 2 items in the queue it may only launch 2 threads, and it won't grow as more work comes in (even if I have 8 cores).

Can someone help me understand how to achieve this, and maybe also suggest a better way to think about TBB coming from Java's threading environment? I have no allegiance to TBB, so if there's something easier or better I'm happy to learn it. I just don't like the C++ threadpool library because it doesn't seem to be actively developed.

Recommended answer

The approach based on a queue of items for parallel processing, where each thread just pops one item from the queue and proceeds (and possibly adds a new item to the end of the queue at some point), is fundamentally flawed because it limits the parallelism of the application. The queue becomes a single point of synchronization, and threads have to wait to get access to the next item to process. In practice this approach works when the tasks (each item's processing job) are quite large and take different amounts of time to complete, so that the queue is less contended. It works poorly when most of the threads finish at the same time and all come to the queue for their next items.
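To make the contention point concrete, here is a minimal sketch of the shared-queue pattern in standard C++ (no TBB). The names `SharedQueue` and `run_shared_queue`, and the toy workload in which an item with value n > 0 produces one follow-up item n - 1, are assumptions made up for illustration; they are not from the question.

```cpp
#include <atomic>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Every push and pop goes through this one mutex. With many short tasks,
// the threads spend their time waiting here: the single synchronization point.
class SharedQueue {
public:
    void push(int item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(item);
    }
    bool try_pop(int& item) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        item = items_.front();
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<int> items_;
};

// Hypothetical workload: an item n > 0 sends one follow-up item n - 1 back
// to the queue. Returns the total number of items processed.
int run_shared_queue(const std::vector<int>& initial, unsigned num_threads) {
    SharedQueue queue;
    std::atomic<int> pending{0};    // items that are queued or being processed
    std::atomic<int> processed{0};
    for (int item : initial) {
        queue.push(item);
        ++pending;
    }

    auto worker = [&] {
        while (pending.load() > 0) {
            int item;
            if (!queue.try_pop(item)) {
                std::this_thread::yield();  // queue empty right now, retry
                continue;
            }
            ++processed;
            if (item > 0) {
                ++pending;                  // count the child before releasing the parent
                queue.push(item - 1);
            }
            --pending;
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < num_threads; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return processed.load();
}
```

Calling `run_shared_queue({3, 2}, 4)` processes the chains 3, 2, 1, 0 and 2, 1, 0. Every one of those steps takes the same lock twice (once to pop, once to push), which is the cost the answer is pointing at.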

If you're writing a somewhat reusable piece of code, you cannot guarantee that tasks are large enough or that they vary in size (time to execute).

I assume that your application scales, meaning that you start with a significant number of items in your queue (much larger than the number of threads), and that while the threads process them they add enough new tasks to the end that there is enough work for everyone until the application finishes.

If that's the case, I would suggest that you keep two thread-safe vectors of items (TBB's concurrent_vector, for instance) and alternate between them. You start with one vector (your initial set of items) and enqueue() a task (I think it's described somewhere in chapter 12 of the TBB reference manual) that executes a parallel_for over that initial vector. While the first batch is being processed, you push_back the new items onto the second concurrent_vector. When the first batch is done, you enqueue() a task with a parallel_for over the second vector and start pushing new items back into the first one. You can try to overlap the parallel processing of items better by using three vectors instead of two and gradually moving between them while there is still enough work to keep all threads busy.
