How to calculate overall computation time for a multi-threaded process

Question

I have a set of tasks, let's call it T[], where each task T[i] needs a certain amount of time t(T[i]) to be processed. The tasks are being processed in parallel by X threads (this does not mean that multiple threads co-work on a single task; rather, multiple tasks are processed by multiple threads, each thread doing one task, then the next, etc).

Now I want to calculate the expected overall time it will take to process all tasks. It's easy as long as size(T[]) <= X of course (i.e. the number of tasks is less than or equal to the number of threads), in this case the overall time equals the time of the slowest task.

But I'm quite lost for the case X < size(T[]) (i.e. I have fewer threads than tasks). How would one calculate that in an elegant way?

edit: As asked by a commenter, we can assume the task queue is ordered with the longest-running task first and the shortest-running task last. Also, we can assume there are no pauses between tasks, and we can neglect what the OS scheduler is doing.

Recommended answer

I assume that the tasks are scheduled in the order that they're provided, and that each task goes to the first thread that's free. There's no meaningful non-determinism if these assumptions are correct -- a task may go to any of the threads that are free (if there's more than one), but this has no effect on the total running time.

In that case, we can simulate this using a min-heap of size X (where X is the number of threads), with each value in the heap representing the time at which one of the threads becomes free. For each task, we pop the earliest-free thread off the heap, and then push it back with the time at which it will finish this new task.

After we've scheduled all tasks, we can take the largest value in the heap, which will be the time at which all tasks are completed.

This is relatively little code in Python:

import heapq

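def compute_time(tasks, X):
    # threads[i] is the time at which a thread becomes free; a list of
    # zeros is already a valid min-heap.
    threads = [0] * X
    for t in tasks:
        # Give the task to the earliest-free thread and record its new finish time.
        heapq.heappush(threads, heapq.heappop(threads) + t)
    # The last thread to finish determines the overall time.
    return max(threads)

print(compute_time([3, 2, 1], 2))              # 3
print(compute_time([5, 4, 3, 3, 2, 1, 1], 3))  # 7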

Or in Java:

import java.util.*;

class Threads {
    public static void main(String[] args) {
        int totalTime1 = computeTotalTime(Arrays.asList(3, 2, 1), 2);
        System.out.println("totalTime1: " + totalTime1);

        int totalTime2 = computeTotalTime(Arrays.asList(5, 4, 3, 3, 2, 1, 1), 3);
        System.out.println("totalTime2: " + totalTime2);
    }

    static int computeTotalTime(List<Integer> tasks, int threads) {
        // Min-heap of the times at which each thread becomes free.
        PriorityQueue<Integer> q = new PriorityQueue<>();
        for (int i = 0; i < threads; i++) q.add(0);
        // Each task goes to the earliest-free thread.
        for (int t : tasks) q.add(q.poll() + t);
        // poll() returns values in ascending order, so the last one is the maximum.
        int max = 0;
        while (!q.isEmpty()) max = q.poll();
        return max;
    }
}
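
If you also want to see which thread each task lands on, the same simulation can carry a thread id along with each free time. This is only a sketch under the assumptions above (tasks dispatched in the given order, each to the first free thread). The name `schedule`, the `(thread_id, start, end)` tuples, and the rule that ties go to the lowest thread id are my own illustrative choices, not part of the question:

```python
import heapq

def schedule(tasks, num_threads):
    """Simulate first-free-thread scheduling.

    Returns (total_time, assignment), where assignment[i] is a
    (thread_id, start, end) tuple for tasks[i].
    """
    # Heap entries are (time the thread becomes free, thread id).
    # Ties on time fall back to the lower thread id (an arbitrary choice).
    free_at = [(0, tid) for tid in range(num_threads)]
    heapq.heapify(free_at)

    assignment = []
    for duration in tasks:
        start, tid = heapq.heappop(free_at)  # earliest-free thread
        end = start + duration
        assignment.append((tid, start, end))
        heapq.heappush(free_at, (end, tid))

    # The overall time is when the last thread becomes free.
    total = max(end for end, _ in free_at)
    return total, assignment

total, plan = schedule([5, 4, 3, 3, 2, 1, 1], 3)
print(total)  # 7
for task_index, (tid, start, end) in enumerate(plan):
    print(f"task {task_index}: thread {tid}, {start} -> {end}")
```

The total does not depend on how ties are broken, which matches the point above that the choice among several free threads has no effect on the running time. When there are at least as many threads as tasks, the result reduces to the time of the slowest task.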
