Finding Minimum Completion Time of Scheduled Tasks with Topological Sort


Problem Description

  


Assume that there are an unlimited number of workers, each of whom can complete one task, and each task takes some amount of time. There are also precedence constraints where one task cannot be completed until another is. What is the minimum amount of time needed for all of the tasks to be completed while respecting the precedence order?


As an example, let's say you have 3 tasks.

The first task takes 10 units of time to complete, the second takes 5 units of time to complete, and the third takes 6 units of time to complete.

The constraint is that the 2nd task cannot be started until the 3rd task is finished.


Given this, you can conclude that the minimum time needed for all the tasks to be completed while respecting the precedence is 11.

This is because you can do tasks 1 and 3 (which take times 10 and 6 respectively) simultaneously, and after task 3 is finished, you can begin on task 2 (which takes 5 units of time). Thus, all tasks will end at time 11.


It seems like this could be solved with a topological sort, which would give you the order in which to complete tasks in order to satisfy the constraints.

For example, in the problem, after topologically sorting the tasks you would end up with

Task 1 (10 Units of time) -> Task 3 (6 Units of time) -> Task 2 (5 Units of time)


However, once you obtain this order in which to go about the tasks, how do you determine which ones can be done simultaneously and which ones need to be done one after the other? Further, how do you compute the minimum time needed for all of them to finish?

Solution

I am assuming that the tasks do not have cyclic dependencies. As such, the tasks and their dependencies can be represented as a directed acyclic graph (abbreviated as dag from this point), where tasks are vertices, and an edge u -> v means that task u must be completed before task v can begin.
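For concreteness, here is a minimal sketch (my own, not part of the original answer) of how the three-task example from the question could be encoded as such a dag in Python, using a plain dictionary as an adjacency list; the names duration and graph are just illustrative choices:

# Tasks are vertices; an edge u -> v means task u must finish before task v can start.
# Encoding the question's example: task 3 must finish before task 2.
duration = {1: 10, 2: 5, 3: 6}   # time each task takes to complete
graph = {
    1: [],    # task 1 has no dependents
    2: [],    # task 2 has no dependents
    3: [2],   # edge 3 -> 2
}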

Your method is right in using topological sort to compute the order in which the tasks can be done, since the graph is a dag. I see that there are 2 main questions that you have.

  1. Computing the tasks which can possibly be done simultaneously
  2. Computing the minimum amount of time to complete all the tasks

Number 2 is simpler, so I shall address it first. Finding the minimum time to complete all the tasks is not difficult once you have computed the topological order. It is essentially finding the longest path in the dag; you may have heard of its application in the Critical Path Method. Essentially, the minimum amount of time needed to complete all the tasks is the time needed to complete the chain of tasks with the longest total duration. This may seem counterintuitive at first glance, but the idea is that the eventual completion of all tasks depends on how long we need to finish every chain of tasks, and hence on the time needed to complete the chain that takes the longest.

Here is pseudocode for the algorithm (I think it should be correct, but since I haven't done this in a while, I may be wrong, so do verify it yourself):

min_time_for_tasks(Graph, topological order):
    distance <- integer array whose size is number of vertices
    set all entries of the distance array to negative infinity
    maxDist <- 0
    for v in topological order:
        if distance[v] == negative infinity:
            maxDist <- max(maxDist, longest_path(Graph, distance, v))
    return maxDist

// computes the longest path from vertex v
longest_path(Graph, distance, v):
    maxDist <- 0
    for all edges (v,w) outgoing from v:
        // vertex w has not been visited
        if distance[w] == negative infinity:
            longest_path(Graph, distance, w)
        // get the longest path among all vertices reachable from v
        maxDist <- max(maxDist, distance[w])
    distance[v] <- maxDist + time to complete task v
    return distance[v]
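To make this concrete, here is a rough, runnable Python version of the same idea. It is my own sketch based on the pseudocode above (using the illustrative graph/duration encoding from earlier), so treat it as a starting point rather than a definitive implementation:

NEG_INF = float("-inf")

# Longest chain of work starting at vertex v, including v's own duration.
def longest_path(graph, duration, distance, v):
    max_dist = 0
    for w in graph[v]:
        if distance[w] == NEG_INF:      # vertex w has not been visited yet
            longest_path(graph, duration, distance, w)
        max_dist = max(max_dist, distance[w])
    distance[v] = max_dist + duration[v]
    return distance[v]

# Minimum time to finish all tasks = longest path (by total duration) in the dag.
def min_time_for_tasks(graph, duration, topo_order):
    distance = {v: NEG_INF for v in graph}
    max_dist = 0
    for v in topo_order:
        if distance[v] == NEG_INF:
            max_dist = max(max_dist, longest_path(graph, duration, distance, v))
    return max_dist

# The three-task example: tasks 1 and 3 can run in parallel, task 2 must wait for 3.
duration = {1: 10, 2: 5, 3: 6}
graph = {1: [], 2: [], 3: [2]}
print(min_time_for_tasks(graph, duration, [1, 3, 2]))   # prints 11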

As for Number 1, computing the tasks which can be done simultaneously, this is slightly more tricky. I haven't actually done this before, but I do think it can be done using a topological sort algorithm called Kahn's algorithm (it's the first algorithm for Topological Sort on Wikipedia, link is here). I don't know about you, but I normally perform topological sort using a depth-first search and a stack, and then pop vertices from the stack to obtain the order. Essentially, Kahn's algorithm is a topological sort algorithm, and with some slight modifications it should be able to solve our first problem. I haven't done this before, so please verify this again. Pseudocode with explanation in the comments:

// This is a modified kahn's algorithm.
// Again, the graph is assumed to be a dag, and we first compute
// the number of incoming edges to every vertex, since Kahn's
// algorithm is dependent on this information.
// This function will return an array of sets of tasks which
// can be done at the same time.
// So all tasks in the set in returnArray[0] can be done at the same
// time, all tasks in the set in returnArray[1] can be done at the
// same time, etc. Explanation will be available after the
// pseudocode
compute_simultaneous_tasks(Graph):
    into <- integer array, all entries initialized to 0
    // this function is defined below
    compute_incoming_edges(Graph, into)
    returnArray <- empty array
    VSet <- Set of all vertices in the graph
    while VSet is not empty:
        // retrieve all vertices with no incoming edges
        // this means their prerequisite tasks have been completed,
        // and we can execute that particular task
        headVertices <- all vertices `v` in VSet such that into[v] is 0

        // remove all the vertices in headVertices from VSet
        VSet <- VSet.remove(headVertices)

        // we are going to remove the vertices in headVertices,
        // so the edges outgoing from them will be removed as well.
        for each vertex v in headVertices:
            for each vertex w outgoing from v:
                into[w] <- into[w] - 1

        // all the vertices in headVertices can be started at the
        // same time, since their prerequisite tasks have been
        // completed
        returnArray.append(headVertices)

    return returnArray

// computes the number of edges leading into each vertex
// so into[x] gives the number of edges leading into vertex x
// from some other vertex
compute_incoming_edges(Graph, into):
    for each vertex v in Graph:
        for each vertex w outgoing from v:
            into[w] <- into[w] + 1

So the compute_simultaneous_tasks function will return an array of sets. Each set contains vertices/tasks which can be done at the same time. To get the main idea of why this is the case, consider that inside the main loop of compute_simultaneous_tasks we extract all vertices which have no incoming edges. This is equivalent to extracting all tasks with no prerequisites. Hence, we can see that it is safe to execute this set of tasks simultaneously, since they have no prereqs, and are certainly not dependent on each other! Then we remove their outgoing edges, to simulate their completion. Then we move on to the next iteration of the loop, and so on.

So we see that at each iteration of the while loop we are doing what we just described, and hence it is safe to execute all tasks in returnArray[0] at the same time, all tasks in returnArray[1] at the same time but only after all tasks in returnArray[0] are done, all tasks in returnArray[2] after those in returnArray[1] are done, and so on.

So in this modification of Kahn's algorithm, instead of removing one vertex with 0 incoming edges at each step, we remove all vertices with 0 incoming edges at each step, in order to obtain a "wave" of tasks which can all be done simultaneously, with returnArray[0] being wave 0, returnArray[1] being wave 1, etc. Take note that we only remove the outgoing edges of a wave's vertices after we have extracted all the vertices with 0 incoming edges.
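As another rough, unverified sketch (mine, not from the original answer), the wave-based modification of Kahn's algorithm described above could be written in Python roughly as follows, reusing the same illustrative graph encoding:

# into[x] = number of edges leading into vertex x from some other vertex
def compute_incoming_edges(graph):
    into = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            into[w] += 1
    return into

# Returns a list of "waves"; every task in one wave can be started together,
# once all tasks in the earlier waves have completed (graph must be a dag).
def compute_simultaneous_tasks(graph):
    into = compute_incoming_edges(graph)
    remaining = set(graph)
    waves = []
    while remaining:
        # All tasks whose prerequisite tasks have already been completed.
        head_vertices = {v for v in remaining if into[v] == 0}
        remaining -= head_vertices
        # Removing these vertices also removes their outgoing edges.
        for v in head_vertices:
            for w in graph[v]:
                into[w] -= 1
        waves.append(head_vertices)
    return waves

graph = {1: [], 2: [], 3: [2]}
print(compute_simultaneous_tasks(graph))   # prints [{1, 3}, {2}]

Running it on the three-task example gives wave 0 = {1, 3} and wave 1 = {2}, which matches the reasoning in the question: tasks 1 and 3 start together, and task 2 starts once task 3 has finished.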

While I am not able to give a formal proof, I do believe that if one were to complete all the tasks in each wave simultaneously, one would be able to complete all the tasks in the minimum time we computed above, due to how the dependencies are taken into account.

Hope that helps, and Happy New Year =)
