Max value of a semaphore?


Question

For example, there is a loop of 1,000 iterations. What is the maximum semaphore value that makes it fast and efficient without leading to a deadlock?

let group = DispatchGroup()
let queue = DispatchQueue(label: "com.num.loop", attributes: .concurrent)
let semaphore = DispatchSemaphore(value: 4)

for i in 1...1000 {
    semaphore.wait()          // allow no more than 4 in flight
    group.enter()
    queue.async(group: group) {
        doWork(i)
        group.leave()
        semaphore.signal()
    }
}

group.notify(queue: DispatchQueue.main) {
    // go on...
}

Answer

A few observations:

You never want to exceed the maximum number of GCD worker threads per quality-of-service (QoS) class. If you exceed this limit, you may experience blocking within your app. The last time I checked, this limit was 64 threads.

That having been said, there’s generally little benefit in exceeding the number of cores on your device.
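If you do want to size the concurrency to the hardware yourself, Foundation exposes the active core count directly:

```swift
import Foundation

// Number of cores currently available to the process. A reasonable
// starting point if you size a semaphore or queue by hand.
let cores = ProcessInfo.processInfo.activeProcessorCount
print(cores)
```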

Often, we would let GCD figure out the maximum number of concurrent threads for us using concurrentPerform, which is automatically optimized for the device. It also eliminates the need for any semaphores or groups, often leading to less cluttered code:

DispatchQueue.global().async {
    DispatchQueue.concurrentPerform(iterations: 1000) { i in
        doWork(i)
    }

    DispatchQueue.main.async {
        // go on...
    }
}

The concurrentPerform will run the 1,000 iterations in parallel, while limiting the number of concurrent threads to a level appropriate for your device, eliminating the need for the semaphore. But concurrentPerform is itself synchronous, not returning until all iterations are done, which eliminates the need for the dispatch group. So dispatch the whole concurrentPerform to some background queue, and when it is done, just perform your "completion code" (or, in your case, dispatch that code back to the main queue).

While I’ve argued for concurrentPerform above, that only works if doWork performs its task synchronously (e.g. some compute operation). If it initiates something that is, itself, asynchronous, then we have to fall back to this semaphore/group technique. (Or, perhaps better, use asynchronous Operation subclasses with a queue that has a reasonable maxConcurrentOperationCount, or Combine's flatMap(maxPublishers:_:) with a reasonable limit on the publisher count.)
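A minimal sketch of the OperationQueue alternative, assuming the work is synchronous (truly asynchronous work would require an Operation subclass that manages its own isExecuting/isFinished state); the doWork(_:) here is a stand-in for the question's:

```swift
import Foundation

// Stand-in for the question's doWork(_:), counting completions thread-safely.
let counterLock = NSLock()
var completed = 0
func doWork(_ i: Int) {
    counterLock.lock()
    completed += 1
    counterLock.unlock()
}

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 4   // at most 4 operations in flight

for i in 1...1000 {
    queue.addOperation { doWork(i) }
}

// Blocks the current (non-main) thread until all 1,000 operations finish.
queue.waitUntilAllOperationsAreFinished()
```

The queue enforces the concurrency cap itself, so no semaphore or dispatch group is needed.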

Regarding a reasonable threshold value in this case, there's no magic number. You just have to perform some empirical tests to find a reasonable balance between the number of cores and whatever else might be going on within your app. For example, for network requests we often use 4 or 6 as a maximum count, considering not only the diminished benefit of exceeding that count, but also the impact on our server if thousands of users happened to submit too many concurrent requests at the same time.

In terms of "making it fast", the choice of "how many iterations should be allowed to run concurrently" is only part of the decision-making process. The more critical issue quickly becomes ensuring that doWork does enough work to justify the modest overhead introduced by the concurrent pattern.

For example, if processing an image that is 1,000×1,000 pixels, you could perform 1,000,000 iterations, each processing one pixel. But if you do that, you might find that it is actually slower than your non-concurrent rendition. Instead, you might have 1,000 iterations, each iteration processing 1,000 pixels. Or you might have 100 iterations, each processing 10,000 pixels. This technique, called "striding", often requires a little empirical research to find the right balance between how many iterations one will perform and how much work is done on each. (And, by the way, often this striding pattern can also prevent cache sloshing, a scenario that can arise if multiple threads contend for adjacent memory addresses.)
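A small sketch of the striding idea, with summing integers standing in for per-pixel work: instead of 1,000,000 tiny iterations, run 100 iterations that each process a chunk of 10,000 values.

```swift
import Dispatch

let total = 1_000_000
let chunkSize = 10_000
let chunks = total / chunkSize   // 100 concurrent iterations

var partials = [Int](repeating: 0, count: chunks)
partials.withUnsafeMutableBufferPointer { buffer in
    DispatchQueue.concurrentPerform(iterations: chunks) { i in
        var sum = 0
        for n in (i * chunkSize) ..< ((i + 1) * chunkSize) {
            sum += n                // the "work" for this chunk
        }
        buffer[i] = sum             // each iteration writes its own slot; no lock needed
    }
}
let grandTotal = partials.reduce(0, +)
```

Tuning chunkSize up or down is exactly the empirical balancing act described above.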

Related to the prior point, we often want these various threads to synchronize their access to shared resources (to keep it thread-safe). That synchronization can introduce contention between these threads. So you will want to think about how and when you do this synchronization.

For example, rather than having multiple synchronizations within doWork, you might have each iteration update a local variable (where no synchronization is needed) and perform the synchronized update to the shared resource only when the local calculations are done. It is hard to answer this question in the abstract, as it will depend largely upon what doWork is doing, but it can easily impact the overall performance.
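A sketch of that "accumulate locally, synchronize once" pattern; NSLock is just one option here (a serial queue or os_unfair_lock would serve equally well):

```swift
import Foundation

let lock = NSLock()
var sharedTotal = 0

DispatchQueue.concurrentPerform(iterations: 100) { i in
    // The bulk of the work touches only a local variable: no locking needed.
    var localSum = 0
    for n in (i * 1_000) ..< ((i + 1) * 1_000) {
        localSum += n
    }

    // One brief synchronized update per iteration, not one per inner step.
    lock.lock()
    sharedTotal += localSum
    lock.unlock()
}
```

Locking once per iteration instead of once per inner step keeps contention between the worker threads to a minimum.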
