Is it possible to use thread-concurrency and parallelism together?

Question

For one of my projects, which is a kind of content aggregator, I'd like to introduce concurrency and, if possible, parallelism. At first glance this may seem pointless, because concurrency and parallelism take different approaches. (Concurrency via threads is immediate, whereas parallelism is only a potential.)

To better explain my problem, let me summarize it.

Since my project is a content aggregator (it aggregates feeds, podcasts, and similar content), it basically reads data from the web and parses it to return the meaningful parts.

Right now I take a very simple sequential approach. Let's say we have some number of feeds to parse.

foreach(feed in feeds)
{
   read_from_web(feed)
   parse(feed)
}

With the sequential approach, the time taken to fetch and parse all the feeds depends not only on the parser code but also on the time needed to get the XML source from the web. We all know that fetching the source can take a variable amount of time (because of network conditions and similar issues).

To speed up the code, I can use worker threads, which introduce immediate concurrency.

A fixed number of worker threads can each take a feed and parse it concurrently (which will surely speed up the whole process, since waiting for data over the network will have less impact).
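
The worker-thread idea above could be sketched roughly as follows. This is only an illustration in Python (the question does not name a language); `read_from_web` and `parse` are hypothetical stand-ins that fake the download and the parsing, and `run_workers` with its `num_workers` parameter is an assumed name, not part of the original code.

```python
import queue
import threading

def read_from_web(feed):
    """Hypothetical stand-in: a real version would download the feed."""
    return f"<rss><channel><title>{feed}</title></channel></rss>"

def parse(xml_text):
    """Hypothetical stand-in: pull the channel title out of the XML text."""
    start = xml_text.index("<title>") + len("<title>")
    return xml_text[start:xml_text.index("</title>")]

def run_workers(feeds, num_workers=4):
    # Fill the queue before any worker starts, so an empty queue means "done".
    tasks = queue.Queue()
    for feed in feeds:
        tasks.put(feed)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                feed = tasks.get_nowait()
            except queue.Empty:
                return  # no work left, let the thread exit
            parsed = parse(read_from_web(feed))
            with lock:  # protect the shared results dict
                results[feed] = parsed

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

titles = run_workers(["news", "podcasts", "blogs"])
```

Each worker pulls the next feed as soon as it finishes the previous one, so a slow download only stalls one thread rather than the whole loop.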

This is all fine, but the target audience of my project mostly runs multi-core CPUs, because they're gamers.

I also want to use these cores while processing the content, so I started reading about potential parallelism: http://oreilly.com/catalog/0790145310262. I haven't finished reading it yet and don't know whether it already discusses this, but I'm quite obsessed with the question and wanted to ask on Stack Overflow to get an overall idea.

As the book describes potential parallelism: "Potential parallelism means that your program is written so that it runs faster when parallel hardware is available and roughly the same as an equivalent sequential program when it's not."

So the real question is: while I'm using worker threads for concurrency, can I still use potential parallelism? (That is, running my feed parsers on worker threads and still distributing them across CPU cores, if the CPU has multiple cores, of course.)

Answer

I think it's more useful to think about IO-bound work and CPU-bound work; threads can help with both.

For IO-bound work you are presumably waiting for external resources (in your case, feeds to be read). If you must wait on multiple external resources then it only makes sense to wait on them in parallel rather than wait on them one after the other. This is best done by spinning up threads which block on the IO.
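
As a rough illustration of waiting on several resources at once, here is a minimal Python sketch. The `fake_fetch` function is a hypothetical stand-in that sleeps instead of touching the network, and the worker count of 8 is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(feed_url, delay=0.2):
    """Hypothetical stand-in for a blocking download: sleeps to mimic latency."""
    time.sleep(delay)
    return f"<rss><title>{feed_url}</title></rss>"

def fetch_all(feed_urls, max_workers=8):
    # Each thread blocks on its own fetch, so the waits overlap
    # instead of adding up one after the other.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, feed_urls))  # results keep input order

feeds = [f"feed-{i}" for i in range(8)]
start = time.perf_counter()
pages = fetch_all(feeds)
elapsed = time.perf_counter() - start
```

Fetched one after another, these eight simulated downloads would take at least 1.6 seconds; with eight threads the total should be close to the time of a single download.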

For CPU-bound work you want to use all of your cores to maximize the throughput of completing that work. To do that, you should create a pool of worker threads roughly the same size as your number of cores and break up and distribute the work across them. [How you break up and distribute the work is itself an interesting problem.]
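
A pool sized to the core count might look like the Python sketch below. One substitution should be stated plainly: in CPython, threads do not run pure-Python code on several cores at once because of the global interpreter lock, so this sketch defaults to worker processes rather than the worker threads described above. `parse_feed`, `parse_all`, and the batching factor of 4 are assumptions made for illustration.

```python
import concurrent.futures as cf
import os
import xml.etree.ElementTree as ET

def parse_feed(xml_text):
    """Hypothetical stand-in for the parser: return the item titles of an RSS document."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title", default="") for item in root.iter("item")]

def parse_all(xml_texts, executor_cls=cf.ProcessPoolExecutor, workers=None):
    """Parse every document on a pool of roughly one worker per core."""
    workers = workers or os.cpu_count() or 1
    # Hand documents out in batches (about four per worker) to cut
    # per-task overhead while still leaving room to balance uneven work.
    chunksize = max(1, len(xml_texts) // (workers * 4))
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(parse_feed, xml_texts, chunksize=chunksize))
```

With the default process pool, call `parse_all` from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module. Passing `cf.ThreadPoolExecutor` as `executor_cls` keeps the same structure for runtimes or workloads where threads do run in parallel.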

In practice, I find that most applications have both of these problems and it makes sense to use threads to solve both kinds of problems.
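
For the aggregator in the question, the two kinds of work could be combined along these lines. This is a sketch under assumptions, not a definitive design: `fetch` and `parse` are hypothetical stand-ins, `aggregate` and its parameters are invented names, and the CPU pool again defaults to processes because of CPython's global interpreter lock.

```python
import concurrent.futures as cf
import os
import time

def fetch(feed):
    """Hypothetical stand-in for the download: IO-bound, mostly waiting."""
    time.sleep(0.05)
    return f"{feed} alpha beta gamma"

def parse(text):
    """Hypothetical stand-in for the parser: CPU-bound work on the fetched text."""
    return len(text.split())

def aggregate(feeds, io_workers=16, cpu_pool_cls=cf.ProcessPoolExecutor):
    cores = os.cpu_count() or 1
    # Many threads for waiting on the network, about one worker per core for parsing.
    with cf.ThreadPoolExecutor(max_workers=io_workers) as io_pool, \
         cpu_pool_cls(max_workers=cores) as cpu_pool:
        fetching = {io_pool.submit(fetch, feed): feed for feed in feeds}
        parsing = {}
        # Hand each document to the CPU pool as soon as its download finishes.
        for done in cf.as_completed(fetching):
            parsing[fetching[done]] = cpu_pool.submit(parse, done.result())
        return {feed: future.result() for feed, future in parsing.items()}
```

The IO pool can be much larger than the core count because its threads spend most of their time blocked, while the CPU pool stays near the number of cores so that parsers do not compete with each other for them.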
