akka jvm threads vs os threads when performing io


Question

I've searched the site a bit for help understanding this, but haven't found anything super clear, so I thought I'd post my use case and see if anybody could shed some light.

I have a question about the scaling of jvm threads vs os threads when used in akka for io operations. From the akka site:


Akka supports dispatchers for both event-driven lightweight threads, allowing creation of millions threads on a single workstation, and thread-based Actors, where each dispatcher is bound to a dedicated OS thread.

The event-based Actors currently consume ~600 bytes per Actor which means that you can create more than 6.5 million Actors on 4 G RAM.

In this context, can you all help me understand how that matters on a workstation with only 1 processor (for simplicity). So, for my example use case, I want to take a list of say 1000 'Users' and then go query a database (or several) for various information about each user. So if I were to dispatch each of these 'get' tasks to an actor, and that actor is going to do IO, wouldn't that actor block based on the os thread limit for the workstation?

How does the akka actor model give me lift in a scenario like this? I know that I am probably missing something as I am not wildly knowledgeable on the inner workings of vm threads vs os threads, so if one of the smart folks here could spell it out for me, that would be great.

If I use Futures, don't I need to use await() or get() to block and wait for the reply?

In my use case, regardless of actors, would it end up just 'feeling' like I'm making 1000 sequential database requests?

If code snips are useful in helping me understand this, Java would be preferred as I am still coming up to speed on scala syntax - but a nice clear textual explanation of how these millions of threads can interoperate on a single processor machine while doing database IO would be fine too.

Recommended answer

It is really hard to figure out what you are actually asking here, but here are some pointers:


  • If you are running on a modern JVM, there is typically a one-to-one relationship between Java threads and OS threads. (IIRC, Solaris allows you to do this differently ... but that's the exception.)

  • The amount of real parallelism you will get using threads, or anything built on top of threads, is limited by the number of processors / cores that are available to the application. Beyond that, you will find that not all threads are actually executing at any given instant.

  • If you have 1000 Actors all trying to access the database "at the same time", then most of them will actually be waiting on the database itself, or on the thread scheduler. Whether this amounts to making 1000 sequential requests (i.e. strict serialization) will depend on the database and the queries / updates that the actors are doing.

  • The bottom line is that a computer system has hard limits on the resources available for doing stuff; e.g. number of processors, speed of processors, memory bandwidth, disk access times, network bandwidth, etc. You can design an application to be smart about the way it uses available resources, but you can't get it to use more resources than there actually are.
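
To make the point about 1000 simultaneous requests a bit more concrete, here is a minimal sketch in plain Java (not Akka code). It assumes a hypothetical `fetchUser` method that merely sleeps to stand in for a blocking database call, and the pool size of 20 is an arbitrary choice for illustration. All lookups are submitted first, and the caller blocks only once at the end to collect the results, rather than calling `get()` after each request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BoundedLookups {
    // Hypothetical stand-in for a blocking database call; it only sleeps.
    static String fetchUser(int id) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "user-" + id;
    }

    // Runs `count` lookups on a pool of `poolSize` threads and collects the results in order.
    static List<String> fetchAll(int count, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(CompletableFuture.supplyAsync(() -> fetchUser(id), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join()); // block only here, after everything is submitted
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> users = fetchAll(1000, 20);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(users.size() + " lookups in about " + millis + " ms");
    }
}
```

In this simulation, up to 20 lookups wait at the same time, and a sleeping thread does not need the CPU, so the waits overlap even on a single processor. Whether a real database lets the requests overlap like this is a separate question that depends on the database and the queries, as noted above.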

On reading the text that you quoted, it seems to me that it is talking about two different kinds of actors:

  • Thread-based actors have a 1-to-1 relationship with threads. There's no way you could have millions of this kind of actor in 4 GB of memory.

  • Event-based actors work differently. Instead of holding a thread at all times, they mostly sit in a queue waiting for an event to happen. When one arrives, an event-processing thread grabs the actor from the queue and executes the "action" associated with the event. When the action finishes, the thread moves on to another actor / event pair.

The quoted text is saying that the memory overhead of an event-based actor is ~600 bytes. That figure doesn't include the event thread ... because the event thread is shared by multiple actors.
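
The event-based idea can be sketched in a few lines of plain Java. This is not how Akka is implemented; `TinyActor` and its `demo` method are hypothetical names made up for illustration. Each actor is just a small object with a mailbox, and it borrows a thread from a shared pool only while it has messages to process.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class TinyActor {
    private final Queue<Runnable> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService sharedThreads;

    TinyActor(ExecutorService sharedThreads) {
        this.sharedThreads = sharedThreads;
    }

    // Enqueue a message; borrow a pool thread only if the actor is not already running.
    void tell(Runnable message) {
        mailbox.add(message);
        trySchedule();
    }

    private void trySchedule() {
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            sharedThreads.execute(this::drain);
        }
    }

    // Process queued messages one at a time, then give the thread back.
    private void drain() {
        try {
            Runnable message;
            while ((message = mailbox.poll()) != null) {
                message.run();
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // a message may have arrived after the last poll
        }
    }

    // Creates many actors that all share a few threads; returns how many messages were handled.
    static int demo(int actorCount, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(actorCount);
        for (int i = 0; i < actorCount; i++) {
            TinyActor actor = new TinyActor(pool);
            actor.tell(() -> {
                handled.incrementAndGet();
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("messages handled: " + demo(100_000, 2));
    }
}
```

The number of actors here is bounded by memory, while the number of threads stays at whatever the pool was created with, which is the distinction the quoted text seems to be drawing.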

Now I'm not an expert on Scala / Actors, but it is pretty obvious that there are certain things that you should avoid when using event-based actors. For instance, you should probably avoid talking directly to an external database because that is liable to block the event processing thread.
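
One common way to avoid that, sketched here in plain Java rather than with Akka's own API, is to hand the blocking call to a separate pool and have the result posted back as another event. Everything named below (`OffloadBlocking`, `slowQuery`, the pool sizes) is an assumption for illustration, and the latch merely stands in for database latency so that the ordering is deterministic.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadBlocking {
    // Hypothetical blocking query: waits until the simulated database is "ready".
    static String slowQuery(int userId, CountDownLatch ready) {
        try {
            ready.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-for-" + userId;
    }

    static List<String> demo() {
        ExecutorService eventThread = Executors.newSingleThreadExecutor(); // shared event-processing thread
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);    // reserved for blocking IO
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch databaseReady = new CountDownLatch(1);
        try {
            // Message 1: note the request, run the query elsewhere, handle the reply as a later event.
            CompletableFuture<Void> reply = CompletableFuture
                .runAsync(() -> log.add("query started"), eventThread)
                .thenApplyAsync(ignored -> slowQuery(42, databaseReady), blockingPool)
                .thenAcceptAsync(row -> log.add("reply: " + row), eventThread);

            // Message 2: handled on the event thread while the query is still blocked.
            CompletableFuture<Void> other = CompletableFuture.runAsync(() -> {
                log.add("other message handled");
                databaseReady.countDown(); // only now can the simulated query finish
            }, eventThread);

            CompletableFuture.allOf(reply, other).join();
            return log;
        } finally {
            eventThread.shutdown();
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The log should come out as "query started", "other message handled", "reply: row-for-42". Had the query run directly on the single event thread, the second message could never have been handled while the query was waiting, which is the kind of stall the advice above is warning about.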
