Use cases for ithreads (interpreter threads) in Perl and rationale for using or not using them?


Question

If you want to learn how to use Perl interpreter threads, there's good documentation in perlthrtut (threads tutorial) and the threads pragma manpage. It's definitely good enough to write some simple scripts.

However, I have found little guidance on the web on why and what to sensibly use Perl's interpreter threads for. In fact, there's not much talk about them, and if people talk about them it's quite often to discourage people from using them.

These threads, available when perl -V:useithreads reports useithreads='define'; and unleashed by use threads, are also called ithreads, and maybe more appropriately so: they are very different from threads as offered by the Linux or Windows operating systems or the Java VM, in that nothing is shared by default. Instead a lot of data is copied, not just the thread stack, which significantly increases the process size. (To see the effect, load some modules in a test script, then create threads in a loop pausing for key presses each time around, and watch memory rise in task manager or top.)

[...] every time you start a thread all data structures are copied to the new thread. And when I say all, I mean all. This e.g. includes package stashes, global variables, lexicals in scope. Everything!

-- Things you need to know before programming Perl ithreads (Perlmonks 2003)
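The "nothing is shared by default" behaviour can be seen in a few lines. This is a minimal sketch, assuming a perl built with ithreads; the function name demo_copy_semantics and the %settings hash are made up for illustration.

```perl
use strict;
use warnings;
use Config;

# ithreads only exist if perl was compiled with them
# (the same information that perl -V:useithreads prints).
BEGIN {
    die "This perl was built without ithreads\n"
        unless $Config{useithreads};
}
use threads;

# Returns (value main sees after the thread ran, value the thread ended with).
sub demo_copy_semantics {
    my %settings = (mode => 'original');    # ordinary, unshared lexical

    my $thr = threads->create(sub {
        # This modifies the thread's private clone only.
        $settings{mode} = 'changed in thread';
        return $settings{mode};
    });
    my $seen_in_thread = $thr->join;

    return ($settings{mode}, $seen_in_thread);
}

my ($in_main, $in_thread) = demo_copy_semantics();
print "main sees:  $in_main\n";
print "thread saw: $in_thread\n";
```

The thread's assignment never reaches the main thread, because the hash was cloned when the thread was created.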

When researching the subject of Perl ithreads, you'll see people discouraging you from using them ("extremely bad idea", "fundamentally flawed", or "never use ithreads for anything").

The Perl thread tutorial highlights that "Perl Threads Are Different", but it doesn't much bother to explain how they are different and what that means for the user.

A useful but very brief explanation of what ithreads really are is from the Coro manpage under the heading WINDOWS PROCESS EMULATION. The author of that module (Coro - the only real threads in perl) also discourages using Perl interpreter threads.

Somewhere I read that compiling perl with threads enabled will result in a significantly slower interpreter.

There's a Perlmonks page from 2003 (Things you need to know before programming Perl ithreads), in which the author asks: "Now you may wonder why Perl ithreads didn't use fork()? Wouldn't that have made a lot more sense?" This seems to have been written by the author of the forks pragma. Not sure the info given on that page still holds true in 2012 for newer Perls.

Here are some guidelines for usage of threads in Perl I have distilled from my readings (maybe erroneously so):

So far my research. Now, thanks for any more light you can shed on this issue of threads in Perl. What are some sensible use cases for ithreads in Perl? What is the rationale for using or not using them?

Solution

The short answer is that they're quite heavy (you can't launch 100+ of them cheaply), and they exhibit unexpected behaviours (somewhat mitigated by recent CPAN modules).

You can safely use Perl ithreads by treating them as independent Actors.

  1. Create a Thread::Queue::Any for "work".
  2. Launch multiple ithreads, each with its own "result" Queue, passing them the "work" Queue and their own "result" Queue by closure.
  3. Load (require) all the remaining code your application requires (not before threads!)
  4. Add work for the threads into the Queue as required.

In "worker" ithreads:

  1. Bring in any common code (for any kind of job)
  2. Blocking-dequeue a piece of work from the Queue
  3. Demand-load any other dependencies required for this piece of work.
  4. Do the work.
  5. Pass the result back to the main thread via the "result" queue.
  6. Back to 2.
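The two lists above can be sketched roughly as follows. This is an illustration under assumptions, not the answer's own code: it uses the core Thread::Queue module in place of Thread::Queue::Any, a single shared "result" queue instead of one per worker, and a hypothetical job type (summing numbers). The end method assumes a reasonably recent Thread::Queue (3.01 or later).

```perl
use strict;
use warnings;
use threads;          # load threads first, while little else is loaded
use Thread::Queue;    # core module, standing in for Thread::Queue::Any

# Runs the jobs on $n_workers threads and returns { job id => sum }.
sub run_jobs {
    my ($n_workers, @jobs) = @_;

    my $work    = Thread::Queue->new;
    my $results = Thread::Queue->new;

    # Start the workers early; they receive the queues via the closure.
    my @workers = map {
        threads->create(sub {
            # Blocking dequeue: returns undef once the queue has been
            # ended and is empty, which terminates the loop.
            while (defined(my $job = $work->dequeue)) {
                # Demand-load what this kind of job needs
                # (a no-op if the module is already loaded).
                require List::Util;
                $results->enqueue({
                    id  => $job->{id},
                    sum => List::Util::sum(0, @{ $job->{numbers} }),
                });
            }
        });
    } 1 .. $n_workers;

    # A real program would load its remaining heavy modules here,
    # after the workers exist, so they are not cloned into them.

    $work->enqueue(@jobs);
    $work->end;                # no more work will arrive
    $_->join for @workers;

    my %sum_by_id;
    while (defined(my $r = $results->dequeue_nb)) {
        $sum_by_id{ $r->{id} } = $r->{sum};
    }
    return \%sum_by_id;
}

my $sums = run_jobs(3,
    { id => 'a', numbers => [1, 2, 3] },
    { id => 'b', numbers => [10, 20] },
);
print "$_ => $sums->{$_}\n" for sort keys %$sums;
```

The workers never touch each other's data; everything they exchange goes through the queues, which is what makes the Actor-style usage safe.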

If some "worker" threads start to get a little beefy, and you need to limit "worker" threads to some number then launch new ones in their place, then create a "launcher" thread first, whose job it is to launch "worker" threads and hook them up to the main thread.

What are the main problems with Perl ithreads?

They're a little inconvenient for "shared" data, as you need to do the sharing explicitly (not a big issue).
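As a small sketch of what explicit sharing looks like with the core threads::shared module (the function and variable names here are illustrative only):

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Returns (final value of the shared counter, final value of the unshared one).
sub count_in_threads {
    my ($n_threads, $increments) = @_;

    my $shared_total :shared = 0;    # one variable, visible to every thread
    my $private_total        = 0;    # each thread gets its own clone

    my @threads = map {
        threads->create(sub {
            for (1 .. $increments) {
                lock($shared_total);    # released at the end of this block
                $shared_total++;
                $private_total++;       # changes the thread's clone only
            }
        });
    } 1 .. $n_threads;
    $_->join for @threads;

    return ($shared_total, $private_total);
}

my ($shared, $private) = count_in_threads(4, 1000);
print "shared: $shared, private: $private\n";
```

Only the variable marked :shared accumulates the increments from all threads; the unshared one is still 0 in the main thread.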

You need to look out for the behaviour of objects with DESTROY methods as they go out of scope in some thread (if they're still required in another!)
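A sketch of the DESTROY pitfall, assuming a hypothetical Resource class that stands for something like a temporary file or a connection. Because the object is cloned into the thread, its destructor can run there while the main thread still holds its own copy.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @destroyed_in :shared;    # ids of the threads in which DESTROY ran

package Resource {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub DESTROY { push @destroyed_in, threads->tid }
}

# Returns (was main's object still defined after the thread ran?,
#          array ref of thread ids in which DESTROY was called).
sub demo_destroy {
    @destroyed_in = ();
    my $res = Resource->new(name => 'tempfile');    # hypothetical resource

    threads->create(sub {
        undef $res;    # drops the thread's clone: DESTROY runs in the thread
    })->join;

    my $still_alive = defined $res ? 1 : 0;    # main's object is untouched
    undef $res;                                # DESTROY runs again, in main
    return ($still_alive, [@destroyed_in]);
}

my ($alive, $tids) = demo_destroy();
print "main object still alive after thread: $alive\n";
print "DESTROY ran in threads: @$tids\n";
```

If DESTROY released a real external resource, the first call (in the worker thread) would pull it out from under the main thread. perlmod documents CLONE_SKIP as one way for a class to opt out of being cloned.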

The big one: Data/variables that aren't explicitly shared are CLONED into new threads. This is a performance hit and probably not at all what you intended. The workaround is to launch ithreads from a pretty much "pristine" condition (not many modules loaded).
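One way to observe this, as a rough sketch: a thread is a snapshot of the interpreter at the moment it is created, so a module loaded afterwards is not part of it. Text::Wrap is used here only as a stand-in for a heavy module, and the helper name is made up.

```perl
use strict;
use warnings;
use threads;

# Returns 1 if the given %INC key is present in a freshly created thread,
# i.e. the module had already been loaded when the thread was cloned.
sub loaded_in_new_thread {
    my ($inc_key) = @_;
    return threads->create(sub { exists $INC{$inc_key} ? 1 : 0 })->join;
}

my $early = loaded_in_new_thread('Text/Wrap.pm');    # cloned before loading
require Text::Wrap;                                   # stand-in for heavy code
my $late  = loaded_in_new_thread('Text/Wrap.pm');    # cloned after loading

print "thread created before loading sees module: $early\n";
print "thread created after loading sees module:  $late\n";
```

Threads started early carry less baggage, which is why the recipe above launches the workers before loading the rest of the application.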

IIRC, there are modules in the Thread:: namespace that help with making dependencies explicit and/or cleaning up cloned data for new threads.

Also, IIRC, there's a slightly different model using ithreads called "Apartment" threads, implemented by Thread::Apartment, which has a different usage pattern and another set of trade-offs.

The upshot:

Don't use them unless you know what you're doing :-)

Fork may be more efficient on Unix, but the IPC story is much simpler for ithreads. (This may have been mitigated by CPAN modules since I last looked :-)

They're still better than Python's threads.

There might, one day, be something much better in Perl 6.
