Tips for keeping Perl memory usage low

Question

What are some good tips for keeping memory usage low in a Perl script? I am interested in learning how to keep my memory footprint as low as possible for systems depending on Perl programs. I know Perl isn't great when it comes to memory usage, but I'd like to know if there are any tips for improving it.

So, what can you do to make a Perl script use less memory? I'm interested in any suggestions, whether they are actual tips for writing code, or tips for how to compile Perl differently.

Edit for Bounty: I have a Perl program that serves as a server for a network application. Each client that connects to it currently gets its own child process. I've used threads instead of forks as well, but I haven't been able to determine whether using threads instead of forks is actually more memory efficient.

I'd like to try using threads instead of forks again. I believe in theory it should save on memory usage. I have a few questions in that regard:

  1. Do threads created in Perl prevent copying Perl module libraries into memory for each thread?
  2. Is threads (use threads) the most efficient (or the only) way to create threads in Perl?
  3. In threads, I can specify a stack_size parameter. What specifically should I consider when specifying this value, and how does it impact memory usage?

With threads in Perl/Linux, what is the most reliable method to determine the actual memory usage on a per-thread basis?

Answer

What sort of problem are you running into, and what does "large" mean to you? I have friends who need to load 200 GB files into memory, so their idea of good tips is a lot different from that of the budget shopper suffering with a minimal VM slice and 250 MB of RAM (really? My phone has more than that).

In general, Perl holds on to any memory it has used, even when it's no longer using it. Realize that optimizing in one direction, e.g. memory, might negatively impact another, such as speed.

This is not a comprehensive list (and there's more in Programming Perl):

☹ Use Perl memory profiling tools to help you find problem areas. See "Profiling heap memory usage on perl programs" and "How to find the amount of physical memory occupied by a hash in Perl?"

☹ Use lexical variables with the smallest scope possible to allow Perl to re-use that memory when you don't need it.
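
As a rough illustration of tight scoping, the sketch below keeps a scratch array inside a bare block and a per-iteration variable inside the loop body. The helper name `longest_line_length` and the sample data are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: $length exists only for one loop iteration.
sub longest_line_length {
    my ($lines_ref) = @_;
    my $max = 0;
    for my $line (@$lines_ref) {
        my $length = length $line;
        $max = $length if $length > $max;
    }
    return $max;
}

my $longest;
{
    # @scratch exists only inside this bare block.
    my @scratch = map { 'x' x $_ } 1 .. 1_000;
    $longest = longest_line_length( \@scratch );
}
# @scratch is out of scope here, so Perl is free to reuse its memory.

print "$longest\n";
```

Only the small result, `$longest`, outlives the block.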

☹ Avoid creating big temporary structures. For instance, reading a file with a foreach reads all the input at once. If you only need it line-by-line, use while.

 foreach ( <FILE> ) { ... } # list context, all at once 
 while( <FILE> ) { ... } # scalar context, line by line
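
A fuller, self-contained version of the same contrast might look like the sketch below. The temporary file and the two function names are assumptions for the example; lexical filehandles are used instead of the bareword FILE.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small sample file so the sketch is self-contained.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "line $_\n" for 1 .. 5;
close $tmp_fh or die "close failed: $!";

# Scalar context: only one line is held in memory at a time.
sub count_lines_streaming {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++;
    }
    close $fh;
    return $count;
}

# List context: every line is read into a temporary list before the loop starts.
sub count_lines_slurping {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $count = 0;
    foreach my $line (<$fh>) {
        $count++;
    }
    close $fh;
    return $count;
}

print count_lines_streaming($path), "\n";
print count_lines_slurping($path),  "\n";
```

Both functions return the same count; the difference is how much of the file sits in memory while the loop runs.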

☹ You might not even need to have the file in memory. Memory-map files instead of slurping them.

☹ If you need to create big data structures, consider something like DBM::Deep or other storage engines to keep most of it out of RAM and on disk until you need it. Outside of Perl, there are various key-value stores, such as Redis, that may help.
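
As a sketch of the disk-backed idea, the example below ties a hash to SDBM_File, a simple DBM that usually ships with Perl. This is a stand-in, not DBM::Deep: SDBM stores only flat key/value pairs of limited size, while DBM::Deep handles nested structures. The file name and keys are invented for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);
use SDBM_File;

# Temporary directory for the DBM files, removed when the program exits.
my $dir = tempdir( CLEANUP => 1 );

# Tie a hash to files on disk; entries are stored in the DBM files
# rather than in an ordinary in-memory Perl hash.
tie my %store, 'SDBM_File', "$dir/cache", O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

$store{"key$_"} = "value $_" for 1 .. 1_000;

# Reads go through the same hash interface.
my $sample = $store{key500};
print "$sample\n";

untie %store;
```

The calling code keeps using normal hash syntax, so swapping in a different storage engine later mostly means changing the `tie` line.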

☹ Don't let people use your program. Whenever I've done that, I've reduced the memory footprint by about 100%. It also cuts down on support requests.

☹ (Update: Perl can now handle this for you in most cases because it uses a Copy On Write (COW) mechanism.) Pass large chunks of text and large aggregates by reference so you don't make a copy and store the same information twice. If you have to copy it because you want to change something, you might be stuck. This applies both to subroutine arguments and to subroutine return values:

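 call_some_sub( \$big_text, \@long_array );   # pass references, not copies
 sub call_some_sub {
      my( $text_ref, $array_ref ) = @_;      # references to the caller's data
      ...
      return \%hash;                          # return a reference, not a flattened list
      }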
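
A runnable variant of that pattern, with a made-up `summarize` subroutine and sample data standing in for the elided body, could look like this:

```perl
use strict;
use warnings;

# Hypothetical helper: receives references, so the caller's data is not copied.
sub summarize {
    my ( $text_ref, $array_ref ) = @_;
    my %summary = (
        text_length => length $$text_ref,
        item_count  => scalar @$array_ref,
    );
    return \%summary;    # return a reference rather than a flattened hash
}

my $big_text   = 'x' x 1_000_000;
my @long_array = ( 1 .. 100_000 );

my $summary = summarize( \$big_text, \@long_array );
print "$summary->{text_length} characters, $summary->{item_count} items\n";
```

Inside the subroutine the data is reached through `$$text_ref` and `@$array_ref`, so nothing large is duplicated on the way in or out.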

☹ Track down memory leaks in modules. I had big problems with an application until I realized that a module wasn't releasing memory. I found a patch in the module's RT queue, applied it, and solved the problem.

☹ If you need to handle a big chunk of data once but don't want the persistent memory footprint, offload the work to a child process. The child process only has the memory footprint while it's working. When you get the answer, the child process shuts down and releases its memory. Similarly, work distribution systems, such as Minion, can spread work out among machines.
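
One way to sketch the child-process approach is with `fork` and a pipe, assuming a Unix-like system. The task `expensive_sum` and the wrapper `run_in_child` are hypothetical names, and the answer is assumed to fit on one line of text.

```perl
use strict;
use warnings;

# Hypothetical memory-hungry task: builds a large list, returns a small answer.
sub expensive_sum {
    my @big = ( 1 .. 200_000 );
    my $sum = 0;
    $sum += $_ for @big;
    return $sum;
}

# Run a code reference in a child process and read its answer from a pipe.
sub run_in_child {
    my ($code) = @_;
    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {    # child: do the work, send the answer, exit
        close $reader;
        print {$writer} $code->(), "\n";
        close $writer;
        exit 0;
    }

    close $writer;        # parent: wait for the single-line answer
    my $answer = <$reader>;
    close $reader;
    waitpid( $pid, 0 );   # reap the child; its memory goes back to the OS
    chomp $answer if defined $answer;
    return $answer;
}

my $result = run_in_child( \&expensive_sum );
print "$result\n";
```

The large array is only ever allocated in the child, so the parent's footprint stays at whatever it was before the call.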

☹ Turn recursive solutions into iterative ones. Perl doesn't have tail recursion optimization, so every new call adds to the call stack. You can optimize tail calls yourself with goto tricks or a module, but that's a lot of work to hang on to a technique that you probably don't need.
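
To make the recursion point concrete, here is a small sketch that measures the length of a linked list both ways. The list layout (hashes with a `next` key) and both function names are assumptions for the example.

```perl
use strict;
use warnings;

# Recursive version: each node adds another frame to the call stack.
sub depth_recursive {
    my ($node) = @_;
    return 0 unless defined $node;
    return 1 + depth_recursive( $node->{next} );
}

# Iterative version: one loop and one counter, however long the list is.
sub depth_iterative {
    my ($node) = @_;
    my $depth = 0;
    while ( defined $node ) {
        $depth++;
        $node = $node->{next};
    }
    return $depth;
}

# Build a hypothetical linked list of 50 nodes for the comparison.
my $list;
$list = { value => $_, next => $list } for 1 .. 50;

print depth_recursive($list), " ", depth_iterative($list), "\n";
```

The list is kept short on purpose: with `use warnings`, Perl starts emitting "Deep recursion" warnings once a subroutine nests past 100 calls, which is a hint that the recursive form is the one growing the stack.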

☹ Use external programs, forks, job queues, or other separate actors so you don't have to carry around short-term memory burdens. If you have a processing task that will use a big chunk of memory, let a different program (perhaps a fork of the current program) handle that and give you back the answer. When that other program is done, all of its memory returns to the operating system. This program doesn't even need to be on the same box.

☹ Did he use 6 Gb or only five? Well, to tell you the truth, in all this excitement I kind of lost track myself. But being as this is Perl, the most powerful language in the world, and would blow your memory clean off, you've got to ask yourself one question: Do I feel lucky? Well, do ya, punk?

There are many more, but it's too early in the morning to figure out what those are. I cover some in Mastering Perl and Effective Perl Programming.
