Perl: When is unneeded memory of a scalar freed without going out of scope?

Question

I have an app which reads a giant chunk of textual data into a scalar, sometimes even GBs in size. I use substr on that scalar to read most of the data into another scalar and replace the extracted data with an empty string, because it is not needed in the first scalar anymore. What I've found recently is that Perl does not free the memory of the first scalar, even though it recognizes that its logical length has changed. So what I need to do is extract the remaining data from the first scalar into a third one, undef the first scalar and put the extracted data back in place. Only this way is the memory occupied by the first scalar really freed. Assigning undef to that scalar, or some other value smaller than the allocated block of memory, doesn't change anything about the allocated memory.

This is what I'm doing now:

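     # Move the wanted range out of the big scalar; its logical length shrinks, its allocation does not.
     $$extFileBufferRef = substr($$contentRef, $offset, $length, '');
     $length            = length($$contentRef);
     # Copy what is left, release the old buffer with undef(), then store the copy back.
  my $content           = substr($$contentRef, 0, $length);
     $$contentRef       = undef( $$contentRef) || $content;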

$$contentRef might be e.g. 5 GB in size in the first line; I extract 4.9 GB of data and replace the extracted data with an empty string. The second line would now report e.g. 100 MB as the length of the string, but Devel::Size::total_size would still report that 5 GB are allocated for that scalar. Assigning undef or the like to $$contentRef doesn't seem to change anything about that; I need to call undef as a function on that scalar.

I would have expected that the memory behind $$contentRef is already at least partially freed after substr was applied. Doesn't seem to be the case...

So, is memory only freed if variables go out of scope? And if so, why is assigning undef different to calling undef as a function on the same scalar?

Recommended answer

Your analysis is correct.

$ perl -MDevel::Peek -e'
   my $x; $x .= "x" for 1..100;
   Dump($x);
   substr($x, 50, length($x), "");
   Dump($x);
'
SV = PV(0x24208e0) at 0x243d550
  ...
  CUR = 100       # length($x) == 100
  LEN = 120       # 120 bytes are allocated for the string buffer.

SV = PV(0x24208e0) at 0x243d550
  ...
  CUR = 50        # length($x) == 50
  LEN = 120       # 120 bytes are allocated for the string buffer.
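
The same observation can be made without reading Dump output by hand. The following is a minimal sketch, assuming the core B module is available; the helper name cur_and_len is hypothetical and only exists for this illustration. It reads the CUR and LEN fields of the scalar before and after the four-argument substr.

```perl
use strict;
use warnings;
use B ();   # core module that exposes a scalar's internal fields

# Hypothetical helper: returns (logical length, allocated buffer size)
# for the scalar that $ref points to.
sub cur_and_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

# Build the string by appending, as in the Dump example above.
my $x = '';
$x .= 'x' for 1 .. 100;

my ($cur_before, $len_before) = cur_and_len(\$x);

# Remove everything after the first 50 characters.
substr($x, 50, length($x), '');

my ($cur_after, $len_after) = cur_and_len(\$x);

print "before: CUR=$cur_before LEN=$len_before\n";
print "after:  CUR=$cur_after LEN=$len_after\n";
```

The exact LEN value depends on the Perl build, so the point is only the comparison: CUR drops to 50 while LEN stays large enough for the original 100 characters.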

Not only does Perl overallocate strings, it doesn't even free variables that go out of scope, instead reusing them the next time the scope is entered.

$ perl -MDevel::Peek -e'
   sub f {
      my ($set) = @_;
      my $x;
      if ($set) { $x = "abc"; $x .= "def"; }
      Dump($x);
   }

   f(1);
   f(0);
'
SV = PV(0x3be74b0) at 0x3c04228   # PV: Scalar may contain a string
  REFCNT = 1
  FLAGS = (POK,pPOK)              # POK: Scalar contains a string
  PV = 0x3c0c6a0 "abcdef"\0       # The string buffer
  CUR = 6
  LEN = 10                        # Allocated size of the string buffer

SV = PV(0x3be74b0) at 0x3c04228   # Could be a different scalar at the same address,
  REFCNT = 1                      #   but it's truly the same scalar
  FLAGS = ()                      # No "OK" flags: undef
  PV = 0x3c0c6a0 "abcdef"\0       # The same string buffer
  CUR = 6
  LEN = 10                        # Allocated size of the string buffer

The logic is that if you needed the memory once, there's a strong chance you'll need it again.

For the same reason, assigning undef to a scalar doesn't free its string buffer. But Perl gives you a chance to free the buffers if you want, so passing a scalar to undef does force the freeing of the scalar's internal buffers.

$ perl -MDevel::Peek -e'
   my $x = "abc"; $x .= "def";  Dump($x);
   $x = undef;                  Dump($x);
   undef $x;                    Dump($x);
'
SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
  REFCNT = 1
  FLAGS = (POK,pPOK)              # POK: Scalar contains a string
  PV = 0x37e8290 "abcdef"\0       # The string buffer
  CUR = 6
  LEN = 10                        # Allocated size of the string buffer

SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
  REFCNT = 1
  FLAGS = ()                      # No "OK" flags: undef
  PV = 0x37e8290 "abcdef"\0       # The string buffer is still allocated
  CUR = 6
  LEN = 10                        # Allocated size of the string buffer

SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
  REFCNT = 1
  FLAGS = ()                      # No "OK" flags: undef
  PV = 0                          # The string buffer has been freed.
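
Applied to the situation in the question, the workaround can be sketched as follows. This is only an illustration under assumptions: the variable names and sizes are made up, and the hypothetical helper buf_len uses the core B module to read the allocated buffer size (LEN) instead of Devel::Peek.

```perl
use strict;
use warnings;
use B ();   # core module that exposes a scalar's internal fields

# Hypothetical helper: allocated size of the string buffer behind $ref.
sub buf_len { B::svref_2object($_[0])->LEN }

# Stand-in for the large input: 100_000 characters, built by appending.
my $buf = '';
$buf .= 'x' x 1000 for 1 .. 100;

# Drop everything after the first 100 characters, as the question does
# with the four-argument substr.
substr($buf, 100, length($buf), '');
my $len_trimmed = buf_len(\$buf);       # still sized for the original data

# Keep a fresh copy of the small remainder.
my $keep = substr($buf, 0);

$buf = undef;                           # assignment: buffer is kept
my $len_after_assign = buf_len(\$buf);

undef $buf;                             # undef as a function: buffer is freed
my $len_after_undef = buf_len(\$buf);

$buf = $keep;                           # store the remainder in a small buffer
my $len_restored = buf_len(\$buf);

print "trimmed:      LEN=$len_trimmed\n";
print "after assign: LEN=$len_after_assign\n";
print "after undef:  LEN=$len_after_undef\n";
print "restored:     LEN=$len_restored\n";
```

Freeing the buffer returns the memory to Perl's allocator; whether the process footprint seen by the operating system shrinks as well depends on the platform's malloc.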
