PHP: Copy On Write and Assign By Reference behave differently on PHP 5 and PHP 7

Question

We have a piece of simple code:

1    <?php
2    $i = 2;
3    $j = &$i;
4    echo (++$i) + (++$i);

On PHP5, it outputs 8, because:

$i is a reference, so when we increment it with ++$i, PHP changes the zval in place rather than making a copy, and line 4 evaluates to 4 + 4 = 8. This is Assign By Reference.

If we comment out line 3, it outputs 7: every time we change the value by incrementing it, PHP makes a copy, so line 4 evaluates to 3 + 4 = 7. This is Copy On Write.

But in PHP7, it always outputs 7.

I've checked the changes in PHP7: http://php.net/manual/en/migration70.incompatible.php, but I did not find any clue there.

Any help will be great, thanks in advance.

Update 1

Here is the result of the code on PHP5 / PHP7: https://3v4l.org/USTHR

Update 2

Opcodes:

[huqiu@101 tmp]$ php -d vld.active=1 -d vld.execute=0 -f incr-ref-add.php
Finding entry points
Branch analysis from position: 0
Jump found. Position 1 = -2
filename:       /home/huqiu/tmp/incr-ref-add.php
function name:  (null)
number of ops:  7
compiled vars:  !0 = $i, !1 = $j
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                                   !0, 2
   3     1        ASSIGN_REF                                               !1, !0
   4     2        PRE_INC                                          $2      !0
         3        PRE_INC                                          $3      !0
         4        ADD                                              ~4      $2, $3
         5        ECHO                                                     ~4
   5     6      > RETURN                                                   1

branch: #  0; line:     2-    5; sop:     0; eop:     6; out1:  -2
path #1: 0,

Answer

Disclaimer: I'm not a PHP Internals expert (yet?) so this is all from my understanding, and not guaranteed to be 100% correct or complete. :)

So, firstly, the PHP 7 behaviour - which, I note, is also followed by HHVM - appears to be correct, and PHP 5 has a bug here. There should be no extra assign by reference behaviour here, because regardless of execution order, the result of the two calls to ++$i should never be the same.

The opcodes look fine; crucially, we have two temp variables $2 and $3, to hold the two increment results. But somehow, PHP 5 is acting as though we'd written this:

$i = 2;
$i++; $temp1 =& $i;
$i++; $temp2 =& $i;
echo $temp1 + $temp2; 

Instead of this:

$i = 2;
$i++; $temp1 = $i;
$i++; $temp2 = $i;
echo $temp1 + $temp2; 

It was pointed out on the PHP Internals mailing list that using multiple operations that modify a variable within a single statement is generally considered "undefined behaviour", and ++ is used as an example of this in C/C++.

As such, it's reasonable for PHP 5 to return the value it does for implementation / optimisation reasons, even if it is logically inconsistent with a sane serialization into multiple statements.

The (relatively new) PHP language specification contains similar language and examples:

Unless stated explicitly in this specification, the order in which the operands in an expression are evaluated relative to each other is unspecified. [...] (For example,[...] in the full expression $j = $i + $i++, whether the value of $i is the old or new $i, is unspecified.)

Arguably, this is a weaker claim than "undefined behaviour", since it implies they are evaluated in some particular order, but we're into nit-picking now.
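
Since the result of combining both increments in one expression is not something to rely on, one way to sidestep the question is to give each increment its own statement. The following is only a minimal sketch; the function name sumOfTwoIncrements and the variables $first and $second are made up for illustration:

```php
<?php
// Hypothetical helper: performs the two increments in separate statements,
// so each result is copied by value before the next increment runs.
function sumOfTwoIncrements($start)
{
    $i = $start;
    $j = &$i;          // same reference binding as in the question

    $first = ++$i;     // plain assignment copies the value (3 for $start = 2)
    $second = ++$i;    // copies the next value (4 for $start = 2)

    return $first + $second;
}

echo sumOfTwoIncrements(2), "\n"; // 3 + 4 = 7
```

This has the same shape as the by-value rewrite shown earlier: each temporary holds its own copy, so the reference binding on $i no longer matters.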

I was curious and wanted to learn more about the internals, so I played around with phpdbg.

Running the code with $j = $i in place of $j =& $i, we start with 2 variables sharing an address, with a refcount of 2 (but no is_ref flag):

Address         Refs    Type            Variable
0x7f3272a83be8  2       (integer)       $i
0x7f3272a83be8  2       (integer)       $j

But as soon as you pre-increment, the zvals are separated, and only one temp var is sharing with $i, giving a refcount of 2:

Address         Refs    Type            Variable
0x7f189f9ecfc8  2       (integer)       $i
0x7f189f859be8  1       (integer)       $j

With reference assignment

When the variables have been bound together, they share an address, with a refcount of 2, and a by-ref marker:

Address         Refs    Type            Variable
0x7f9e04ee7fd0  2       (integer)       &$i
0x7f9e04ee7fd0  2       (integer)       &$j

After the pre-increments (but before the addition), the same address has a refcount of 4, showing the 2 temp vars erroneously bound by reference:

Address         Refs    Type            Variable
0x7f9e04ee7fd0  4       (integer)       &$i
0x7f9e04ee7fd0  4       (integer)       &$j

The source of the issue

Digging into the source on http://lxr.php.net, we can find the implementation of the ZEND_PRE_INC opcode:

  • PHP 5.6
  • PHP 7.0

The crucial part is this:

 SEPARATE_ZVAL_IF_NOT_REF(var_ptr);

So we create a new zval for the result value only if it is not currently a reference. Further down, we have this:

if (RETURN_VALUE_USED(opline)) {
    PZVAL_LOCK(*var_ptr);
    EX_T(opline->result.var).var.ptr = *var_ptr;
}

So if the return value of the increment is actually used, we need to "lock" the zval (which, following a whole series of macros, basically means "increment its refcount") before assigning it as the result.

If we created a new zval earlier, that's fine - our refcount is now 2, 1 for the actual variable, plus 1 for the operation result. But if we decided not to, because we needed to hold a reference, we're just incrementing the existing reference count, and pointing at a zval which may be about to be changed again.
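
To make that concrete, here is a toy model of the PHP 5 behaviour written in plain PHP. It is only a sketch of my understanding, not engine code: ToyZval, ToySlot, toyPreIncPhp5 and toySumPhp5 are hypothetical names, and the model always separates a non-reference zval, whereas the real macro only does so when the refcount is above 1.

```php
<?php
// Toy model only: these classes are hypothetical stand-ins, not Zend Engine APIs.
final class ToyZval
{
    public $value;
    public $isRef;

    public function __construct($value, $isRef)
    {
        $this->value = $value;
        $this->isRef = $isRef;
    }
}

// A variable slot that points at a zval, loosely like a compiled variable.
final class ToySlot
{
    public $zval;

    public function __construct(ToyZval $zval)
    {
        $this->zval = $zval;
    }
}

// Mimics the handler as described above: separate unless is_ref,
// increment, then hand back the same zval as the result.
function toyPreIncPhp5(ToySlot $slot)
{
    if (!$slot->zval->isRef) {
        // Simplification: always separate (the real macro checks the refcount first).
        $slot->zval = new ToyZval($slot->zval->value, false);
    }
    $slot->zval->value++;

    return $slot->zval; // the result shares the zval instead of copying its value
}

function toySumPhp5($isRef)
{
    $i = new ToySlot(new ToyZval(2, $isRef));
    $temp1 = toyPreIncPhp5($i);
    $temp2 = toyPreIncPhp5($i);

    return $temp1->value + $temp2->value;
}

echo toySumPhp5(false), "\n"; // 3 + 4 = 7
echo toySumPhp5(true), "\n";  // 4 + 4 = 8
```

In the reference case both temporaries end up pointing at the one zval that holds 4, which is where the 8 comes from.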

So what's different in PHP 7? Several things!

Firstly, the phpdbg output is rather boring, because integers are no longer reference counted in PHP 7; instead, a reference assignment creates an extra pointer, which itself has a refcount of 1, to the same address in memory, which is the actual integer. The phpdbg output looks like this:

Address            Refs    Type      Variable
0x7f175ca660e8     1       integer   &$i
int (2)
0x7f175ca660e8     1       integer   &$j
int (2)

Secondly, there is a special code path in the source for integers:

if (EXPECTED(Z_TYPE_P(var_ptr) == IS_LONG)) {
    fast_long_increment_function(var_ptr);
    if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
        ZVAL_COPY_VALUE(EX_VAR(opline->result.var), var_ptr);
    }
    ZEND_VM_NEXT_OPCODE();
}

So if the variable is an integer (IS_LONG) and not a reference to an integer (IS_REFERENCE) then we can just increment it in place. If we then need the return value, we can copy its value into the result (ZVAL_COPY_VALUE).

If it's a reference, we won't hit that code, but rather than keeping references bound together, we have these two lines:

ZVAL_DEREF(var_ptr);
SEPARATE_ZVAL_NOREF(var_ptr);

The first line says "if it's a reference, follow it to its target"; this takes us from our "reference to an integer" to the integer itself. The second - I think - says "if it's something refcounted, and has more than one reference, create a copy of it"; in our case, this will do nothing, because the integer doesn't care about refcounts.

So now we have an integer we can increment, which will affect all by-reference associations, but not by-value ones for refcounted types. Finally, if we want the return value of the increment, we again copy it, rather than just assigning it; and this time with a slightly different macro which will increase the refcount of our new zval if necessary:

ZVAL_COPY(EX_VAR(opline->result.var), var_ptr);
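
The effect of copying the value into the result can be sketched with another toy model in plain PHP. Again, this reflects my reading of the source rather than the engine itself: ToyInt, toyPreIncPhp7 and toySumPhp7 are hypothetical names, and a shared object handle stands in for the reference binding between $i and $j.

```php
<?php
// Toy model only: hypothetical names, not Zend Engine APIs.
final class ToyInt
{
    public $value;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

// Increment the shared integer in place, then return a snapshot of its value,
// standing in for the copy made by ZVAL_COPY / ZVAL_COPY_VALUE.
function toyPreIncPhp7(ToyInt $target)
{
    $target->value++;       // every alias of $target sees the new value

    return $target->value;  // the result is a plain copy, not an alias
}

function toySumPhp7()
{
    $i = new ToyInt(2);
    $j = $i;                // shared handle, standing in for $j = &$i

    $temp1 = toyPreIncPhp7($i);
    $temp2 = toyPreIncPhp7($i);

    // The sum uses the two snapshots; $j still observes both increments.
    return array($temp1 + $temp2, $j->value);
}

list($sum, $seenByJ) = toySumPhp7();
echo $sum, "\n";     // 3 + 4 = 7
echo $seenByJ, "\n"; // 4
```

The by-reference association is preserved ($j sees 4), while each temporary keeps the value it had at the moment of its own increment.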
