PHP: Copy On Write and Assign By Reference behave differently on PHP5 and PHP7
Question
We have a simple piece of code:
1 <?php
2 $i = 2;
3 $j = &$i;
4 echo (++$i) + (++$i);
On PHP5, it outputs 8, because $i is a reference: when we increase $i with ++$i, PHP changes the zval in place rather than making a copy, so line 4 evaluates to 4 + 4 = 8. This is Assign By Reference.
If we comment out line 3, it outputs 7: every time we change the value by incrementing it, PHP makes a copy, so line 4 evaluates to 3 + 4 = 7. This is Copy On Write.
But on PHP7, it always outputs 7, whether or not line 3 is present.
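To compare the two cases side by side on a given interpreter, the snippet can be wrapped in functions. This is only a sketch: the function names are made up here, and the values reported for PHP5 and PHP7 are the ones stated in this question, not something the script enforces.

```php
<?php
// Helper names are made up for this sketch; no type declarations are used
// so that the same file also parses on PHP 5.

function sumWithReference()
{
    $i = 2;
    $j = &$i;                  // line 3 of the question
    return (++$i) + (++$i);    // line 4 of the question
}

function sumWithoutReference()
{
    $i = 2;                    // line 3 commented out
    return (++$i) + (++$i);
}

printf(
    "PHP %s: with reference = %d, without reference = %d\n",
    PHP_VERSION,
    sumWithReference(),
    sumWithoutReference()
);
```

Running the same file under several versions (for example on 3v4l.org) shows whether the reference changes the result on that version.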
I've checked the list of changes in PHP7 (http://php.net/manual/en/migration70.incompatible.php), but I did not find any clue there.
Any help will be great, thanks in advance.
Update 1
Here is the result of the code on PHP5 / PHP7: https://3v4l.org/USTHR
Update 2
Opcodes:
[huqiu@101 tmp]$ php -d vld.active=1 -d vld.execute=0 -f incr-ref-add.php
Finding entry points
Branch analysis from position: 0
Jump found. Position 1 = -2
filename: /home/huqiu/tmp/incr-ref-add.php
function name: (null)
number of ops: 7
compiled vars: !0 = $i, !1 = $j
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > ASSIGN !0, 2
3 1 ASSIGN_REF !1, !0
4 2 PRE_INC $2 !0
3 PRE_INC $3 !0
4 ADD ~4 $2, $3
5 ECHO ~4
5 6 > RETURN 1
branch: # 0; line: 2- 5; sop: 0; eop: 6; out1: -2
path #1: 0,
Recommended answer
Disclaimer: I'm not a PHP Internals expert (yet?) so this is all from my understanding, and not guaranteed to be 100% correct or complete. :)
So, firstly, the PHP 7 behaviour (which, I note, is also followed by HHVM) appears to be correct, and PHP 5 has a bug here. There should be no extra assign-by-reference behaviour here, because regardless of execution order, the results of the two calls to ++$i should never be the same.
The opcodes look fine; crucially, we have two temp variables, $2 and $3, to hold the two increment results. But somehow, PHP 5 is acting as though we'd written this:
$i = 2;
$i++; $temp1 =& $i;
$i++; $temp2 =& $i;
echo $temp1 + $temp2;
Rather than this:
$i = 2;
$i++; $temp1 = $i;
$i++; $temp2 = $i;
echo $temp1 + $temp2;
It was pointed out on the PHP Internals mailing list that using multiple operations that modify a variable within a single statement is generally considered "undefined behaviour", and ++ is used as an example of this in C/C++.
As such, it's reasonable for PHP 5 to return the value it does for implementation / optimisation reasons, even if it is logically inconsistent with a sane serialization into multiple statements.
The (relatively new) PHP language specification contains similar language and examples:
Unless stated explicitly in this specification, the order in which the operands in an expression are evaluated relative to each other is unspecified. [...] (For example, [...] in the full expression $j = $i + $i++, whether the value of $i is the old or new $i, is unspecified.)
Arguably, this is a weaker claim than "undefined behaviour", since it implies they are evaluated in some particular order, but we're into nit-picking now.
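To make the two permitted readings of $j = $i + $i++ concrete, each evaluation order can be spelled out as separate statements. This is a sketch with made-up function names; it does not say which order any particular PHP version picks, only what each order would produce.

```php
<?php
// Hypothetical helpers: each one fixes an evaluation order explicitly.

function sumReadingLeftOperandFirst($i)
{
    $left = $i;       // old value of $i
    $right = $i++;    // post-increment also yields the old value
    return $left + $right;
}

function sumIncrementingFirst($i)
{
    $right = $i++;    // yields the old value, then $i becomes old + 1
    $left = $i;       // new value of $i
    return $left + $right;
}

var_dump(sumReadingLeftOperandFirst(2)); // int(4)
var_dump(sumIncrementingFirst(2));       // int(5)
```

Writing the statements out like this is also the simplest way to avoid depending on the unspecified order in real code.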
I was curious, and wanted to learn more about the internals, so I did some playing around using phpdbg.
Running the code with $j = $i in place of $j =& $i, we start with 2 variables sharing an address, with a refcount of 2 (but no is_ref flag):
Address Refs Type Variable
0x7f3272a83be8 2 (integer) $i
0x7f3272a83be8 2 (integer) $j
But as soon as you pre-increment, the zvals are separated, and only one temp var is sharing with $i, giving a refcount of 2:
Address Refs Type Variable
0x7f189f9ecfc8 2 (integer) $i
0x7f189f859be8 1 (integer) $j
With reference assignment
When the variables have been bound together, they share an address, with a refcount of 2, and a by-ref marker:
Address Refs Type Variable
0x7f9e04ee7fd0 2 (integer) &$i
0x7f9e04ee7fd0 2 (integer) &$j
After the pre-increments (but before the addition), the same address has a refcount of 4, showing the 2 temp vars erroneously bound by reference:
Address Refs Type Variable
0x7f9e04ee7fd0 4 (integer) &$i
0x7f9e04ee7fd0 4 (integer) &$j
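The addresses and refcounts above are engine internals, but the visible consequence of each state can be checked from ordinary PHP code. A minimal sketch, with function names invented for illustration:

```php
<?php
// Hypothetical helpers contrasting plain assignment with reference assignment.

function incrementAfterValueAssignment()
{
    $i = 2;
    $j = $i;      // shares storage until a write happens (copy on write)
    ++$i;         // the write separates the two variables
    return array($i, $j);
}

function incrementAfterReferenceAssignment()
{
    $i = 2;
    $j = &$i;     // both names are bound to one value
    ++$i;         // the write is visible through both names
    return array($i, $j);
}

var_dump(incrementAfterValueAssignment());     // $i is 3, $j stays 2
var_dump(incrementAfterReferenceAssignment()); // $i and $j are both 3
```

Both functions behave the same on PHP 5 and PHP 7; only the temporaries produced inside a single expression differ between the versions.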
The source of the issue
Digging into the source on http://lxr.php.net, we can find the implementation of the ZEND_PRE_INC opcode:
- PHP 5.6
- PHP 7.0
The crucial part is this:
SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
So we create a new zval for the result value only if it is not currently a reference. Further down, we have this:
if (RETURN_VALUE_USED(opline)) {
PZVAL_LOCK(*var_ptr);
EX_T(opline->result.var).var.ptr = *var_ptr;
}
So if the return value of the increment is actually used, we need to "lock" the zval (which, following a whole series of macros, basically means "increment its refcount") before assigning it as the result.
If we created a new zval earlier, that's fine - our refcount is now 2, 1 for the actual variable, plus 1 for the operation result. But if we decided not to, because we needed to hold a reference, we're just incrementing the existing reference count, and pointing at a zval which may be about to be changed again.
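The logic just described can be imitated with a toy model in plain PHP. Everything below is hypothetical: ToyZval, toyPreIncPhp5 and toyRunPhp5 are names invented for this sketch, and the model only mirrors the "separate if not a reference, then lock and share" steps as I understand them; it is not engine code.

```php
<?php
// Toy model only; objects stand in for zvals because PHP object handles
// behave like shared pointers.
final class ToyZval
{
    public $value;
    public $refcount;
    public $isRef;

    public function __construct($value, $refcount = 1, $isRef = false)
    {
        $this->value = $value;
        $this->refcount = $refcount;
        $this->isRef = $isRef;
    }
}

// Mimics SEPARATE_ZVAL_IF_NOT_REF followed by PZVAL_LOCK: the result slot
// receives the very same zval as the variable, not a copy of its value.
function toyPreIncPhp5(ToyZval &$slot)
{
    if (!$slot->isRef && $slot->refcount > 1) {
        $slot->refcount--;                  // the variable leaves the shared zval
        $slot = new ToyZval($slot->value);  // and gets a private copy
    }
    $slot->value++;
    $slot->refcount++;                      // "lock" on behalf of the temporary
    return $slot;                           // the temporary aliases the zval
}

function toyRunPhp5($byReference)
{
    // $i and $j start out sharing one zval with a refcount of 2.
    $i = new ToyZval(2, 2, $byReference);
    $temp1 = toyPreIncPhp5($i);
    $temp2 = toyPreIncPhp5($i);
    return $temp1->value + $temp2->value;
}

var_dump(toyRunPhp5(false)); // int(7): each temporary keeps its own zval
var_dump(toyRunPhp5(true));  // int(8): both temporaries alias the same zval
```

In the by-reference run the shared toy zval ends with a refcount of 4, which lines up with the phpdbg output shown earlier.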
So what's different in PHP 7? Several things!
Firstly, the phpdbg output is rather boring, because integers are no longer reference counted in PHP 7; instead, a reference assignment creates an extra pointer, which itself has a refcount of 1, to the same address in memory, which is the actual integer. The phpdbg output looks like this:
Address Refs Type Variable
0x7f175ca660e8 1 integer &$i
int (2)
0x7f175ca660e8 1 integer &$j
int (2)
Secondly, there is a special code path in the source for integers:
if (EXPECTED(Z_TYPE_P(var_ptr) == IS_LONG)) {
fast_long_increment_function(var_ptr);
if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
ZVAL_COPY_VALUE(EX_VAR(opline->result.var), var_ptr);
}
ZEND_VM_NEXT_OPCODE();
}
So if the variable is an integer (IS_LONG) and not a reference to an integer (IS_REFERENCE), then we can just increment it in place. If we then need the return value, we can copy its value into the result (ZVAL_COPY_VALUE).
If it's a reference, we won't hit that code, but rather than keeping references bound together, we have these two lines:
ZVAL_DEREF(var_ptr);
SEPARATE_ZVAL_NOREF(var_ptr);
The first line says "if it's a reference, follow it to its target"; this takes us from our "reference to an integer" to the integer itself. The second - I think - says "if it's something refcounted, and has more than one reference, create a copy of it"; in our case, this will do nothing, because the integer doesn't care about refcounts.
So now we have an integer we can increment, which will affect all by-reference associations, but not by-value ones for refcounted types. Finally, if we want the return value of the increment, we again copy it, rather than just assigning it; this time with a slightly different macro, which will increase the refcount of our new zval if necessary:
ZVAL_COPY(EX_VAR(opline->result.var), var_ptr);
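As a rough illustration of the PHP 7 path, here is a toy model in plain PHP. ToyReference, toyPreIncPhp7 and toyRunPhp7 are hypothetical names, and the model only captures the two ideas above (follow the reference to the integer, then copy the value into the result) under my reading of the source.

```php
<?php
// Toy model only: the wrapper object plays the role of the extra pointer
// that a reference assignment creates around the integer.
final class ToyReference
{
    public $value;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

// Mimics "dereference, increment in place, copy the value into the result".
function toyPreIncPhp7(ToyReference $target)
{
    $target->value++;        // seen through every name bound to the reference
    return $target->value;   // integers are returned by value, i.e. a snapshot
}

function toyRunPhp7()
{
    $i = new ToyReference(2);
    $j = $i;                 // stands in for $j = &$i: both share the wrapper
    $temp1 = toyPreIncPhp7($i);
    $temp2 = toyPreIncPhp7($i);
    return array($temp1 + $temp2, $j->value);
}

var_dump(toyRunPhp7()); // sum is 3 + 4 = 7, and $j still observes 4
```

Because each temporary holds a snapshot of the integer rather than an alias, the second increment cannot change the first result.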