How does PHP memory actually work


Question

I've always heard about, and looked for, PHP 'good writing practices'. For example: it's better (for performance) to check whether an array key exists than to search the array. It also seems to be better for memory:

Suppose we have:

$array = array
(
    'one'   => 1,
    'two'   => 2,
    'three' => 3,
    'four'  => 4,
);

this allocates 1040 bytes of memory, while

$array = array
(
    1 => 'one',
    2 => 'two',
    3 => 'three',
    4 => 'four',
);

requires 1136 bytes.

I understand that keys and values surely have different storage mechanisms, but could you point me to the principle behind how this works?

Example 2 (for @teuneboon):

$array = array
(
    'one'   => '1',
    'two'   => '2',
    'three' => '3',
    'four'  => '4',
);

1168 bytes

$array = array
(
    '1' => 'one',
    '2' => 'two',
    '3' => 'three',
    '4' => 'four',
);

1136 bytes

The following consume the same amount of memory:

  • 4 => 'four',
  • '4' => 'four',

Recommended answer

Note: the answer below applies to PHP versions prior to 7. PHP 7 introduced major changes that also affect the value structures.

Your question is not actually about "how memory works in PHP" (here, I assume, you meant "memory allocation"), but about "how arrays work in PHP", and these two questions are different. To summarize what's written below:


  • PHP arrays aren't "arrays" in the classical sense. They are hash maps.
  • The hash map behind a PHP array has a specific structure and uses a lot of additional storage, such as internal link pointers.
  • Hash-map items also use additional fields to store information. And yes, it's not only string vs. integer keys that matter, but also what the key strings themselves are.
  • In your case the option with string keys "wins" in terms of memory because both options are hashed into ulong (unsigned long) hash-map keys, so the real difference is in the values: the string-keys option has integer (fixed-length) values, while the integer-keys option has string values (whose length depends on the characters). This may not always hold, due to possible collisions.
  • "String-numeric" keys, such as '4', are treated as integer keys and produce the same hash result as the integer key would. Thus '4' => 'foo' and 4 => 'foo' are the same thing.
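The last point, that '4' and 4 end up as the same key, can be sketched in C, the language PHP itself is written in. This is only a minimal illustration: numeric_key is a hypothetical helper, not the Zend implementation, and it assumes that only canonical decimal strings (no leading zeros, no plus sign, no "-0") whose magnitude fits in a long are converted.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the Zend source): returns true and stores the
 * integer in *out when `key` is a canonical decimal integer string such as
 * "4", "0" or "-17". Returns false for "04", "4.0", "+4", "-0", "" and for
 * magnitudes above LONG_MAX, which would stay string keys in this sketch. */
bool numeric_key(const char *key, long *out)
{
    assert(key != NULL && out != NULL);
    const char *p = key;
    bool negative = (*p == '-');
    if (negative)
        p++;
    if (*p < '0' || *p > '9')
        return false;                       /* at least one digit required */
    if (*p == '0' && (negative || p[1] != '\0'))
        return false;                       /* rejects "-0" and "04" */

    unsigned long limit = (unsigned long)LONG_MAX;
    unsigned long value = 0;
    for (; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            return false;                   /* rejects "4.0", "4a" */
        unsigned long digit = (unsigned long)(*p - '0');
        if (value > (limit - digit) / 10)
            return false;                   /* would not fit in a long */
        value = value * 10 + digit;
    }
    *out = negative ? -(long)value : (long)value;
    return true;
}
```

With such a check in front of the hash function, '4' => 'foo' and 4 => 'foo' both reach the table with h = 4, while a key like '04' is still hashed as a string.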

Also, an important note: the graphics here are copyright of the PHP internals book.

PHP arrays and C arrays

You should realize one very important thing: PHP is written in C, where such a thing as an "associative array" simply does not exist. In C, an "array" is exactly what the word says: a contiguous area of memory that is accessed by consecutive offsets. Your "keys" can only be numeric, integer and consecutive, starting from zero. You can't have, for instance, 3, -6, 'foo' as your "keys" there.

So, to implement arrays as they exist in PHP, a hash map is used: a hash function hashes your keys and transforms them into integers, which can be used as offsets into a C array. That function, however, can never create a bijection between string keys and their integer hash results. It's easy to understand why: the cardinality of the set of strings is much, much larger than the cardinality of the set of integers. Let's illustrate with an example: count all strings of length 10 that contain only alphanumeric symbols (0-9, a-z and A-Z, 62 in total). There are 62^10 such strings, which is around 8.39E+17 (and even more if shorter strings are included). Compare that with the roughly 4E+9 values of a 32-bit unsigned integer and you'll get the idea: there will be collisions.
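The counting argument above can be checked with a few lines of C. This is just the arithmetic from the paragraph; ipow_u64 and keys_per_hash are names made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Integer power with an overflow guard; 62^10 still fits in 64 bits. */
uint64_t ipow_u64(uint64_t base, unsigned int exp)
{
    uint64_t result = 1;
    while (exp-- > 0) {
        assert(base == 0 || result <= UINT64_MAX / base);
        result *= base;
    }
    return result;
}

/* Average number of distinct keys that must share one hash value when
 * `keys` possible keys are mapped onto 2^hash_bits hash values
 * (pigeonhole principle, rounded down). */
uint64_t keys_per_hash(uint64_t keys, unsigned int hash_bits)
{
    assert(hash_bits < 64);
    return keys >> hash_bits;
}
```

ipow_u64(62, 10) is 839299365868340224, i.e. about 8.39E+17, and keys_per_hash of that value with 32 bits is roughly 195 million, so on average about 195 million of these 10-character strings share each 32-bit hash value.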

PHP hash-map keys & collisions

Now, to resolve collisions, PHP simply places items that have the same hash-function result into one linked list. So a hash map is not just a "list of hashed elements"; instead it stores pointers to lists of elements (every element in a given list has the same hash-function key). And this is the point where it affects memory allocation: if your array has string keys that did not result in collisions, then no additional pointers inside those lists are needed, so the memory amount is reduced (actually it's a very small overhead, but since we're talking about precise memory allocation, it should be taken into account). In the same way, if your string keys result in many collisions, then more additional pointers are created, so the total memory amount is a bit larger.

To illustrate the relations within those lists, here's a graphic:

Above is how PHP resolves collisions after applying the hash function. So one part of your answer lies here: the pointers inside collision-resolution lists. The elements of these linked lists are usually called buckets, and the array that contains pointers to the heads of those lists is internally called arBuckets. Due to structure optimization (to make things such as element deletion faster), a real list element has two pointers, to the previous element and to the next element. That only makes the difference in memory amount between non-collision and collision arrays a little wider; it doesn't change the concept itself.
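The collision chains described above can be sketched as a tiny table with separate chaining. This is a simplified model, not the Zend source: Node, djbx33a, slot_for, chain_insert and chain_length are names made up here, the table is fixed at 8 slots, and the string hash is a DJBX33A-style function (start at 5381, multiply by 33, add each byte), which is assumed to be close to what PHP 5 does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 8u                 /* a power of two, like nTableSize */
#define TABLE_MASK (TABLE_SIZE - 1u)  /* like nTableMask */

typedef struct node {
    unsigned long h;      /* hash value (the key itself for integer keys) */
    struct node *pNext;   /* next bucket in the same collision chain */
    struct node *pLast;   /* previous bucket in the same collision chain */
} Node;

/* DJBX33A-style string hash: start at 5381, multiply by 33, add each byte. */
unsigned long djbx33a(const char *key)
{
    assert(key != NULL);
    unsigned long h = 5381UL;
    for (; *key != '\0'; key++)
        h = h * 33UL + (unsigned char)*key;
    return h;
}

/* Slot index inside the array of chain heads. */
unsigned int slot_for(unsigned long h)
{
    return (unsigned int)(h & TABLE_MASK);
}

/* Prepends a new bucket to its chain; returns NULL if allocation fails. */
Node *chain_insert(Node *slots[], unsigned long h)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    unsigned int i = slot_for(h);
    n->h = h;
    n->pLast = NULL;
    n->pNext = slots[i];
    if (slots[i] != NULL)
        slots[i]->pLast = n;
    slots[i] = n;
    return n;
}

/* Number of buckets sharing one slot; more than 1 means a collision. */
size_t chain_length(Node *const slots[], unsigned int i)
{
    size_t len = 0;
    for (const Node *n = slots[i]; n != NULL; n = n->pNext)
        len++;
    return len;
}
```

In this model the integer keys 1 and 9 land in the same slot (1 & 7 == 9 & 7), so they form a chain of length 2 linked through pNext and pLast, while key 2 sits alone in its slot.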

One more list: order

To fully support arrays as they are in PHP, it's also necessary to maintain order, which is achieved with another internal list. Each element of an array is a member of that list too. It makes no difference in terms of memory allocation, since this list has to be maintained in both options, but I'm mentioning it for the full picture. Here's the graphic:

In addition to pListLast and pListNext, pointers to the head and tail of the order list are stored. Again, this is not directly related to your question, but below I'll dump the internal bucket structure, where these pointers are present.
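As a rough sketch of how such an order list behaves, the C fragment below appends elements to a doubly linked list through pListNext / pListLast and keeps head and tail pointers. Item, OrderList, order_append and order_keys are illustrative names, and the code models only the ordering, not the hashing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct item {
    long key;
    struct item *pListNext;  /* next element in insertion order */
    struct item *pListLast;  /* previous element in insertion order */
} Item;

typedef struct {
    Item *pListHead;  /* first inserted element */
    Item *pListTail;  /* most recently inserted element */
} OrderList;

/* Appends `it` at the tail so that traversal follows insertion order. */
void order_append(OrderList *list, Item *it)
{
    assert(list != NULL && it != NULL);
    it->pListNext = NULL;
    it->pListLast = list->pListTail;
    if (list->pListTail != NULL)
        list->pListTail->pListNext = it;
    else
        list->pListHead = it;
    list->pListTail = it;
}

/* Copies up to `max` keys in insertion order into `out`; returns the count. */
size_t order_keys(const OrderList *list, long out[], size_t max)
{
    assert(list != NULL && out != NULL);
    size_t n = 0;
    for (const Item *it = list->pListHead; it != NULL && n < max; it = it->pListNext)
        out[n++] = it->key;
    return n;
}
```

Appending the keys 3, -6 and 1 in that order and then walking from pListHead yields 3, -6, 1 again, regardless of which hash slots the elements would occupy.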

Array element from inside

Now we're ready to look at what an array element, that is, a bucket, actually is:

typedef struct bucket {
    ulong h;
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    char *arKey;
} Bucket;

Here we are:


  • h is the integer (ulong) value of the key; it's the result of the hash function. For integer keys it is simply the key itself (the hash function returns the key unchanged).
  • pNext / pLast are pointers inside the collision-resolution linked list.
  • pListNext / pListLast are pointers inside the order-resolution linked list.
  • pData is a pointer to the stored value. Actually, the value isn't the same one that was inserted at array creation, it's a copy; but, to avoid unnecessary overhead, PHP uses pDataPtr (so pData = &pDataPtr).
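To get a feeling for the fixed per-element cost of these fields, the bucket layout can be measured with sizeof. The sketch below restates the structure with plain C types (unsigned long for ulong, unsigned int for uint); bucket_header_bytes is a made-up helper. It counts only the bucket headers, not the key strings, the stored values or allocator overhead.

```c
#include <assert.h>
#include <stddef.h>

/* Same field order as the bucket shown above, spelled with plain C types. */
typedef struct bucket {
    unsigned long h;
    unsigned int nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    char *arKey;
} Bucket;

/* Bytes taken by the bucket headers alone for n elements. */
size_t bucket_header_bytes(size_t n)
{
    assert(n <= (size_t)-1 / sizeof(Bucket));  /* guard against overflow */
    return n * sizeof(Bucket);
}
```

Assuming a typical 64-bit platform with 8-byte longs and pointers, sizeof(Bucket) comes out at 72 bytes (8 for h, 4 for nKeyLength plus 4 of padding, and 7 pointers of 8 bytes), so four elements already need 288 bytes for headers alone. On a 32-bit build the same structure would be 36 bytes.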

From this viewpoint you can see where the difference is: since a string key is hashed (thus h is always a ulong and therefore always the same size), it comes down to what is stored in the values. For your string-keys array there are integer values, while for the integer-keys array there are string values, and that makes the difference. However, no, it isn't magic: you can't "save memory" by storing string keys this way all the time, because if your keys are large and there are many of them, this causes collision overhead (with very high probability, though of course it's not guaranteed). It "works" only for arbitrary short strings, which won't cause many collisions.

Hash table itself

We've already talked about elements (buckets) and their structure, but there's also the hash table itself, which is, in fact, the array data structure. It's called _hashtable:

typedef struct _hashtable {
    uint nTableSize;
    uint nTableMask;
    uint nNumOfElements;
    ulong nNextFreeElement;
    Bucket *pInternalPointer;   /* Used for element traversal */
    Bucket *pListHead;
    Bucket *pListTail;
    Bucket **arBuckets;
    dtor_func_t pDestructor;
    zend_bool persistent;
    unsigned char nApplyCount;
    zend_bool bApplyProtection;
#if ZEND_DEBUG
    int inconsistent;
#endif
} HashTable;

I won't describe all the fields, since I've already provided a lot of information that is only indirectly related to the question, but I'll describe this structure briefly:


  • arBuckets is what was described above: the storage for buckets.
  • pListHead / pListTail are pointers to the order-resolution list.
  • nTableSize determines the size of the hash table. This is directly related to memory allocation: nTableSize is always a power of 2. Thus it doesn't matter whether you have 13 or 14 elements in the array: the actual size will be 16. Take that into account when you want to estimate an array's size.
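The power-of-2 rule for nTableSize can be sketched as a simple rounding function. table_size_for and table_mask_for are names invented for this sketch, and the minimum size of 8 is an assumption about PHP 5 rather than something stated in this answer.

```c
#include <assert.h>

#define MIN_TABLE_SIZE 8u  /* assumed minimum table size */

/* Smallest power of two that is >= n and >= MIN_TABLE_SIZE. */
unsigned int table_size_for(unsigned int n)
{
    assert(n <= 0x80000000u);  /* keeps the shift below from overflowing */
    unsigned int size = MIN_TABLE_SIZE;
    while (size < n)
        size <<= 1;
    return size;
}

/* Mask that turns a hash value into a slot index (size - 1). */
unsigned int table_mask_for(unsigned int n)
{
    return table_size_for(n) - 1u;
}
```

Both 13 and 14 elements give a table of 16 slots here, and 17 elements jump to 32. Because the size is a power of two, h & mask selects a slot without a division.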

It's really difficult to predict whether one array will be larger than the other in your case. Yes, there are guidelines that follow from the internal structure, but if the string keys are comparable in length to integer values (like 'four' and 'one' in your sample), the real difference will be in things such as how many collisions occurred and how many bytes were allocated to store the values.

But choosing a proper structure should be a matter of sense, not memory. If your intention is to build correspondingly indexed data, then the choice will always be obvious. The post above has only one goal: to show how arrays actually work in PHP and where you can find the difference in memory allocation in your sample.

You may also check the article about arrays & hash tables in PHP: "Hash-tables in PHP" from the PHP internals book; I've used some graphics from there. Also, to understand how values are allocated in PHP, check the "zval Structure" article; it may help you understand the differences between string and integer allocation for the values of your arrays. I didn't include its explanations here, since the more important point for me is to show the array data structure and what the difference may be between string keys and integer keys in the context of your question.
