Lua表的一个有趣现象 [英] An Interesting phenomenon of Lua's table

查看:26
本文介绍了Lua表的一个有趣现象的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是 Lua 的新手,最近我正在学习 table 的用法.从教程中我知道Lua对数字索引项和非数字索引项的处理方式不同,所以我自己做了一些测试,今天发现了一个有趣的现象,我无法解释:

I'm new to Lua, and I'm learning the usage of table these days. From tutorials I know that Lua treats numeric indexed items and non-numeric indexed items differently, so I did some tests myself, and today I found an interesting phenomenon and I can't explain it:

代码

t = {1, 2, 3, a='a', b='b'}
print(#t)

得到

3

因为 # 运算符只计算数字索引项.然后我测试了下面的代码

because the # operator counts numeric indexed items only. Then I tested the following code

t = {1, 2, 3, a='a', b='b'}
print(#t)

for i = 100,200 do
    t[i] = i
end
print(#t)

我明白

3
3

到目前为止,我认为 Lua 将那些后来添加的不连续项视为非数字索引项.但是,在我稍微更改代码之后

Till now I think Lua treats those discontinuous items added later as non-numeric indexed ones. However, after I change the code a little bit

t = {1, 2, 3, a='a', b='b'}
print(#t)

for i = 100,300 do
    t[i] = i
end
print(#t)

我明白

3
300

我对这种现象感到困惑,有人知道原因吗?谢谢.

I'm confused by this phenomenon, does anyone know the reason? Thanks.

(此现象可在http://www.lua.org/cgi-bin/演示)

我试过这个代码

t = {1, 2, 3, a='a', b='b'}
print(#t)

for i = 100,300 do
    t[i] = i
    print("add", i, #t)
end

for i = 100,300 do
    t[i] = nil
    print("del", i, #t)
end

我明白

3
add 100 3
add 101 3
add 102 3
...
add 223 3
add 224 3
add 225 3
add 226 226
add 227 227
add 228 228
...
add 298 298
add 299 299
add 300 300
del 100 300
del 101 300
del 102 300
...
del 253 300
del 254 300
del 255 300
del 256 3
del 257 3
del 258 3
...
del 298 3
del 299 3
del 300 3

这个例子表明 Lua 确实在稀疏和密集之间转换了一个表.

This example shows that Lua does convert a table between sparse and dense.

推荐答案

我还没有研究 # 运算符是如何实现的,但我敢打赌,通过添加额外的 100索引,您已经使范围 1-300 变得足够密集,以至于索引 100-300 最终出现在表实现的数组"部分而不是哈希"部分.

I haven't looked at how the # operator is implemented, but I bet what's going on is that by adding the extra 100 indexes, you've caused the range 1-300 to become dense enough that the indexes 100-300 end up in the "array" part of the table implementation instead of the "hash" part.

更新:

好的,我查看了原始表长度的来源.如果数组部分的最后一个条目是 nil,它会二分搜索数组以找到最低的边界"(一个非 nil 索引后跟一个 nil 索引).如果它不是 nil,它决定边界必须在哈希中并搜索它.

Ok, I looked at the source for the primitive table length. If the final entry in the array part is nil, it binary-searches the array to find the lowest "boundary" (a non-nil index followed by a nil index). If it's not nil, it decides the boundary must be in the hash and searches for it.

所以对于包含数字索引 {1, 2, 3, 100..200} 的表,我认为它不够密集,数组部分只包含 {1, 2,3}.但是对于包含 {1, 2, 3, 100..300} 的表,它可能足够密集,以至于数组部分在 100..300 部分(我认为数组部分总是 2 的幂,所以它不可能以 300 结束,但我不是 100% 肯定).

So with the table containing numeric indexes {1, 2, 3, 100..200}, I assume it's not dense enough and the array part only contains {1, 2, 3}. But with the table containing {1, 2, 3, 100..300}, it's presumably dense enough that the array part ends somewhere within the 100..300 part (I think the array part is always a power of 2, so it can't possibly end at 300, but I'm not 100% positive).

更新 2:

当一个 lua 表被重新散列时,它会计算整数键的数量.然后它遍历所有不超过整数键数量两倍的 2 的幂,并找到至少 50% 密集的 2 的最大幂(这意味着如果数组部分这么大,至少 50%的所有值都不是 nil).

When a lua table is rehashed, it counts the number of integer keys. It then walks up all the powers of two that are no more than twice the number of integral keys, and finds the largest power of two that is at least 50% dense (meaning that if the array part were this large, at least 50% of all values would be non-nil).

所以使用 {1, 2, 3, 100..200},它会向上走

So with {1, 2, 3, 100..200}, it walks up

1: 100% dense; good
2: 100% dense; good
4: 75% dense; bad
8: 37.5% dense; bad
16: 18.75% dense; bad
32: 9.375% dense; bad
64: 4.6875% dense; bad
128: 25% dense; bad
256: 40.625% dense; bad

最好的值是 2,所以它最终的数组大小为 2.由于 2 非零,它搜索边界的哈希并找到 3.

The best good value is 2, so it ends up with an array size of 2. Since 2 is non-nil, it searches the hash for the boundary and finds 3.

一旦你添加了201..300,最后一步就变成了

Once you add 201..300 the last step becomes

256: 62.5% dense; good

导致数组部分覆盖 1..256,并且由于 256 非零,它再次搜索哈希中的边界并得到 300`.

which causes the array part to cover 1..256, and since 256 is non-nil, it again searches for the boundary in the hash and gets 300`.

最后,Lua 5.2 将序列"定义为一个表,其中包含从 1 开始并没有孔的唯一整数键.并且它将 # 定义为仅对序列有效.这样,Lua 就可以摆脱您注意到的那些在其整数序列中有漏洞的表的奇怪行为.

In the end, Lua 5.2 defines a "sequence" as a table with exclusively integral keys starting at 1 and going up with no holes. And it defines # as only being valid for sequences. This way Lua can get away with the weird behavior you noticed for tables that have holes in their integral sequences.

这篇关于Lua表的一个有趣现象的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆