Does Lua optimize the ".." operator?


Question

I have to execute the following code:

local filename = dir .. "/" .. base

thousands of times in a loop (it's a recursion that prints a directory tree).

Now, I wonder whether Lua concatenates the 3 strings (dir, "/", base) in one go (i.e., by allocating a single string long enough to hold their total length) or whether it does this the inefficient way, internally in two steps:

local filename = (dir .. "/")              -- step1
                               .. base     -- step2

This last way would be inefficient memory-wise because two strings are allocated instead of just one.

I don't care much about CPU cycles: I care mainly about memory consumption.

Finally, let me generalize the question:

Does Lua allocate only one string, or 4, when it executes the following code?

local result = str1 .. str2 .. str3 .. str4 .. str5

BTW, I know that I could do:

local filename = string.format("%s/%s", dir, base)

But I've yet to benchmark it (memory & CPU wise).

(BTW, I know about table.concat(). This has the added overhead of creating a table, so I guess it won't be beneficial in all use cases.)

A bonus question:

In case Lua doesn't optimize the ".." operator, would it be a good idea to define a C function for concatenating strings, e.g. utils.concat(dir, "/", base, ".", extension)?

Solution

Although Lua performs a simple optimization on .. usage, you should still be careful about using it in a tight loop, especially when joining very large strings, because this creates lots of garbage and thus hurts performance.

The best way to concatenate many strings is with table.concat.

table.concat lets you use a table as a temporary buffer for all the strings to be concatenated and perform the concatenation only when you are done adding strings to the buffer, like in the following silly example:

local buf = {}
for i = 1, 10000 do
    buf[#buf+1] = get_a_string_from_somewhere()
end
local final_string = table.concat( buf )


The simple optimization for .. can be seen by analyzing the disassembled bytecode of the following script:

-- file "lua_06.lua"

local a = "hello"
local b = "cruel"
local c = "world"

local z = a .. " " .. b .. " " .. c

print(z)

The output of luac -l -p lua_06.lua is the following (for Lua 5.2.2):

main  (13 instructions at 003E40A0)
0+ params, 8 slots, 1 upvalue, 4 locals, 5 constants, 0 functions
    1   [3] LOADK       0 -1    ; "hello"
    2   [4] LOADK       1 -2    ; "cruel"
    3   [5] LOADK       2 -3    ; "world"
    4   [7] MOVE        3 0
    5   [7] LOADK       4 -4    ; " "
    6   [7] MOVE        5 1
    7   [7] LOADK       6 -4    ; " "
    8   [7] MOVE        7 2
    9   [7] CONCAT      3 3 7
    10  [9] GETTABUP    4 0 -5  ; _ENV "print"
    11  [9] MOVE        5 3
    12  [9] CALL        4 2 1
    13  [9] RETURN      0 1

You can see that only a single CONCAT opcode is generated, although many .. operators are used in the script.
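To get a feel for what the single CONCAT means for memory, here is a rough measurement sketch (not part of the original test). It stops the collector and compares how much the heap grows for one multi-operand expression versus the same result built in four statements. The piece size of 1000 bytes is an arbitrary choice, and the exact figures depend on the Lua version and its internal buffers, so treat the numbers as indicative only.

```lua
-- Rough measurement sketch: heap growth of one multi-operand concatenation
-- versus the same string built step by step. Sizes are arbitrary.
local piece = {}
for i = 1, 5 do
    piece[i] = string.rep(string.char(96 + i), 1000)  -- five distinct 1000-byte strings
end
local s1, s2, s3, s4, s5 = piece[1], piece[2], piece[3], piece[4], piece[5]

collectgarbage("collect")
collectgarbage("stop")    -- keep the collector from freeing anything while measuring

local before = collectgarbage("count")
local one_shot = s1 .. s2 .. s3 .. s4 .. s5   -- compiled to a single CONCAT
local one_shot_kb = collectgarbage("count") - before

before = collectgarbage("count")
local stepwise = s1 .. s2                     -- each statement is its own CONCAT
stepwise = stepwise .. s3                     -- and leaves the previous string behind
stepwise = stepwise .. s4
stepwise = stepwise .. s5
local stepwise_kb = collectgarbage("count") - before

collectgarbage("restart")
print(string.format("one expression: %.1f KB, four statements: %.1f KB",
                    one_shot_kb, stepwise_kb))
```

The stepwise version should report noticeably more growth, because every intermediate string is allocated and then abandoned.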


To fully understand when to use table.concat you must know that Lua strings are immutable. This means that whenever you try to concatenate two strings you are indeed creating a new string (unless the resulting string is already interned by the interpreter, but this is usually unlikely). For example, consider the following fragment:

s = s .. "hello"

and assume that s already contains a huge string (say, 10MB). Executing that statement creates a new string (10MB + 5 characters) and discards the old one. So you have just created a 10MB dead object for the garbage collector to cope with. If you do this repeatedly, you end up hogging the garbage collector. This is the real problem with .., and it is the typical case where you should collect all the pieces of the final string in a table and call table.concat on it. That won't avoid generating garbage (all the pieces become garbage after the call to table.concat), but it greatly reduces the unnecessary garbage.
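The effect can be made visible with a small sketch along these lines. It is only an illustration: the helper `measure`, the 100-byte chunk and the iteration count are my own assumptions, and absolute numbers vary between Lua versions.

```lua
-- Sketch: heap growth of repeated `s = s .. chunk` versus a table buffer.
local chunk = string.rep("x", 100)   -- arbitrary 100-byte piece
local N = 200                        -- arbitrary iteration count

-- Hypothetical helper: run fn with the collector stopped and report
-- how much the heap grew, in KB.
local function measure(fn)
    collectgarbage("collect")
    collectgarbage("stop")
    local before = collectgarbage("count")
    local result = fn()
    local grown = collectgarbage("count") - before
    collectgarbage("restart")
    return result, grown
end

local naive, naive_kb = measure(function()
    local s = ""
    for i = 1, N do
        s = s .. chunk         -- allocates a new, longer string on every pass
    end
    return s
end)

local buffered, buffered_kb = measure(function()
    local buf = {}
    for i = 1, N do
        buf[#buf + 1] = chunk  -- stores a reference only; nothing is copied yet
    end
    return table.concat(buf)   -- the final string is built in one call
end)

print(string.format("naive: %.0f KB, table.concat: %.0f KB", naive_kb, buffered_kb))
```

With the naive loop the total allocation grows roughly quadratically with the number of iterations, while the buffered version stays close to the size of the final string plus the table.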


Conclusions

  • Use .. whenever you concatenate a few, possibly short, strings, or you are not in a tight loop. In this case table.concat could give you worse performance because:
    • you must create a table (which usually you would throw away);
    • you have to call the function table.concat (the function call overhead impacts performance more than using the built-in .. operator a few times).
  • Use table.concat if you need to concatenate many strings, especially if one or more of the following conditions are met:
    • you must do it in subsequent steps (the .. optimization works only inside the same expression);
    • you are in a tight loop;
    • the strings are large (say, several kBs or more).

Note that these are just rules of thumb. Where performance is really paramount you should profile your code.
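As a starting point for such profiling, a micro-benchmark for the path-joining case from the question could look like the sketch below. The inputs and the iteration count are made up, and os.clock only measures CPU time, so this says nothing about memory; take it as a template rather than a verdict.

```lua
-- Micro-benchmark sketch for joining a directory and a base name.
local dir, base = "/usr/local/share", "example.txt"   -- hypothetical inputs
local ITERATIONS = 100000                              -- arbitrary

local variants = {
    { name = "operator ..",   join = function(d, b) return d .. "/" .. b end },
    { name = "string.format", join = function(d, b) return string.format("%s/%s", d, b) end },
    { name = "table.concat",  join = function(d, b) return table.concat({ d, b }, "/") end },
}

for _, v in ipairs(variants) do
    local start = os.clock()
    local last
    for i = 1, ITERATIONS do
        last = v.join(dir, base)
    end
    v.seconds = os.clock() - start   -- CPU time spent in this variant
    v.result = last                  -- kept so the variants can be compared
    print(string.format("%-14s %.3f s", v.name, v.seconds))
end
```

All three variants produce the same string, so whichever is fastest on your interpreter can be swapped in without changing behavior.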

Anyway, Lua is quite fast at string handling compared with other scripting languages, so usually you don't need to worry about this much.
