What can I do to increase the performance of a Lua program?


Question

I asked a question about Lua performance, and one of the responses asked:

Have you studied general tips for keeping Lua performance high? For example, be aware of the cost of table creation and reuse a table rather than create a new one, use 'local print = print' and the like to avoid global accesses.

This is a slightly different question from Lua Patterns, Tips and Tricks because I'd like answers that specifically impact performance and (if possible) an explanation of why performance is impacted.

One tip per answer would be ideal.

Recommended answer

In response to some of the other answers and comments:

It is true that as a programmer you should generally avoid premature optimization. But this is less true for scripting languages, where the compiler does not optimize much, or at all.

So whenever you write something in Lua that is executed very often, runs in a time-critical environment, or could run for a while, it is good to know the things to avoid (and to avoid them).

This is a collection of what I found out over time. Some of it I found on the net, but being of a suspicious nature where the interwebs are concerned, I tested all of it myself. I have also read the Lua performance paper at Lua.org.

Avoiding global variables is one of the most common hints, but stating it once more can't hurt.

Globals are stored in a hash table by name, so accessing one means indexing a table. While Lua has a pretty good hash table implementation, that is still a lot slower than accessing a local variable. If you have to use globals, assign their value to a local variable; this pays off from the second access onward.

do
  x = gFoo + gFoo;
end
do -- this actually performs better.
  local lFoo = gFoo;
  x = lFoo + lFoo;
end

(Note that simple tests may yield different results, e.g. local x; for i=1, 1000 do x=i; end. Here the for loop header actually takes more time than the loop body, so profiling results could be distorted.)
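To make the comparison a bit more concrete, here is a minimal sketch that caches math.sin in a local and times both variants. The helper time_it and the two sum functions are made up for illustration, and the numbers will vary by machine and Lua version, so treat it as a starting point for your own measurements rather than a benchmark.

```lua
-- Cache the function once, outside the hot loop.
local sin = math.sin

local function sum_global(n)
  local acc = 0
  for i = 1, n do
    -- Looks up the global `math`, then its field `sin`, on every iteration.
    acc = acc + math.sin(i)
  end
  return acc
end

local function sum_local(n)
  local acc = 0
  for i = 1, n do
    -- `sin` is reached through the enclosing local, no table lookup needed.
    acc = acc + sin(i)
  end
  return acc
end

-- Hypothetical timing helper: runs f(n) and returns the elapsed CPU seconds.
local function time_it(f, n)
  local start = os.clock()
  f(n)
  return os.clock() - start
end

print("global:", time_it(sum_global, 1000000))
print("local: ", time_it(sum_local, 1000000))
```

The loop body does real work here, so the loop header is less likely to dominate the result than in the trivial x=i test.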

Lua hashes all strings on creation. This makes comparing them and using them in tables very fast, and it reduces memory use since each string is stored internally only once. But it makes string creation more expensive.

A popular way to avoid excessive string creation is to use tables. For example, if you have to assemble a long string, create a table, put the individual strings in it, and then use table.concat to join them once.

-- do NOT do something like this
local ret = "";
for i=1, C do
  ret = ret..foo();
end

If foo() returned only the character A, this loop would create a series of strings like "", "A", "AA", "AAA", etc. Each string would be hashed and would stay in memory until the garbage collector reclaims it. See the problem here?

-- this is a lot faster
local ret = {};
for i=1, C do
  ret[#ret+1] = foo();
end
ret = table.concat(ret);

This method does not create any strings during the loop itself: each string is created in the function foo, and only references are copied into the table. Afterwards, concat creates one more string "AAAAAA..." (depending on how large C is). Note that you could use i instead of #ret+1, but often you don't have such a convenient loop and there is no iterator variable you can use.
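When the loop has no numeric index, one option is to keep your own counter instead of evaluating #ret each time. A small sketch, where join_words is a made-up helper that collects words from a gmatch iterator:

```lua
-- Hypothetical helper: collects the whitespace-separated words of `text`
-- and joins them with `sep` in a single table.concat call.
local function join_words(text, sep)
  local parts, n = {}, 0
  for word in text:gmatch("%S+") do
    n = n + 1          -- manual counter, no length operator needed
    parts[n] = word
  end
  return table.concat(parts, sep)
end

print(join_words("  a  b   c ", ","))  --> a,b,c
```

Whether the manual counter is measurably faster than #parts+1 depends on the Lua version, so it is worth timing before relying on it.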

Another trick I found somewhere on lua-users.org is to use gsub if you have to parse a string:

some_string:gsub(".", function(m)
  return "A";
end);

This looks odd at first. The benefit is that gsub creates the string "at once" in C, and it is only hashed when it is passed back to Lua on return from gsub. This avoids table creation, but possibly adds function call overhead (not if you call foo() anyway, but it does if foo() is actually an expression).
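As a slightly more realistic sketch of the same idea, here is an HTML escaping function built on gsub. The escapes table and escape_html are names invented for this example.

```lua
-- Hypothetical lookup table mapping characters to their escaped form.
local escapes = { ["<"] = "&lt;", [">"] = "&gt;", ["&"] = "&amp;" }

local function escape_html(s)
  -- gsub returns the new string and the number of matches;
  -- the extra parentheses keep only the string.
  return (s:gsub("[<>&]", function(c)
    return escapes[c]
  end))
end

print(escape_html("a < b && c > d"))  --> a &lt; b &amp;&amp; c &gt; d
```

string.gsub also accepts a table as the replacement argument, so s:gsub("[<>&]", escapes) should give the same result here without the per-match Lua function call.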

Use language constructs instead of functions where possible

When iterating over an array-like table, the function overhead of ipairs does not justify its use. To iterate over a table, instead use

for k=1, #tbl do local v = tbl[k];

This does exactly the same without the function call overhead (ipairs actually returns another function, which is then called for every element in the table, while #tbl is only evaluated once). It's a lot faster, even if you need the value. And if you don't...

Note for Lua 5.2: In 5.2 you can actually define an __ipairs field in the metatable, which does make ipairs useful in some cases. However, Lua 5.2 also makes the __len field work for tables, so you might still prefer the above code to ipairs: the __len metamethod is then only called once, while with ipairs you would get an additional function call per iteration.
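For completeness, a small sketch that runs both loop styles over the same array and checks that they agree. The table contents and variable names are arbitrary.

```lua
local tbl = { 10, 20, 30 }

-- Numeric for: #tbl is evaluated once, elements are read by index.
local sum_numeric = 0
for k = 1, #tbl do
  local v = tbl[k]
  sum_numeric = sum_numeric + v
end

-- ipairs: the iterator function is called once per element.
local sum_ipairs = 0
for _, v in ipairs(tbl) do
  sum_ipairs = sum_ipairs + v
end

print(sum_numeric, sum_ipairs)  -- both sums are 60
```

The two forms only agree for proper sequences. If the table has holes, ipairs stops at the first nil while #tbl may return any border, so the numeric loop can visit nil entries.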

Simple uses of table.insert and table.remove can be replaced by the # operator. Basically this applies to simple push and pop operations. Here are some examples:

table.insert(foo, bar);
-- does the same as
foo[#foo+1] = bar;

local x = table.remove(foo);
-- does the same as
local x = foo[#foo];
foo[#foo] = nil;

For shifts (e.g. table.remove(foo, 1)), and whenever ending up with a sparse table is not desirable, it is of course still better to use the table functions.
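A short sketch of the shift case, showing why the table function is the safer choice there. The queue contents are arbitrary.

```lua
local queue = { "a", "b", "c" }

-- Shift: table.remove moves the remaining elements down,
-- so the table stays a proper sequence.
local first = table.remove(queue, 1)

print(first, #queue, queue[1], queue[2])  -- a, 2, b, c

-- Writing queue[1] = nil instead would leave a hole at index 1,
-- and the result of #queue would no longer be reliable.
```

The price is that every shift moves all remaining elements, so for long queues a head index that advances may be worth considering.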

You might, or might not, have decisions in your code like the following:

if a == "C" or a == "D" or a == "E" or a == "F" then
   ...
end

Now this is a perfectly valid approach. However (from my own testing), starting at 4 comparisons and excluding table creation, this is actually faster:

local compares = { C = true, D = true, E = true, F = true };
if compares[a] then
   ...
end

And since hash tables have constant lookup time, the performance gain increases with every additional comparison. On the other hand, if one of the first one or two comparisons matches "most of the time", you might be better off with the Boolean way or a combination.
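Since the gain only holds when table creation is excluded, the set should be built once, outside the hot path. A minimal sketch, with make_set and is_match as made-up helper names:

```lua
-- Hypothetical helper: turns a list of values into a lookup set.
local function make_set(list)
  local set = {}
  for i = 1, #list do
    set[list[i]] = true
  end
  return set
end

-- Built once at load time, not on every check.
local compares = make_set({ "C", "D", "E", "F" })

local function is_match(a)
  return compares[a] == true
end

print(is_match("D"), is_match("Z"))  -- true, false
```

Building the table inside is_match would pay for a table allocation on every call and most likely cancel out the benefit.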

Frequent table creation is discussed thoroughly in Lua Performance Tips. Basically the problem is that Lua allocates your table on demand, and creating a new table each time actually takes more time than clearing its contents and filling it again.

However, this is a bit of a problem, since Lua itself does not provide a function for removing all elements from a table, and pairs() is not exactly a performance beast itself. I have not done any performance testing on this problem myself yet.

If you can, define a C function that clears a table; this should be a good solution for table reuse.
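If a C function is not an option, a pure Lua fallback looks roughly like this. The helper name clear and the scratch table are invented for the example, and as said above this is unmeasured, so whether it beats allocating a fresh table needs testing in your own setup.

```lua
-- Hypothetical helper: removes every key from t so the table can be reused.
local function clear(t)
  for k in pairs(t) do
    -- Assigning nil to an existing field during traversal is allowed.
    t[k] = nil
  end
  return t
end

-- Reuse one scratch table across rounds instead of creating a new one each time.
local scratch = {}
for round = 1, 3 do
  clear(scratch)
  for i = 1, 4 do
    scratch[i] = round * i
  end
  print(round, table.concat(scratch, ","))
end
```

If the table is always refilled with the same keys, it may be enough to overwrite the old values and skip the clearing pass entirely.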

Doing the same work over and over is the biggest problem, I think. While a compiler for a non-interpreted language can easily optimize away a lot of redundancies, Lua will not.

Memoization can be done quite easily in Lua using tables. For single-argument functions you can even replace them with a table and an __index metamethod. Even though this destroys transparency, performance is better on cached values due to one less function call.

Here is an implementation of memoization for a single argument using a metatable. (Important: This variant does not support a nil argument, but it is pretty damn fast for existing values.)

function tmemoize(func)
    return setmetatable({}, {
        __index = function(self, k)
            local v = func(k);
            self[k] = v
            return v;
        end
    });
end
-- usage (does not support nil values!)
local mf = tmemoize(myfunc);
local v  = mf[x];

You could actually modify this pattern for multiple input values.
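One possible way to do that for two arguments is to nest the tables, so that each first argument gets its own inner cache. This is only a sketch under that assumption; tmemoize2 is a made-up name, and like the single-argument version it does not support nil arguments and will not cache nil results.

```lua
-- Hypothetical two-argument variant: mf[a][b] computes func(a, b) once.
local function tmemoize2(func)
  return setmetatable({}, {
    __index = function(outer, a)
      -- First time this `a` is seen: create its inner cache.
      local inner = setmetatable({}, {
        __index = function(self, b)
          local v = func(a, b)
          self[b] = v   -- later lookups hit the table directly
          return v
        end
      })
      outer[a] = inner
      return inner
    end
  })
end

-- usage
local calls = 0
local mf = tmemoize2(function(a, b)
  calls = calls + 1
  return a * 10 + b
end)

local first  = mf[1][2]   -- computed
local second = mf[1][2]   -- served from the cache
local other  = mf[2][1]   -- computed
print(first, second, other, calls)  -- 12, 12, 21, and 2 calls
```

Each extra argument adds one more table lookup, so the benefit shrinks as the wrapped function gets cheaper.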

The idea of partial application is similar to memoization, which is to "cache" results. But here, instead of caching the results of the function, you cache intermediate values by computing them in a constructor function that defines the calculation function inside its block. In reality I would just call it a clever use of closures.

-- Normal function
function foo(a, b, x)
    return cheaper_expression(expensive_expression(a,b), x);
end
-- foo(a,b,x1);
-- foo(a,b,x2);
-- ...

-- Partial application
function foo(a, b)
    local C = expensive_expression(a,b);
    return function(x)
        return cheaper_expression(C, x);
    end
end
-- local f = foo(a,b);
-- f(x1);
-- f(x2);
-- ...

This way it is possible to easily create flexible functions that cache some of their work without too much impact on program flow.

An extreme variant of this would be currying, but that is actually more a way to mimic functional programming than anything else.
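For illustration only, currying a three-argument function could look like the sketch below. curry3 and volume are invented names, and nothing here is meant as a performance recommendation.

```lua
-- Hypothetical helper: curries a function of exactly three arguments.
local function curry3(f)
  return function(a)
    return function(b)
      return function(c)
        return f(a, b, c)
      end
    end
  end
end

local function volume(w, h, d)
  return w * h * d
end

local curried = curry3(volume)
local with_base = curried(2)(3)      -- fixes w and h
print(with_base(4), with_base(5))    -- 24 and 30
```

Unlike the partial application pattern, nothing expensive is precomputed here; every call still runs the full function, plus the closure overhead.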

Here is a more extensive ("real world") example with some code omitted, since otherwise it would easily take up the whole page (namely, get_color_values actually does a lot of value checking and accepts mixed values).

function LinearColorBlender(col_from, col_to)
    local cfr, cfg, cfb, cfa = get_color_values(col_from);
    local ctr, ctg, ctb, cta = get_color_values(col_to);
    -- Validate before doing arithmetic; subtracting nil would raise a less helpful error.
    if not cfr or not ctr then
        error("One of given arguments is not a color.");
    end
    local cdr, cdg, cdb, cda = ctr-cfr, ctg-cfg, ctb-cfb, cta-cfa;

    return function(pos)
        if type(pos) ~= "number" then
            error("arg1 (pos) must be a number in range 0..1");
        end
        if pos < 0 then pos = 0; end;
        if pos > 1 then pos = 1; end;
        return cfr + cdr*pos, cfg + cdg*pos, cfb + cdb*pos, cfa + cda*pos;
    end
end
-- Call 
local blender = LinearColorBlender({1,1,1,1},{0,0,0,1});
object:SetColor(blender(0.1));
object:SetColor(blender(0.3));
object:SetColor(blender(0.7));

You can see that once the blender has been created, the function only has to sanity-check a single value instead of up to eight. I even extracted the difference calculation; it probably does not improve much, but I hope it shows what this pattern tries to achieve.
