Is it more optimized to use a local function or a global function?
Question
I want to know whether it is more optimized to use a local function (_drawBitmap in the example below), which requires only 3 parameters but cannot be inlined because it accesses some variables of the owning procedure, or a global function that can be inlined (but will it really be inlined?) and that requires 5 parameters.
I don't know whether it matters, but this code is mostly compiled for Android/iOS.
Local function:
procedure TMyObject.onPaint(Sender: TObject; Canvas: TCanvas; const ARect: TRectF);

  function _drawBitmap(const aBitmap: {$IFDEF _USE_TEXTURE}TTexture{$ELSE}Tbitmap{$ENDIF}; const aTopLeft: TpointF; Const aOpacity: Single): boolean;
  var aDestRect: TrectF;
  begin
    Result := False;
    if aBitmap <> nil then begin
      //calculate aDestRect
      aDestRect := canvas.AlignToPixel(
                     TRectF.Create(
                       aTopLeft,
                       aBitmap.Width/ScreenScale,
                       aBitmap.Height/ScreenScale));
      //if the aBitmap is visible
      if ARect.IntersectsWith(aDestRect) then begin
        Result := True;
        {$IFDEF _USE_TEXTURE}
        TCustomCanvasGpu(Canvas).DrawTexture(aDestRect, // ATexRect
                                             TRectF.Create(0,
                                                           0,
                                                           aBitmap.Width,
                                                           aBitmap.Height), // ARect
                                             ALPrepareColor(TCustomCanvasGpu.ModulateColor, aOpacity * AbsoluteOpacity), // https://quality.embarcadero.com/browse/RSP-15432
                                             aBitmap);
        {$ELSE}
        Canvas.DrawBitmap(aBitmap, // ABitmap
                          TRectF.Create(0,
                                        0,
                                        aBitmap.Width,
                                        aBitmap.Height), // SrcRect
                          aDestRect, // DstRect
                          aOpacity * AbsoluteOpacity, // AOpacity
                          samevalue(aDestRect.Width, aBitmap.Width, Tepsilon.Position) and
                          samevalue(aDestRect.height, aBitmap.height, Tepsilon.Position)); // HighSpeed - set interpolation to none
        {$ENDIF};
      end;
    end;
  end;

begin
  _drawBitmap(aBitmap, aPos, 1);
end;
ASM:
MyObject.pas.2632: _drawBitmap(fBtnFilterBitmap, // aBitmap
00B97511 55 push ebp
00B97512 680000803F push $3f800000
00B97517 8B45F8 mov eax,[ebp-$08]
00B9751A 8D90C4050000 lea edx,[eax+$000005c4]
00B97520 8B45F8 mov eax,[ebp-$08]
00B97523 8B80A8040000 mov eax,[eax+$000004a8]
00B97529 E882FDFFFF call _drawBitmap
00B9752E 59 pop ecx
MyObject.pas.2562: begin
00B972B0 55 push ebp
00B972B1 8BEC mov ebp,esp
00B972B3 83C4A0 add esp,-$60
00B972B6 53 push ebx
00B972B7 56 push esi
00B972B8 57 push edi
00B972B9 8955FC mov [ebp-$04],edx
00B972BC 8BF0 mov esi,eax
MyObject.pas.2563: Result := False;
00B972BE 33DB xor ebx,ebx
MyObject.pas.2564: if aBitmap <> nil then begin
00B972C0 85F6 test esi,esi
00B972C2 0F84B4010000 jz $00b9747c
MyObject.pas.2567: aDestRect := canvas.AlignToPixel(
00B972C8 8B450C mov eax,[ebp+$0c]
00B972CB 8B78FC mov edi,[eax-$04]
00B972CE 8BC6 mov eax,esi
00B972D0 E88F559BFF call TBitmap.GetWidth
...
And with a global function:
function drawBitmap(const Canvas: TCanvas; const ARect: TRectF; const aBitmap: {$IFDEF _USE_TEXTURE}TTexture{$ELSE}Tbitmap{$ENDIF}; const aTopLeft: TpointF; Const aOpacity: Single): boolean; inline;
var aDestRect: TrectF;
begin
  Result := False;
  if aBitmap <> nil then begin
    //calculate aDestRect
    aDestRect := canvas.AlignToPixel(
                   TRectF.Create(
                     aTopLeft,
                     aBitmap.Width/ScreenScale,
                     aBitmap.Height/ScreenScale));
    //if the aBitmap is visible
    if ARect.IntersectsWith(aDestRect) then begin
      Result := True;
      {$IFDEF _USE_TEXTURE}
      TCustomCanvasGpu(Canvas).DrawTexture(aDestRect, // ATexRect
                                           TRectF.Create(0,
                                                         0,
                                                         aBitmap.Width,
                                                         aBitmap.Height), // ARect
                                           ALPrepareColor(TCustomCanvasGpu.ModulateColor, aOpacity * AbsoluteOpacity), // https://quality.embarcadero.com/browse/RSP-15432
                                           aBitmap);
      {$ELSE}
      Canvas.DrawBitmap(aBitmap, // ABitmap
                        TRectF.Create(0,
                                      0,
                                      aBitmap.Width,
                                      aBitmap.Height), // SrcRect
                        aDestRect, // DstRect
                        aOpacity * AbsoluteOpacity, // AOpacity
                        samevalue(aDestRect.Width, aBitmap.Width, Tepsilon.Position) and
                        samevalue(aDestRect.height, aBitmap.height, Tepsilon.Position)); // HighSpeed - set interpolation to none
      {$ENDIF};
    end;
  end;
end;
procedure TMyObject.onPaint(Sender: TObject; Canvas: TCanvas; const ARect: TRectF);
begin
  drawBitmap(Canvas, ARect, aBitmap, aPos, 1); // the global version takes all 5 parameters
end;
ASM:
MyObject.pas.2636: drawBitmap(Canvas, aRect, fBtnFilterBitmap, // aBitmap
00B98F6D 8BFB mov edi,ebx
00B98F6F 8B83A8040000 mov eax,[ebx+$000004a8]
00B98F75 8945F0 mov [ebp-$10],eax
00B98F78 8D83C4050000 lea eax,[ebx+$000005c4]
00B98F7E 8945EC mov [ebp-$14],eax
00B98F81 C645EB00 mov byte ptr [ebp-$15],$00
00B98F85 8B75F0 mov esi,[ebp-$10]
00B98F88 85F6 test esi,esi
00B98F8A 0F840A020000 jz $00b9919a
00B98F90 8BC6 mov eax,esi
00B98F92 E8CD389BFF call TBitmap.GetWidth
...
Recommended answer
Here, the function call itself is essentially instantaneous compared with the TCanvas drawing work it performs. So this is clearly premature optimization, and in practice there is no performance difference between the two. The global function may be harder to maintain (unless it contains code that can actually be reused elsewhere in the unit). In any case, even a global function is not a good idea: if you have a specific reusable process, define a class instead; it will be cleaner and easier to debug, extend and test.
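As a rough illustration of the class suggestion, here is a minimal sketch that moves the values captured by the local function into fields, so each draw call keeps the three-parameter shape. TBitmapPainter, its field names and the unit name are hypothetical; the drawing calls are the ones already used in the question, only the non-texture branch is shown, and the unit names in the uses clause assume a recent Delphi/FireMonkey version.

```pascal
unit BitmapPainter; // hypothetical unit name

interface

uses
  System.Types, System.Math, FMX.Graphics;

type
  // Hypothetical helper: stores what _drawBitmap captured from onPaint
  // (canvas, clip rectangle, screen scale, control opacity).
  TBitmapPainter = class
  private
    FCanvas: TCanvas;
    FClipRect: TRectF;
    FScreenScale: Single;
    FBaseOpacity: Single;
  public
    constructor Create(const ACanvas: TCanvas; const AClipRect: TRectF;
      const AScreenScale, ABaseOpacity: Single);
    // Returns True when the bitmap intersects the clip rectangle and was drawn.
    function DrawBitmap(const ABitmap: TBitmap; const ATopLeft: TPointF;
      const AOpacity: Single): Boolean;
  end;

implementation

constructor TBitmapPainter.Create(const ACanvas: TCanvas;
  const AClipRect: TRectF; const AScreenScale, ABaseOpacity: Single);
begin
  inherited Create;
  FCanvas := ACanvas;
  FClipRect := AClipRect;
  FScreenScale := AScreenScale;
  FBaseOpacity := ABaseOpacity;
end;

function TBitmapPainter.DrawBitmap(const ABitmap: TBitmap;
  const ATopLeft: TPointF; const AOpacity: Single): Boolean;
var
  DestRect: TRectF;
begin
  Result := False;
  if ABitmap = nil then
    Exit;

  // Destination rectangle in canvas units, aligned to the pixel grid
  DestRect := FCanvas.AlignToPixel(
    TRectF.Create(ATopLeft,
                  ABitmap.Width / FScreenScale,
                  ABitmap.Height / FScreenScale));

  // Skip bitmaps that fall outside the area being repainted
  if not FClipRect.IntersectsWith(DestRect) then
    Exit;

  Result := True;
  FCanvas.DrawBitmap(ABitmap,
    TRectF.Create(0, 0, ABitmap.Width, ABitmap.Height), // SrcRect
    DestRect,                                           // DstRect
    AOpacity * FBaseOpacity,                            // AOpacity
    SameValue(DestRect.Width, ABitmap.Width, TEpsilon.Position) and
    SameValue(DestRect.Height, ABitmap.Height, TEpsilon.Position)); // HighSpeed
end;

end.
```

Inside onPaint this would be used as TBitmapPainter.Create(Canvas, ARect, ScreenScale, AbsoluteOpacity) followed by DrawBitmap(aBitmap, aPos, 1) and Free. Allocating an object on every paint has its own cost, so a long-lived instance or a record with methods may be a better fit; that trade-off is something to measure rather than assume.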
Inlining may bring some performance benefit only for very small functions that do not call any other function. For instance:
function Add(n1,n2: integer): integer; inline;
begin
result := n1 + n2;
end;
But in your case, it won't make any sense.
And, as you noted, it is up to the compiler whether to actually inline the code. If it determines that inlining brings no benefit (it may even be slower than a regular call), it won't inline the function.
For completeness: at the assembly level, when you call a local function from within another function, access to the variables of the enclosing scope is implemented by passing the caller's stack frame pointer as an additional hidden parameter.
In pseudo-code, it looks like this:
function _drawBitmap(const stackframe: TLocalStackRecord; const aBitmap: {$IFDEF _USE_TEXTURE}TTexture{$ELSE}Tbitmap{$ENDIF}; const aTopLeft: TpointF; Const aOpacity: Single): boolean;
var aDestRect: TrectF;
begin
Result := False;
if aBitmap <> nil then begin
//calculate aDestRect
aDestRect := stackframe.canvas.AlignToPixel(
TRectF.Create(
aTopLeft,
aBitmap.Width/ScreenScale,
aBitmap.Height/ScreenScale));
...
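The same idea can be sketched as a small console program. It compares a nested function, which reaches Scale through the hidden frame pointer, with an equivalent routine that receives its context explicitly. TOuterFrame and all routine names are hypothetical stand-ins, not what the compiler actually generates. In the question's disassembly, the `push ebp` just before `call _drawBitmap` appears to be this extra parameter, and `mov eax,[ebp+$0c]` followed by `mov edi,[eax-$04]` appears to read Canvas through it.

```pascal
program NestedFrameDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for the part of the outer routine's stack frame
  // that the nested function needs.
  TOuterFrame = record
    Scale: Integer;
  end;

// Variant 1: nested (local) function. Scale lives in the frame of
// OuterNested; the compiler passes a pointer to that frame behind the scenes.
function OuterNested(Value: Integer): Integer;
var
  Scale: Integer;

  function Scaled(V: Integer): Integer;
  begin
    Result := V * Scale; // reads the outer variable via the hidden frame pointer
  end;

begin
  Scale := 3;
  Result := Scaled(Value);
end;

// Variant 2: the same logic with the context passed as an explicit parameter,
// which is roughly what the pseudo-code above describes.
function ScaledExplicit(const Frame: TOuterFrame; V: Integer): Integer;
begin
  Result := V * Frame.Scale;
end;

function OuterExplicit(Value: Integer): Integer;
var
  Frame: TOuterFrame;
begin
  Frame.Scale := 3;
  Result := ScaledExplicit(Frame, Value);
end;

begin
  Writeln(OuterNested(7));   // prints 21
  Writeln(OuterExplicit(7)); // prints 21
end.
```

Both variants pass one extra pointer-sized value per call, which is why the local function's 3 visible parameters do not make it meaningfully cheaper than the 5-parameter global one.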
Try to avoid premature optimization:
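Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.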
Variant in Knuth, "Structured Programming with Goto Statements". Computing Surveys 6:4 (December 1974), pp. 261–301, §1.
To avoid wasting your time (and money), use a profiler, e.g. Eric's Sampling Profiler, to find out which part of your code actually needs to be optimized.
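When no profiler is at hand, a crude timing loop can at least show whether a difference is measurable at all. The sketch below is not a substitute for a sampling profiler. It assumes a Delphi version that ships System.Diagnostics.TStopwatch; the iteration count and the two Add variants are arbitrary and chosen only for illustration.

```pascal
program TimeBeforeOptimizing;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

const
  Iterations = 10000000; // arbitrary, illustration only

function AddInline(n1, n2: Integer): Integer; inline;
begin
  Result := n1 + n2;
end;

function AddPlain(n1, n2: Integer): Integer;
begin
  Result := n1 + n2;
end;

var
  Watch: TStopwatch;
  i, Sum: Integer;
begin
  // Time the variant marked inline
  Sum := 0;
  Watch := TStopwatch.StartNew;
  for i := 1 to Iterations do
    Sum := AddInline(i and $FF, Sum and $FFFF); // masks keep the sum small
  Watch.Stop;
  Writeln('inline: ', Watch.ElapsedMilliseconds, ' ms (sum=', Sum, ')');

  // Time the regular call
  Sum := 0;
  Watch := TStopwatch.StartNew;
  for i := 1 to Iterations do
    Sum := AddPlain(i and $FF, Sum and $FFFF);
  Watch.Stop;
  Writeln('plain:  ', Watch.ElapsedMilliseconds, ' ms (sum=', Sum, ')');
end.
```

Printing the sum keeps the loops from being trivially dead code. Timings vary between runs and targets, so repeat the measurement on the actual device before drawing conclusions.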
Make it right, then make it fast. And always keep it readable and maintainable.