Is the use of `const` dogmatic or rational?
In Delphi you can speed up your code by passing parameters as `const`, e.g.
function A(const AStr: string): integer;
//or
function B(AStr: string): integer;
Suppose both functions contain the same code. The speed difference between them is negligible, and I doubt it can even be measured with a cycle counter like:
function RDTSC: comp;
var
  TimeStamp: record
    case Byte of
      1: (Whole: comp);
      2: (Lo, Hi: Longint);
  end;
begin
  asm
    db $0F; db $31;
    mov [TimeStamp.Lo], eax
    mov [TimeStamp.Hi], edx
  end;
  Result := TimeStamp.Whole;
end;
The reason is that all `const` does in function A is prevent the reference count of `AStr` from being incremented.
But the increment takes only one cycle on one core of my multicore CPU, so...
Why should I bother with `const`?
If there is no other reason for the function to contain an implicit try/finally, and the function itself is not doing much work, the use of const can result in a significant speedup (I once got one function that was using >10% of total runtime in a profiling run down to <2% just by adding a const in the right place).
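To make the hidden work visible, here is a minimal sketch that prints the reference count each kind of parameter observes. It assumes a Delphi version whose System unit provides `StringRefCount` (2009 or later, as far as I know); the function names `RefCountSeenByConst` and `RefCountSeenByValue` are made up for this illustration.

```pascal
program RefCountDemo;

{$APPTYPE CONSOLE}

// const parameter: the callee borrows the caller's reference.
function RefCountSeenByConst(const s: string): Integer;
begin
  Result := StringRefCount(s);
end;

// By-value parameter: the callee takes its own reference on entry and
// releases it on exit inside a compiler-generated try/finally.
function RefCountSeenByValue(s: string): Integer;
begin
  Result := StringRefCount(s);
end;

var
  s: string;
begin
  // Build the string at run time so that it is heap-allocated and
  // reference-counted (string literals are not reference-counted).
  s := 'abc';
  s := s + '123';
  WriteLn('const parameter sees:    ', RefCountSeenByConst(s));
  WriteLn('by-value parameter sees: ', RefCountSeenByValue(s));
end.
```

If the explanation above holds, the by-value line should report a count one higher than the const line. That extra reference is what has to be taken and released on every call.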
Also, the reference counting takes far more than one cycle, because it has to be performed with the lock prefix for thread-safety reasons, so we are talking more like 50-100 cycles. It takes even more if something in the same cache line has been modified by another core in between.
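A rough way to get a feel for the cost of a locked increment is to time a plain `Inc` against `InterlockedIncrement` from the Windows unit. This is only a sketch: the program name and loop count are arbitrary, `GetTickCount` is coarse, and the interlocked version may also pay for a function call, so treat the numbers as an order-of-magnitude hint rather than a cycle count.

```pascal
program LockedIncCost;

{$APPTYPE CONSOLE}

uses
  Windows;

const
  Iterations = 100000000;

var
  Counter: Integer;
  i: Integer;
  StartTick: Cardinal;
begin
  // Plain increment: an ordinary read-modify-write, no lock prefix.
  Counter := 0;
  StartTick := GetTickCount;
  for i := 1 to Iterations do
    Inc(Counter);
  WriteLn('plain Inc:            ', GetTickCount - StartTick, ' ms');

  // Atomic increment: on x86 this is carried out with a lock prefix,
  // the same kind of operation the string reference counting needs.
  Counter := 0;
  StartTick := GetTickCount;
  for i := 1 to Iterations do
    InterlockedIncrement(Counter);
  WriteLn('InterlockedIncrement: ', GetTickCount - StartTick, ' ms');
end.
```

This runs on a single thread, so it shows only the uncontended cost. Contention between cores, as described above, would make the locked version slower still.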
As for not being able to measure it:
program Project;

{$APPTYPE CONSOLE}

uses
  Windows,
  SysUtils,
  Math;

function GetThreadTime: Int64;
var
  CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
  GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
  // user-mode CPU time of the current thread, in 100 ns units
  Result := PInt64(@UserTime)^;
end;

function ConstLength(const s: string): Integer;
begin
  Result := Length(s);
end;

function NoConstLength(s: string): Integer;
begin
  Result := Length(s);
end;

var
  s : string;
  i : Integer;
  j : Integer;
  ConstTime, NoConstTime: Int64;
begin
  try
    // make sure we get a heap-allocated string
    s := 'abc';
    s := s + '123';
    // make sure we minimize thread context switches during the timing
    SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_TIME_CRITICAL);
    j := 0;
    ConstTime := GetThreadTime;
    for i := 0 to 100000000 do
      Inc(j, ConstLength(s));
    ConstTime := GetThreadTime - ConstTime;
    j := 0;
    NoConstTime := GetThreadTime;
    for i := 0 to 100000000 do
      Inc(j, NoConstLength(s));
    NoConstTime := GetThreadTime - NoConstTime;
    SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
    WriteLn('Const: ', ConstTime);
    WriteLn('NoConst: ', NoConstTime);
    WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  if DebugHook <> 0 then
    ReadLn;
end.
Produces this output on my system:
Const: 6084039
NoConst: 36192232
Const is 5.95 times faster.
EDIT: it gets a bit more interesting if we add some thread contention:
program Project;

{$APPTYPE CONSOLE}

uses
  Windows,
  SysUtils,
  Classes,
  Math;

function GetThreadTime: Int64;
var
  CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
  GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
  // user-mode CPU time of the current thread, in 100 ns units
  Result := PInt64(@UserTime)^;
end;

function ConstLength(const s: string): Integer;
begin
  Result := Length(s);
end;

function NoConstLength(s: string): Integer;
begin
  Result := Length(s);
end;

// Atomically adds Value to Target and returns the new value.
// 32-bit x86 only: with the register convention, EAX = @Target and EDX = Value.
function LockedAdd(var Target: Integer; Value: Integer): Integer; register;
asm
  mov ecx, eax
  mov eax, edx
  lock xadd [ecx], eax
  add eax, edx
end;

var
  x : Integer;
  s : string;
  ConstTime, NoConstTime: Integer;
  StartEvent: THandle;
  ActiveCount: Integer;
begin
  try
    // make sure we get a heap-allocated string
    s := 'abc';
    s := s + '123';
    ConstTime := 0;
    NoConstTime := 0;
    StartEvent := CreateEvent(nil, True, False, '');
    ActiveCount := 0;
    for x := 0 to 2 do
      TThread.CreateAnonymousThread(procedure
        var
          i : Integer;
          j : Integer;
          ThreadConstTime: Int64;
        begin
          // make sure we minimize thread context switches during the timing
          SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
          InterlockedIncrement(ActiveCount);
          WaitForSingleObject(StartEvent, INFINITE);
          j := 0;
          ThreadConstTime := GetThreadTime;
          for i := 0 to 100000000 do
            Inc(j, ConstLength(s));
          ThreadConstTime := GetThreadTime - ThreadConstTime;
          SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
          // note: the Int64 thread time is narrowed to a 32-bit Integer here
          LockedAdd(ConstTime, ThreadConstTime);
          InterlockedDecrement(ActiveCount);
        end).Start;
    // wait until all three threads are ready, then release them together
    while ActiveCount < 3 do
      Sleep(100);
    SetEvent(StartEvent);
    while ActiveCount > 0 do
      Sleep(100);
    WriteLn('Const: ', ConstTime);
    ResetEvent(StartEvent);
    for x := 0 to 2 do
      TThread.CreateAnonymousThread(procedure
        var
          i : Integer;
          j : Integer;
          ThreadNoConstTime: Int64;
        begin
          // make sure we minimize thread context switches during the timing
          SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
          InterlockedIncrement(ActiveCount);
          WaitForSingleObject(StartEvent, INFINITE);
          j := 0;
          ThreadNoConstTime := GetThreadTime;
          for i := 0 to 100000000 do
            Inc(j, NoConstLength(s));
          ThreadNoConstTime := GetThreadTime - ThreadNoConstTime;
          SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
          // note: the Int64 thread time is narrowed to a 32-bit Integer here
          LockedAdd(NoConstTime, ThreadNoConstTime);
          InterlockedDecrement(ActiveCount);
        end).Start;
    while ActiveCount < 3 do
      Sleep(100);
    SetEvent(StartEvent);
    while ActiveCount > 0 do
      Sleep(100);
    WriteLn('NoConst: ', NoConstTime);
    WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  if DebugHook <> 0 then
    ReadLn;
end.
On a 6 core machine, this gives me:
Const: 19968128
NoConst: 1313528420
Const is 65.78 times faster.
EDIT2: replacing the call to Length with a call to Pos (I picked the worst case, searching for something not contained in the string):
function ConstLength(const s: string): Integer;
begin
  Result := Pos('x', s);
end;

function NoConstLength(s: string): Integer;
begin
  Result := Pos('x', s);
end;
results in:
Const: 51792332
NoConst: 1377644831
Const is 26.60 times faster.
for the threaded case, and:
Const: 15912102
NoConst: 44616286
Const is 2.80 times faster.
for the non-threaded case.
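The ratios above depend on how much work the function body does, so it can help to look at the absolute extra time per call instead. The sketch below recomputes it from the figures reported above, assuming they are `GetThreadTimes` values in 100 ns units, that each loop makes 100000001 calls (`0 to 100000000` inclusive), and that the threaded totals are summed over three threads. `ExtraNsPerCall` is a helper name invented for this illustration.

```pascal
program ConstOverhead;

{$APPTYPE CONSOLE}

// Extra user-mode time per call, in nanoseconds, that the by-value
// version cost compared with the const version.
// Both tick counts are expected in 100 ns units.
function ExtraNsPerCall(ConstTicks, NoConstTicks, Calls: Int64): Double;
begin
  Result := (NoConstTicks - ConstTicks) * 100.0 / Calls;
end;

const
  CallsPerLoop = 100000001; // "for i := 0 to 100000000" runs this many times
  ThreadCount = 3;          // the threaded totals are summed over 3 threads

begin
  WriteLn('Length, 1 thread : ',
    ExtraNsPerCall(6084039, 36192232, CallsPerLoop):0:1, ' ns');
  WriteLn('Pos,    1 thread : ',
    ExtraNsPerCall(15912102, 44616286, CallsPerLoop):0:1, ' ns');
  WriteLn('Length, 3 threads: ',
    ExtraNsPerCall(19968128, 1313528420, ThreadCount * CallsPerLoop):0:1, ' ns');
  WriteLn('Pos,    3 threads: ',
    ExtraNsPerCall(51792332, 1377644831, ThreadCount * CallsPerLoop):0:1, ' ns');
end.
```

Under these assumptions the extra cost comes out at roughly 30 ns per call in the single-threaded runs and several hundred nanoseconds per call under contention, largely independent of whether the body calls Length or Pos. This fits the explanation that the overhead is the reference counting itself rather than anything in the function body.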