Why is CharInSet faster than Case statement?


Question


I'm perplexed. At CodeRage today, Marco Cantu said that CharInSet was slow and I should try a Case statement instead. I did so in my parser and then checked with AQTime what the speedup was. I found the Case statement to be much slower.

4,894,539 executions of:

while not CharInSet (P^, [' ', #10,#13, #0]) do inc(P);

was timed at 0.25 seconds.

But the same number of executions of:

while True do
  case P^ of
    ' ', #10, #13, #0: break;
    else inc(P);
  end;

takes .16 seconds for the "while True", .80 seconds for the first case, and .13 seconds for the else case, totaling 1.09 seconds, or over 4 times as long.

The assembler code for the CharInSet statement is:

add edi,$02
mov edx,$0064b290
movzx eax,[edi]
call CharInSet
test al,al
jz $00649f18 (back to the add statement)

whereas the case logic is simply this:

movzx eax,[edi]
sub ax,$01
jb $00649ef0
sub ax,$09
jz $00649ef0
sub ax,$03
jz $00649ef0
add edi,$02
jmp $00649ed6 (back to the movzx statement)
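Read as a running subtraction, the chain above compares the character against a sequence of constants without reloading it. The following is only a rough Delphi rendering of the three tests visible in the listing (it assumes Delphi 2009 or later, and `HitsBranch` is a hypothetical name used for illustration):

```pascal
program CaseChainSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: mirrors only the three tests shown in the listing.
function HitsBranch(C: Char): Boolean;
var
  A: Integer;
begin
  A := Ord(C);
  Result := True;
  Dec(A, 1);             // sub ax,$01
  if A < 0 then Exit;    // jb: the subtraction borrowed, so C = #0
  Dec(A, 9);             // sub ax,$09
  if A = 0 then Exit;    // jz: 1 + 9 = 10, so C = #10
  Dec(A, 3);             // sub ax,$03
  if A = 0 then Exit;    // jz: 10 + 3 = 13, so C = #13
  Result := False;       // no match: execution falls through to "add edi,$02"
end;

begin
  // Prints TRUE TRUE TRUE FALSE
  Writeln(HitsBranch(#0), ' ', HitsBranch(#10), ' ', HitsBranch(#13), ' ', HitsBranch('a'));
end.
```

Each test costs one subtraction and one conditional jump, and no function call is involved.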

The case logic looks to me to be using very efficient assembler, whereas the CharInSet statement actually has to make a call to the CharInSet function, which is in SysUtils and is also simple, being:

function CharInSet(C: AnsiChar; const CharSet: TSysCharSet): Boolean;
begin
Result := C in CharSet;
end;

I think the only reason why this is done is because P^ in [' ', #10, #13, #0] is no longer allowed in Delphi 2009 so the call does the conversion of types to allow it.

Nonetheless, I am very surprised by this and still don't trust my result.

Is AQTime measuring something wrong, am I missing something in this comparison, or is CharInSet truly an efficient function worth using?


Conclusion:

I think you got it, Barry. Thank you for taking the time and doing the detailed example. I tested your code on my machine and got .171, .066 and .052 seconds (I guess my desktop is a bit faster than your laptop).

Testing that code in AQTime, it gives: 0.79, 1.57 and 1.46 seconds for the three tests. There you can see the large overhead from the instrumentation. But what really surprises me is that this overhead changes the apparent "best" result to be the CharInSet function which is actually the worst.

So Marco is correct and CharInSet is slower. But you've inadvertently (or maybe on purpose) given me a better way by pulling out what CharInSet is doing with the AnsiChar(P^) in Set method. Besides its minor speed advantage over the case method, it is also less code and easier to understand than the case statement.

You've also made me aware of the possibility of incorrect optimization using AQTime (and other instrumenting profilers). Knowing this will help my decision regarding "Profiler and Memory Analysis Tools for Delphi", and it is also another answer to my question "How Does AQTime Do It?". Of course, AQTime doesn't change the code when it instruments, so it must use some other magic to do it.

So the answer is that AQTime is showing results that lead to the incorrect conclusion.


Followup: I left this question with the "accusation" that AQTime results may be misleading. But to be fair, I should direct you to read through this question: Is There A Fast GetToken Routine For Delphi? which started off thinking AQTime gave misleading results, and concludes that it does not.

Answer

AQTime is an instrumenting profiler. Instrumenting profilers often aren't suitable for measuring code time, particularly in microbenchmarks like yours, because the cost of the instrumentation often outweighs the cost of the thing being measured. Instrumenting profilers, on the other hand, excel at profiling memory and other resource usage.

Sampling profilers, which periodically check the location of the CPU, are usually better for measuring code time.
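To get a feel for how per-point bookkeeping can swamp a tight loop, here is a minimal sketch. It does not reproduce what AQTime actually does; `Probe` is a hypothetical stand-in that merely reads the performance counter and bumps a counter at every loop iteration, and the names `Scan` and `Seconds` are likewise invented for illustration:

```pascal
program ProbeOverheadSketch;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

var
  ProbeHits: Int64 = 0;

// Hypothetical stand-in for the work an instrumenting profiler might do
// at each measured point: read a timer and record something.
procedure Probe;
var
  Stamp: Int64;
begin
  QueryPerformanceCounter(Stamp);
  Inc(ProbeHits);
end;

// Scans to the first #0, ';' or '.', optionally calling Probe per character.
function Scan(P: PChar; Instrumented: Boolean): PChar;
begin
  while not (AnsiChar(P^) in [#0, ';', '.']) do
  begin
    if Instrumented then
      Probe;
    Inc(P);
  end;
  Result := P;
end;

// Times one million scans of the sample string.
function Seconds(Instrumented: Boolean): Double;
const
  Sample = 'foo bar baz blah de;blah de blah.';
var
  i: Integer;
  Start, Finish, Freq: Int64;
begin
  QueryPerformanceFrequency(Freq);
  QueryPerformanceCounter(Start);
  for i := 1 to 1000000 do
    Scan(PChar(Sample), Instrumented);
  QueryPerformanceCounter(Finish);
  Result := (Finish - Start) / Freq;
end;

begin
  Writeln(Format('%20s: %.3f seconds', ['plain', Seconds(False)]));
  Writeln(Format('%20s: %.3f seconds', ['with probe', Seconds(True)]));
end.
```

The absolute numbers will vary by machine; the point is only to compare the two lines against each other.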

In any case, here's another microbenchmark which indeed shows that a case statement is faster than CharInSet. However, note that the set check can still be used with a typecast to eliminate the truncation warning (actually this is the only reason that CharInSet exists):

{$apptype console}

uses Windows, SysUtils;

const
  SampleString = 'foo bar baz blah de;blah de blah.';

procedure P1;
var
  cp: PChar;
begin
  cp := PChar(SampleString);
  while not CharInSet(cp^, [#0, ';', '.']) do
    Inc(cp);
end;

procedure P2;
var
  cp: PChar;
begin
  cp := PChar(SampleString);
  while True do
    case cp^ of
      '.', #0, ';':
        Break;
    else
      Inc(cp);
    end;
end;

procedure P3;
var
  cp: PChar;
begin
  cp := PChar(SampleString);
  while not (AnsiChar(cp^) in [#0, ';', '.']) do
    Inc(cp);
end;

procedure Time(const Title: string; Proc: TProc);
var
  i: Integer;
  start, finish, freq: Int64;
begin
  QueryPerformanceCounter(start);
  for i := 1 to 1000000 do
    Proc;
  QueryPerformanceCounter(finish);
  QueryPerformanceFrequency(freq);
  Writeln(Format('%20s: %.3f seconds', [Title, (finish - start) / freq]));
end;

begin
  Time('CharInSet', P1);
  Time('case stmt', P2);
  Time('set test', P3);
end.

Its output on my laptop here is:

CharInSet: 0.261 seconds
case stmt: 0.077 seconds
 set test: 0.060 seconds
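Applied to the whitespace scan from the question, the typecast approach might look like the sketch below. It assumes Delphi 2009 or later, where Char is WideChar, and `SkipToDelimiter` is a hypothetical name. The extra `P^ >= #$0100` test is an addition of this sketch, made on the assumption that a hard AnsiChar cast keeps only the low byte of a WideChar, so that a character such as U+0120 would otherwise be mistaken for a space:

```pascal
program SetTestSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: advance P to the first space, LF, CR or terminating #0.
function SkipToDelimiter(P: PChar): PChar;
begin
  // Characters at or above #$0100 cannot be members of an AnsiChar set,
  // so they are skipped before the cast has a chance to alias them.
  while (P^ >= #$0100) or not (AnsiChar(P^) in [' ', #10, #13, #0]) do
    Inc(P);
  Result := P;
end;

var
  S: string;
begin
  S := 'token rest';
  // Shows the remainder of the string starting at the delimiter.
  Writeln('[', SkipToDelimiter(PChar(S)), ']');
end.
```

If the input is known to contain only characters below #$0100, the guard can be dropped, which gives the plain form timed as "set test" above.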
