Is there a faster hi-resolution timer for DIY profiling?


Question




Hi,

I'm trying to time some parts of my code using the Stopwatch class,
but it is so slow. I tried QueryPerformanceFrequency(),
but this seems to be just as slow:
if I simply call this function in a loop, each call takes 21 ticks
at a reported frequency of 3.5 MHz.
That is about 6 µs, which is about 12k clock cycles on my 2 GHz CPU.
This negates the idea of having a high-res timer ...

Is there any way to access the timer more directly?
I know the CPU actually has a very high resolution timer;
it would be nice to be able to just read this value directly.
Maybe I'll have to write a C++ library and do it in asm,
assuming I can get the OS to grant me access to that hardware address.

Basically I'm making a DIY profiler. I have a profile class
which needs an Enter(enum p) call at the start and a Leave()
call at the end of each function of interest.

Primarily it keeps track of the number of calls made,
and it lists the calling functions' counts too.
It also measures the total time spent in the function and then subtracts the
time spent in subroutines to provide an actual (self) time too.

It works pretty quickly for counts, but when I use the timer it is about 20
times slower. Although I can live with waiting 20 minutes to collect the data,
it does mean that there is a considerable error for short functions.

I have added a fudge which subtracts a fiddle value from the timer;
the fiddle value is increased by 21 ticks every time the timer is called.
This might be close, but it's far from ideal.

I'm trying to find out why/where it's spending 75% of the CPU time in the
mscorlib and mscorwks libraries. AMD CodeAnalyst identifies that it's spending
most of its time in a function called mscorwks::gc_heap::adjust_limit_clr.

Thanks
Colin =^.^=
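
The Enter/Leave bookkeeping described in the post can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the poster's actual class: the names `profiler`, `profiler_enter`, `profiler_leave`, the fixed limits, and the `fake_ticks` source are all hypothetical. It keeps a call count, total ticks, and self ticks (total minus time in profiled children), and applies the same kind of fudge: every timer reading is reduced by an estimated per-read cost multiplied by the number of reads so far. The per-caller counts mentioned in the post are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_FUNCS 64   /* hypothetical limit on profiled function ids */
#define MAX_DEPTH 256  /* hypothetical limit on call nesting */

typedef uint64_t (*tick_source)(void);

typedef struct {
    uint64_t calls;       /* number of enter calls */
    uint64_t total_ticks; /* enter-to-leave time, children included */
    uint64_t self_ticks;  /* total_ticks minus time in profiled children */
} func_stats;

typedef struct {
    int id;               /* function id passed to enter */
    uint64_t start;       /* adjusted timer reading at enter */
    uint64_t child_ticks; /* time accumulated by nested profiled calls */
} frame;

typedef struct {
    tick_source now;         /* timer to read; injected so it can be swapped */
    uint64_t overhead_ticks; /* estimated cost of one timer read (the fudge) */
    uint64_t reads;          /* timer reads so far */
    func_stats stats[MAX_FUNCS];
    frame stack[MAX_DEPTH];
    int depth;
} profiler;

void profiler_init(profiler *p, tick_source now, uint64_t overhead_ticks) {
    memset(p, 0, sizeof *p);
    p->now = now;
    p->overhead_ticks = overhead_ticks;
}

/* Timer reading minus the estimated cost of all reads so far. Unsigned
   wrap-around is harmless here because only differences are used. */
static uint64_t read_adjusted(profiler *p) {
    uint64_t raw = p->now();
    p->reads++;
    return raw - p->reads * p->overhead_ticks;
}

void profiler_enter(profiler *p, int id) {
    assert(id >= 0 && id < MAX_FUNCS && p->depth < MAX_DEPTH);
    p->stats[id].calls++;
    frame *f = &p->stack[p->depth++];
    f->id = id;
    f->child_ticks = 0;
    f->start = read_adjusted(p);
}

void profiler_leave(profiler *p) {
    uint64_t end = read_adjusted(p);
    assert(p->depth > 0);
    frame *f = &p->stack[--p->depth];
    uint64_t elapsed = end - f->start;
    p->stats[f->id].total_ticks += elapsed;
    p->stats[f->id].self_ticks += elapsed - f->child_ticks;
    if (p->depth > 0) /* charge this call to the caller's child time */
        p->stack[p->depth - 1].child_ticks += elapsed;
}

/* Fake tick source for demonstration: advances 10 ticks per read. */
static uint64_t fake_now = 0;
uint64_t fake_ticks(void) {
    fake_now += 10;
    return fake_now;
}
```

Passing the timer in as a function pointer means the bookkeeping can be checked with a deterministic source such as `fake_ticks` before a real counter is plugged in. If the overhead estimate is too large, a short function's elapsed time can go negative and wrap around, so the estimate is better kept conservative.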

Recommended answer

Please post your code; there must be something wrong with it.
QueryPerformanceFrequency() can't take 6 µs to complete.
Note that there is no need to call QueryPerformanceFrequency(); this value is
exposed by the Stopwatch class.

Willy.


I just copied the code from here:
http://msdn2.microsoft.com/en-us/lib...92(VS.80).aspx

And oops, I meant that the QueryPerformanceCounter() call is what's taking so long ...

I just have a loop which calls it repeatedly 100,000 times to measure how long
it takes.

The Stopwatch also took just as long.

Colin =^.^=
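
A measurement loop of the kind described above might look like the following C sketch. It is an assumption about the shape of the test, not the poster's code (which was C# copied from the linked page). On Windows it calls QueryPerformanceCounter and QueryPerformanceFrequency; elsewhere it substitutes POSIX `clock_gettime(CLOCK_MONOTONIC)` purely so the sketch can be compiled on other systems. The function names are hypothetical.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L /* for clock_gettime under strict -std= modes */
#endif

#include <assert.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Read the high-resolution counter, in ticks. */
static uint64_t read_counter(void) {
#ifdef _WIN32
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return (uint64_t)t.QuadPart;
#else
    /* Stand-in for non-Windows systems: nanoseconds from a monotonic clock. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Counter frequency in ticks per second. */
static uint64_t counter_frequency(void) {
#ifdef _WIN32
    LARGE_INTEGER f;
    QueryPerformanceFrequency(&f);
    return (uint64_t)f.QuadPart;
#else
    return 1000000000u; /* the stand-in counts nanoseconds */
#endif
}

/* Convert a tick count to microseconds for a given counter frequency. */
double ticks_to_us(uint64_t ticks, uint64_t frequency_hz) {
    assert(frequency_hz > 0);
    return (double)ticks * 1e6 / (double)frequency_hz;
}

/* Average cost of one counter read, in microseconds, over `calls` reads. */
double timer_cost_us(uint32_t calls) {
    assert(calls > 0);
    uint64_t start = read_counter();
    for (uint32_t i = 0; i < calls; i++)
        (void)read_counter();
    uint64_t end = read_counter();
    return ticks_to_us(end - start, counter_frequency()) / (double)calls;
}
```

For the figures quoted in the thread, `ticks_to_us(21, 3500000)` gives 6.0, matching the 6 µs estimate. Averaging over many reads, as `timer_cost_us(100000)` does, keeps the counter's own granularity from dominating the result.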





Well, QueryPerformanceFrequency must transition into the kernel, which is
quite expensive (about 6000 instructions), so this function may well take 4 µs
or more on a 2 GHz machine. The same is true for QueryPerformanceCounter, so you
need to account for this overhead when profiling.

Windows Vista and WS2008 have a new API, QueryThreadCycleTime, which returns
the CPU cycle count.
This API calls __rdtsc(), an intrinsic function of the VC++ compiler. You can
implement a small C function that calls this intrinsic and call that function
from C# using PInvoke.

Willy.
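
The small C function suggested above could be sketched like this. It is a minimal sketch for x86/x64 only; the export name `read_cycle_counter`, the `PROFILER_EXPORT` macro, and the helper `cycles_to_us` are hypothetical. `__rdtsc()` comes from `<intrin.h>` with the VC++ compiler and from `<x86intrin.h>` with GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>    /* __rdtsc with the VC++ compiler */
#define PROFILER_EXPORT __declspec(dllexport)
#else
#include <x86intrin.h> /* __rdtsc with GCC and Clang, x86 targets only */
#define PROFILER_EXPORT
#endif

/* Return the raw time-stamp counter of the current core (hypothetical name). */
PROFILER_EXPORT uint64_t read_cycle_counter(void) {
    return (uint64_t)__rdtsc();
}

/* Convert a cycle count to microseconds, assuming a fixed CPU clock in Hz. */
PROFILER_EXPORT double cycles_to_us(uint64_t cycles, uint64_t cpu_hz) {
    assert(cpu_hz > 0);
    return (double)cycles * 1e6 / (double)cpu_hz;
}
```

Built as a DLL, `read_cycle_counter` could then be declared on the C# side with a `DllImport` attribute as a `static extern ulong` method. Two caveats apply: the counter is per core and may not agree between cores on older processors, and its rate may vary with frequency scaling on CPUs without an invariant TSC, so a conversion such as `cycles_to_us(12000, 2000000000)` (6 µs) is only as good as the fixed-clock assumption.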

