Measuring execution time of a function inside the Linux kernel

Question

I am using Linux Security Module hooks to add some custom functionality to the recv() system call. I want to measure the overhead of this functionality compared to the pristine recv(). I have written a simple TCP server that I run with and without my module. The server calls recv() N times and measures the time taken by each call with something like:

clock_gettime(before);
recv()
clock_gettime(after);
global_time += after - before.

At the end, I print the average time for a single recv() as "global_time/N". Let's call this the "user_space_avg_recv" time.
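For reference, the user-space measurement described above could be sketched roughly as follows. This is not the asker's actual server: the helper names `elapsed_ns` and `avg_recv_ns` are hypothetical, a local `socketpair()` stands in for the TCP connection, and `CLOCK_MONOTONIC` is an assumed clock choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two clock_gettime() readings. */
static int64_t elapsed_ns(const struct timespec *before,
                          const struct timespec *after)
{
    return (int64_t)(after->tv_sec - before->tv_sec) * 1000000000LL
         + (after->tv_nsec - before->tv_nsec);
}

/*
 * Hypothetical helper: time n recv() calls on a local socket pair and
 * return the average duration in nanoseconds, or -1.0 on error.
 */
static double avg_recv_ns(int n)
{
    int sv[2];
    char byte = 'x';
    int64_t total = 0;

    if (n <= 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1.0;

    for (int i = 0; i < n; i++) {
        struct timespec before, after;
        ssize_t got;

        /* Queue one byte so that recv() does not block. */
        if (write(sv[0], &byte, 1) != 1)
            goto fail;

        clock_gettime(CLOCK_MONOTONIC, &before);
        got = recv(sv[1], &byte, 1, 0);
        clock_gettime(CLOCK_MONOTONIC, &after);

        if (got != 1)
            goto fail;
        total += elapsed_ns(&before, &after);   /* global_time += after - before */
    }
    close(sv[0]);
    close(sv[1]);
    return (double)total / n;                   /* global_time / N */

fail:
    close(sv[0]);
    close(sv[1]);
    return -1.0;
}
```

Calling `avg_recv_ns(100000)` once with the module loaded and once without would give the two averages to compare. Each sample also includes the cost of the two `clock_gettime()` calls themselves.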

Inside my module, I want to place time measurement calls to calculate the exact execution time of my hook. I tried three methods.

  1. I used jiffies as follows:

sj = jiffies;
my_hook();
ej = jiffies;
current->total_oh = ej - sj;

But I see no difference between the sj and ej values, so total_oh is unchanged.

  2. I used current_kernel_time(), since I thought it returned the time in nanoseconds. Once again, however, there was no difference between the before and after times.

  3. I used get_cycles() and print the total cycles when the process exits. However, when I convert that total to milliseconds, it comes out much greater than the "user_space_avg_recv" value. This does not make sense, since a value measured inside the kernel should always be less than the time measured from user space. It could mean that I am either not using the correct API for the measurement or making a mistake when converting from cycles to milliseconds.

I am basically using the following formula to convert cycles to milliseconds:

avg overhead of my hook in milliseconds = 
             (((cycles / 2.99) / 10^6) / N)

The 2.99 is there because my clock frequency is 2.99 GHz.
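The conversion above can be written out as a small sketch to check the arithmetic. The function names are hypothetical, and the sketch assumes the cycle counter ticks at a constant rate equal to the nominal clock frequency, which may not hold if the CPU scales its frequency.

```c
#include <stdint.h>

/* Convert a cycle count to milliseconds for a clock running at ghz GHz. */
static double cycles_to_ms(uint64_t cycles, double ghz)
{
    /* ghz * 1e9 cycles per second equals ghz * 1e6 cycles per millisecond. */
    return (double)cycles / (ghz * 1e6);
}

/* Average per-call overhead in milliseconds over n calls. */
static double avg_overhead_ms(uint64_t total_cycles, double ghz, uint64_t n)
{
    return cycles_to_ms(total_cycles, ghz) / (double)n;
}
```

As a sanity check, 2,990,000 cycles at 2.99 GHz correspond to about 1 ms, so the formula itself has the right units. Keep in mind that the comparison in the question is between a per-call average from user space and a per-call average from the kernel, so both totals have to be divided by the same N.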

Some points:

  • My user-space program is pinned to a single core using CPU affinity.

  • I am using kernel 2.6.22.14.

  • To stop the kernel from switching contexts while inside my hook, I use preempt_disable() and preempt_enable(), so the execution time of other kernel threads is not counted. Even then, since my hook does some I/O, my thread might release control voluntarily, or an interrupt might occur, either of which could increase the total cycle count.

Question: How can I measure function execution times accurately inside the kernel?

Recommended answer

You can use the function tracer API to get a trace of all function calls and returns, with high-precision timestamps. This includes interrupt events and context switches. You can then analyze the resulting trace in userspace to get an accurate idea of how long your function takes to run.

If you can't use the function tracer API, you could call do_gettimeofday() to get a microsecond-resolution timestamp, or getnstimeofday() for nanosecond resolution. These are the same functions the userspace gettimeofday() call uses internally. For very fast functions this may still not be precise enough; for anything finer you will probably need to dig into the timer code to see how it implements cycle conversions. Note also that high resolution does not imply equally high accuracy, but these functions should still be useful for benchmarking purposes.

Note that any form of tracing adds latency of its own: do_gettimeofday() requires a number of atomic compare-and-swap operations, and ftrace puts logging code in the preamble and postamble of every single function. You should take this into consideration when interpreting results.
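One way to account for the cost of the timestamping itself is to measure it separately and subtract it from the raw result. The sketch below illustrates the idea in user space, with `clock_gettime()` standing in for the kernel timestamp call; the helper names are hypothetical, and the same pattern would have to be adapted to whichever in-kernel function is actually used.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Average cost in ns of one timestamp read, from n back-to-back pairs. */
static double timer_overhead_ns(int n)
{
    int64_t total = 0;

    for (int i = 0; i < n; i++) {
        int64_t t0 = now_ns();
        int64_t t1 = now_ns();   /* nothing measured in between */

        total += t1 - t0;
    }
    return n > 0 ? (double)total / n : 0.0;
}

/* Subtract the timer's own cost from a raw measurement, clamping at zero. */
static double corrected_ns(double raw_ns, double overhead_ns)
{
    return raw_ns > overhead_ns ? raw_ns - overhead_ns : 0.0;
}
```

If the estimated overhead is of the same order as the function being timed, the corrected figure is only a rough indication, and averaging over many calls is likely to be more informative than any single sample.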
