An HR (high-resolution) timers precision case study

Question

In this topic I would like to discuss HR (high-resolution) timers and their real-world precision.

I have studied a lot of documentation about them, and I am confident that they are the best and most reliable way to delay execution inside Linux kernel modules, with the lowest CPU cost and the highest timing precision (some time-critical drivers use them too, for example https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/pwm/gpio-pwm.c?rev=35328 ).

Do you agree?

Here is one of the most comprehensive and detailed documents I have seen on this topic: https://www.landley.net/kdocs/ols/2006/ols2006v1-pages-333-346.pdf .

HR timers promise to go below the jiffies resolution (one jiffy is 4 ms at HZ=250), but unfortunately on my system I did not get the expected results for delays shorter than 6 ms (more details below).

My environment is:

  • Windows 10 PRO 64 bit / 8 GB RAM / Intel CPU, 4 cores
  • VMWare Player 12
  • Virtualized OS: Linux Mint 18.1 64 bit

Kernel configuration:

  • Version: 4.10.0-24-generic
  • CONFIG_HIGH_RES_TIMERS=y
  • CONFIG_POSIX_TIMERS=y
  • CONFIG_NO_HZ_COMMON=y
  • CONFIG_NO_HZ_IDLE=y
  • CONFIG_NO_HZ=y
  • CONFIG_HZ_250=y
  • CONFIG_HZ=250

/sys/devices/system/clocksource/clocksource0/available_clocksource => tsc hpet acpi_pm

/sys/devices/system/clocksource/clocksource0/current_clocksource => tsc
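
To check the same setting on another machine, a small user-space C sketch can read the sysfs file quoted above. The helper name `read_first_line` is made up for illustration; only the sysfs path comes from the post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Path quoted in the post; it exists only on Linux systems exposing sysfs. */
#define CURRENT_CLOCKSOURCE \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Hypothetical helper: copy the first line of `path` into `buf`
 * (without the trailing newline). Returns 0 on success, -1 on failure.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';  /* strip the newline, if any */
    return 0;
}
```

Calling `read_first_line(CURRENT_CLOCKSOURCE, buf, sizeof buf)` on the virtual machine described here would be expected to yield `tsc`, matching the value listed above.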

To run a benchmark I wrote a Linux kernel module, which I published freely at https://bitbucket.org/DareDevilDev/hr-timers-tester/ . The README file contains instructions for compiling and running it yourself.

It executes a series of cycles as follows:

  • 10 us .. 90 us, in steps of 10 us
  • 100 us .. 900 us, in steps of 100 us
  • 1 ms .. 9 ms, in steps of 1 ms
  • 10 ms .. 90 ms, in steps of 10 ms
  • 100 ms .. 900 ms, in steps of 100 ms
  • and finally 1 s
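
The series above can be generated with two nested loops, one over decades and one over the multipliers 1 to 9. This is a minimal sketch of that idea, not the code from the published module; the function name `build_delay_schedule` and the two constants are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_SEC  1000000000ULL

/*
 * Fill out_ns with the delay series: 1..9 times each decade step from
 * 10 us up to 100 ms, followed by a single 1 s entry.
 * Returns the number of entries written (at most cap).
 */
size_t build_delay_schedule(uint64_t *out_ns, size_t cap)
{
    size_t n = 0;

    assert(out_ns != NULL);

    for (uint64_t step = 10 * NSEC_PER_USEC; step < NSEC_PER_SEC; step *= 10) {
        for (uint64_t k = 1; k <= 9; k++) {
            if (n == cap)
                return n;
            out_ns[n++] = k * step;
        }
    }
    if (n < cap)
        out_ns[n++] = NSEC_PER_SEC;  /* the final 1 s cycle */
    return n;
}
```

Keeping every delay in nanoseconds means the values can be compared directly with the measured timings, which the module reports in nS.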

The timings are measured with the "ktime_get" function and stored in a pre-allocated array, for better performance and to avoid unwanted delays inside the HR timer callback.

After collecting the data, the module prints out the table of samples.
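
The "record first, print later" pattern can be tried outside the kernel as well. The sketch below is a user-space analog, not the module's code: it swaps the HR timer callback for `clock_nanosleep` and `ktime_get` for `clock_gettime(CLOCK_MONOTONIC)`. The names `measure_delay`, `print_samples`, `samples`, and `MAX_SAMPLES` are assumptions chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define MAX_SAMPLES 64

struct sample {
    uint64_t requested_ns;
    uint64_t measured_ns;
};

/* Pre-allocated storage: nothing is allocated or printed while timing. */
static struct sample samples[MAX_SAMPLES];

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Sleep for delay_ns and record requested vs. measured time in slot idx. */
void measure_delay(size_t idx, uint64_t delay_ns)
{
    struct timespec req = {
        .tv_sec  = (time_t)(delay_ns / 1000000000ULL),
        .tv_nsec = (long)(delay_ns % 1000000000ULL),
    };
    struct timespec rem;

    assert(idx < MAX_SAMPLES);

    uint64_t start = now_ns();
    /* Relative sleep; resume with the remaining time if a signal interrupts. */
    while (clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem) == EINTR)
        req = rem;
    uint64_t end = now_ns();

    samples[idx].requested_ns = delay_ns;
    samples[idx].measured_ns  = end - start;
}

/* Print the table only after all samples have been collected. */
void print_samples(size_t count)
{
    for (size_t i = 0; i < count && i < MAX_SAMPLES; i++)
        printf("%8llu uS = %10llu nS\n",
               (unsigned long long)(samples[i].requested_ns / 1000),
               (unsigned long long)samples[i].measured_ns);
}
```

User-space sleeps include scheduler wake-up latency, so the numbers from this sketch are not directly comparable with timings taken inside an HR timer callback.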

For my scenario the relevant data are:

   10 uS =      41082 nS
   20 uS =      23955 nS
   30 uS =     478361 nS
   40 uS =      27341 nS
   50 uS =     806875 nS
   60 uS =     139721 nS
   70 uS =     963793 nS
   80 uS =      39475 nS
   90 uS =     175736 nS
  100 uS =    1096272 nS
  200 uS =      10099 nS
  300 uS =     967644 nS
  400 uS =     999006 nS
  500 uS =    1025254 nS
  600 uS =    1125488 nS
  700 uS =     982296 nS
  800 uS =    1011911 nS
  900 uS =     978652 nS
 1000 uS =    1985231 nS
 2000 uS =    1984367 nS
 3000 uS =    2068547 nS
 4000 uS =    5000319 nS
 5000 uS =    4144947 nS
 6000 uS =    6047991 nS <= First expected delay!
 7000 uS =    6835180 nS
 8000 uS =    8057504 nS
 9000 uS =    9218573 nS
10000 uS =   10435313 nS

...and so on...

As you can see in the kernel log dump above, 6 ms is the first sample that matches the expected delay.
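
One way to make "first expected delay" reproducible is to look for the first row from which every later row stays within a fixed relative error. The sketch below applies that rule to the samples from the dump above. The 5% tolerance and the names `timing_sample`, `vm_samples`, and `first_stable_index` are assumptions made for illustration, not part of the published module.

```c
#include <assert.h>
#include <stddef.h>

struct timing_sample {
    long requested_us;
    long measured_ns;
};

/* Samples from the VMWare guest, copied from the kernel log dump above. */
const struct timing_sample vm_samples[] = {
    {    10,    41082 }, {    20,    23955 }, {    30,   478361 },
    {    40,    27341 }, {    50,   806875 }, {    60,   139721 },
    {    70,   963793 }, {    80,    39475 }, {    90,   175736 },
    {   100,  1096272 }, {   200,    10099 }, {   300,   967644 },
    {   400,   999006 }, {   500,  1025254 }, {   600,  1125488 },
    {   700,   982296 }, {   800,  1011911 }, {   900,   978652 },
    {  1000,  1985231 }, {  2000,  1984367 }, {  3000,  2068547 },
    {  4000,  5000319 }, {  5000,  4144947 }, {  6000,  6047991 },
    {  7000,  6835180 }, {  8000,  8057504 }, {  9000,  9218573 },
    { 10000, 10435313 },
};
#define VM_SAMPLE_COUNT (sizeof vm_samples / sizeof vm_samples[0])

/*
 * Return the index of the first sample from which every remaining sample
 * deviates from its requested delay by at most `tolerance` (0.05 = 5%),
 * or -1 if even the last sample is outside the tolerance.
 */
int first_stable_index(const struct timing_sample *s, size_t n, double tolerance)
{
    int first = -1;

    assert(s != NULL);

    /* Walk backwards and stop at the first row that misses the target. */
    for (size_t i = n; i-- > 0; ) {
        double expected_ns = (double)s[i].requested_us * 1000.0;
        double error = ((double)s[i].measured_ns - expected_ns) / expected_ns;
        if (error < 0)
            error = -error;
        if (error > tolerance)
            break;
        first = (int)i;
    }
    return first;
}
```

With the assumed 5% tolerance, `first_stable_index(vm_samples, VM_SAMPLE_COUNT, 0.05)` returns 23, the 6000 uS row, which agrees with the row marked in the dump. The 2000 uS row also lands within 5% of its target, but the rows after it do not, which is why the rule checks the whole remainder rather than a single row.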

I repeated the same test on my C.H.I.P. embedded system ( https://getchip.com/pages/chip ), an ARM-based board similar to a Raspberry Pi, running at 1 GHz and equipped with Ubuntu 14.04 (kernel 4.4.13, HZ = 200).

In this case I got better results (requested delay in uS on the left):

  30 =      44666 nS
  40 =      24125 nS
  50 =      49208 nS
  60 =      60208 nS
  70 =      70042 nS
  80 =      78334 nS
  90 =      89708 nS
 100 =     126083 nS
 200 =     184917 nS
 300 =     302917 nS <= First expected delay!
 400 =     395000 nS
 500 =     515333 nS
 600 =     591583 nS
 700 =     697458 nS
 800 =     800875 nS
 900 =     900125 nS
1000 =    1013375 nS

...and so on...

On that cheaper board, good results start at 300 uS.

What is your opinion? Is there a better way to get more precision from HR timers in a platform-independent way? Are HR timers the wrong solution for precise timing (which is mandatory when writing hardware drivers)?

Any contribution would be much appreciated.

Thanks!

Recommended answer

Problem solved: the issue was caused by the virtualization environment.

On an old laptop (HP, single core, 1.9 GHz) I got good delays from 60 uS upward, and on a newer one (Dell, quad core) I got good delays even below 10 uS!
